[Internship] QA Data Analyst Internship Project: Using Text Mining to Find Similar Bugs in the Past
Introduction
I had an opportunity to work as a data analyst intern for a game-making company this summer. I worked in the Quality Assurance team, which was relatively larger than QA teams in other industries because of the nature of the game-making company.
In this post, I would like to share how I came up with the idea for the project and how text mining could help solve the existing inefficiencies in the bug process.
Background
Currently, a bug goes through a series of processes from the occurrence of a bug and its closing.
- QA Tester: Enters bug information
- QD: Enters/Examines bug information
- SE Lead: Assigns an SE to the bug
- SE: Fixes the bug
- QA Tester: Final review of the bug and Closes the ticket
Issues with the Current Process
In the bug process above, QA Tester and SE Lead do pre-investigations
- QA Tester: try different filters and type in different keywords to find if there have been any similar bugs in the past which is time consuming.
- SE Lead: checks capacity and competency of SE and assigns the bug to the SE who is available and seems most suitable.
Project Goal
Our goal was to build a web application that finds similarities between a new and bugs in the past which will bring about the following benefits:
- QA Testers can determine whether or not to create a ticket for a new bug based on its similarities to bugs in history
- Find bugs in order of its similarities to the new bug
- Reduces bug comparison time
- Avoids duplicate bug tickets.
- SE Lead can assign the bug to an SE who has solved similar bugs in the past which
- Saves SE Lead’s time thinking about which SE to assign
- Increase Bug Fix Rate by assigning to an SE who has solved similar bugs in the past
- Reduce bug fixing time by assigning to an SE who has solved similar bugs in the past
Web Application
Technologies used:
- R shiny
- Text mining (KoNLP, dplyr, tidytext, stringr)
- JavaScript
Full screen shot
Key functions:
Auto-compmlete search: This is where a user enters a bug keyword or BugID
Color Scheme: Coloring bugs with the same characteristics (Feature, TypeofBug,) make it easy to find similar bugs
Second Search Box: A search box for the dataframe allows users to narrow down to find a specific bug with certain key words. Note that it can search through the comments
Improvements
- The app does not handle typos well currently. For example, Google can handle typos like“unifomr” and automatically generate results for “uniform”. Since the bug descriptions are written in Korean, I was not able to find libraries that would enable such settings. If we can accumulate enough data about which words users are likely to make typos, we would expect to solve this issue to some extent.
In Retrospect
- Through the experience of leading a project from scratch to the end, I realized the high standards a product must meet to be published in a industry setting and the importance of earning buy-ins from stakeholders.
- It was fun having informational interviews with potential users from diverse backgrounds to get feedbacks and incorporating those feedback into the application.
- I enjoyed learning about the process of how bugs are generated, reported to our system, fixed and finally closed by collaborative efforts of professionals across the teams.
- I really appreciate my manager, colleagues, and intern friends who helped me throughout the journey. My mentor was always there for me to ask questions like where to locate resources or who to ask for questions I had. I would like to thank my manager as well for granting me permission to explore a variety of data, attend virtual meetings at other studios, and lead projects that I proposed. Everyone was very willing to take time to answer my questions and put up with my curiosity. I not only learned and practiced the technical skills to become a good data analyst, but I also learn how important it is to build a good relationship with my colleagues with whom I spend 1/3 of my day! Not that I had a bad relationship with anyone, it’s just really fun working with people with whom you can trust.