TY - GEN
T1 - Automatic unsupervised bug report categorization
AU - Limsettho, Nachai
AU - Hata, Hideaki
AU - Monden, Akito
AU - Matsumoto, Kenichi
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2014/12/4
Y1 - 2014/12/4
N2 - Background: Information in bug reports is implicit and therefore difficult to comprehend. To extract its meaning, some processing is required. Categorizing bug reports is a technique that can help in this regard. It can be used to help manage bug reports or to understand the underlying structure of a project. However, most research in this area focuses on supervised learning approaches, which still require a lot of human effort to prepare training data. Aims: Our aim is to develop an automated framework that can categorize bug reports according to their hidden characteristics and structures, without the need for training data. Method: We solve this problem using clustering, an unsupervised learning approach. It can automatically group bug reports together based on their textual similarity. We also propose a novel method to label each group with meaningful and representative names. Results: Experimental results show that our framework can achieve performance comparable to supervised learning approaches. We also show that our labeling process can label each cluster with representative names according to its characteristics. Conclusion: Our framework could be used as an automated categorization system that can be applied without prior knowledge, or as an automated labeling suggestion system.
KW - automated bug report categorization
KW - cluster labeling
KW - clustering
KW - topic modeling
UR - http://www.scopus.com/inward/record.url?scp=84920507485&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84920507485&partnerID=8YFLogxK
U2 - 10.1109/IWESEP.2014.8
DO - 10.1109/IWESEP.2014.8
M3 - Conference contribution
AN - SCOPUS:84920507485
T3 - Proceedings - 2014 6th International Workshop on Empirical Software Engineering in Practice, IWESEP 2014
SP - 7
EP - 12
BT - Proceedings - 2014 6th International Workshop on Empirical Software Engineering in Practice, IWESEP 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2014 6th International Workshop on Empirical Software Engineering in Practice, IWESEP 2014
Y2 - 12 November 2014 through 13 November 2014
ER -