A heuristic rule reduction approach to software fault-proneness prediction

Akito Monden, Jacky Keung, Shuji Morisaki, Yasutaka Kamei, Ken Ichi Matsumoto

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

Background: Association rules are more comprehensive and understandable than fault-prone module predictors (such as logistic regression models, random forests and support vector machines). One of the challenges is that rule mining usually extracts too many similar rules. Aim: This paper proposes a rule reduction technique that eliminates complex (long) and/or similar rules while sacrificing as little prediction performance as possible. Method: The idea of the method is to remove long and similar rules unless their confidence level, used as a heuristic, is sufficiently higher than that of shorter rules. For example, it starts by selecting rules with the shortest length (length=1), then continues with the selection of the second-shortest rules (length=2) based on the current confidence level; this process is repeated for longer rules until no rules are worth including. Result: An empirical experiment was conducted with the Mylyn and Eclipse PDE datasets. On the Mylyn dataset, the proposed method reduced the number of rules from 1347 down to 13, while the prediction performance dropped by only 0.015 (from 0.757 down to 0.742) in terms of the F1 prediction criterion. On the Eclipse PDE dataset, the proposed method reduced the number of rules from 398 to 12, while the prediction performance even improved (from 0.426 to 0.441). Conclusion: The novel technique resolves the rule explosion problem in association rule mining for software fault-proneness prediction, which is significant and provides a better understanding of the causes of faulty modules.
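The length-ordered selection described in the Method part of the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the exact acceptance criterion, so the `margin` threshold, the proper-subset test used to decide which rules count as "similar", the `Rule` type, and the example conditions and confidence values are all assumptions made for illustration.

```python
from typing import FrozenSet, List, NamedTuple


class Rule(NamedTuple):
    antecedent: FrozenSet[str]  # conditions on a module (hypothetical labels)
    confidence: float           # confidence of "antecedent => fault-prone"


def reduce_rules(rules: List[Rule], margin: float = 0.05) -> List[Rule]:
    """Visit rules from shortest to longest and keep a rule only if its
    confidence beats, by at least `margin`, every already-kept rule whose
    antecedent is a proper subset of its own.

    ASSUMPTION: `margin` and the subset test are illustrative choices,
    not the criterion used in the paper.
    """
    kept: List[Rule] = []
    # Shortest rules first; within one length, highest confidence first.
    ordered = sorted(rules, key=lambda r: (len(r.antecedent), -r.confidence))
    for rule in ordered:
        # Kept rules that this rule merely extends with extra conditions.
        shorter = [k for k in kept if k.antecedent < rule.antecedent]
        if all(rule.confidence >= k.confidence + margin for k in shorter):
            kept.append(rule)
    return kept


# Hypothetical rules; the condition names and confidences are made up.
example = [
    Rule(frozenset({"loc=high"}), 0.60),
    Rule(frozenset({"churn=high"}), 0.55),
    Rule(frozenset({"loc=high", "churn=high"}), 0.62),
    Rule(frozenset({"loc=high", "authors=many"}), 0.80),
    Rule(frozenset({"loc=high", "authors=many", "churn=high"}), 0.82),
]

kept = reduce_rules(example)
for rule in kept:
    print(sorted(rule.antecedent), rule.confidence)
```

In this toy input, both length-1 rules are kept, one length-2 rule is kept because it is clearly more confident than the rule it extends, and the remaining longer rules are dropped because they add conditions without a sufficient gain in confidence.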

Original language: English
Title of host publication: Proceedings - Asia-Pacific Software Engineering Conference, APSEC
Publisher: IEEE Computer Society
Pages: 838-847
Number of pages: 10
Volume: 1
ISBN (Print): 9780769549224
DOI: https://doi.org/10.1109/APSEC.2012.103
Publication status: Published - 2012
Externally published: Yes
Event: 19th Asia-Pacific Software Engineering Conference, APSEC 2012 - Hong Kong, China
Duration: Dec 4 2012 to Dec 7 2012

Other

Other: 19th Asia-Pacific Software Engineering Conference, APSEC 2012
Country: China
City: Hong Kong
Period: 12/4/12 to 12/7/12

Fingerprint

Association rules
Explosions
Support vector machines
Logistics
Experiments

Keywords

  • association rule mining
  • data mining
  • defect prediction
  • empirical study
  • software quality

ASJC Scopus subject areas

  • Software

Cite this

Monden, A., Keung, J., Morisaki, S., Kamei, Y., & Matsumoto, K. I. (2012). A heuristic rule reduction approach to software fault-proneness prediction. In Proceedings - Asia-Pacific Software Engineering Conference, APSEC (Vol. 1, pp. 838-847). [6462753] IEEE Computer Society. https://doi.org/10.1109/APSEC.2012.103

@inproceedings{29b4b6b6d1484a8a997b14beef52b4f2,
title = "A heuristic rule reduction approach to software fault-proneness prediction",
abstract = "Background: Association rules are more comprehensive and understandable than fault-prone module predictors (such as logistic regression models, random forests and support vector machines). One of the challenges is that rule mining usually extracts too many similar rules. Aim: This paper proposes a rule reduction technique that eliminates complex (long) and/or similar rules while sacrificing as little prediction performance as possible. Method: The idea of the method is to remove long and similar rules unless their confidence level, used as a heuristic, is sufficiently higher than that of shorter rules. For example, it starts by selecting rules with the shortest length (length=1), then continues with the selection of the second-shortest rules (length=2) based on the current confidence level; this process is repeated for longer rules until no rules are worth including. Result: An empirical experiment was conducted with the Mylyn and Eclipse PDE datasets. On the Mylyn dataset, the proposed method reduced the number of rules from 1347 down to 13, while the prediction performance dropped by only 0.015 (from 0.757 down to 0.742) in terms of the F1 prediction criterion. On the Eclipse PDE dataset, the proposed method reduced the number of rules from 398 to 12, while the prediction performance even improved (from 0.426 to 0.441). Conclusion: The novel technique resolves the rule explosion problem in association rule mining for software fault-proneness prediction, which is significant and provides a better understanding of the causes of faulty modules.",
keywords = "association rule mining, data mining, defect prediction, empirical study, software quality",
author = "Akito Monden and Jacky Keung and Shuji Morisaki and Yasutaka Kamei and Matsumoto, {Ken Ichi}",
year = "2012",
doi = "10.1109/APSEC.2012.103",
language = "English",
isbn = "9780769549224",
volume = "1",
pages = "838--847",
booktitle = "Proceedings - Asia-Pacific Software Engineering Conference, APSEC",
publisher = "IEEE Computer Society",
address = "United States",

}

TY - GEN

T1 - A heuristic rule reduction approach to software fault-proneness prediction

AU - Monden, Akito

AU - Keung, Jacky

AU - Morisaki, Shuji

AU - Kamei, Yasutaka

AU - Matsumoto, Ken Ichi

PY - 2012

Y1 - 2012

AB - Background: Association rules are more comprehensive and understandable than fault-prone module predictors (such as logistic regression models, random forests and support vector machines). One of the challenges is that rule mining usually extracts too many similar rules. Aim: This paper proposes a rule reduction technique that eliminates complex (long) and/or similar rules while sacrificing as little prediction performance as possible. Method: The idea of the method is to remove long and similar rules unless their confidence level, used as a heuristic, is sufficiently higher than that of shorter rules. For example, it starts by selecting rules with the shortest length (length=1), then continues with the selection of the second-shortest rules (length=2) based on the current confidence level; this process is repeated for longer rules until no rules are worth including. Result: An empirical experiment was conducted with the Mylyn and Eclipse PDE datasets. On the Mylyn dataset, the proposed method reduced the number of rules from 1347 down to 13, while the prediction performance dropped by only 0.015 (from 0.757 down to 0.742) in terms of the F1 prediction criterion. On the Eclipse PDE dataset, the proposed method reduced the number of rules from 398 to 12, while the prediction performance even improved (from 0.426 to 0.441). Conclusion: The novel technique resolves the rule explosion problem in association rule mining for software fault-proneness prediction, which is significant and provides a better understanding of the causes of faulty modules.

KW - association rule mining

KW - data mining

KW - defect prediction

KW - empirical study

KW - software quality

UR - http://www.scopus.com/inward/record.url?scp=84874632768&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84874632768&partnerID=8YFLogxK

U2 - 10.1109/APSEC.2012.103

DO - 10.1109/APSEC.2012.103

M3 - Conference contribution

SN - 9780769549224

VL - 1

SP - 838

EP - 847

BT - Proceedings - Asia-Pacific Software Engineering Conference, APSEC

PB - IEEE Computer Society

ER -