TY - GEN
T1 - Improving analogy-based software cost estimation through probabilistic-based similarity measures
AU - Phannachitta, Passakorn
AU - Keung, Jacky
AU - Monden, Akito
AU - Matsumoto, Ken Ichi
PY - 2013/1/1
Y1 - 2013/1/1
N2 - The performance of software cost estimation based on analogy reasoning depends upon the measures that specify the similarity between software projects. This paper empirically investigates the use of probabilistic-based distance functions to improve the similarity measurement. The probabilistic-based distance functions are considerably more robust because they capture the implicit correlation between the occurrences of project feature attributes. This information gain enables the constructed estimation model to be more concise and comprehensible. The study compares 6 probabilistic-based distance functions against the commonly used Euclidean distance. We empirically evaluate the implemented cost estimation model using 5 real-world datasets collected from the PROMISE repository. The results show a significant improvement in terms of error reduction, which implies that estimation based on probabilistic-based distance functions achieves higher accuracy on average, and the peak performance significantly outperforms the Euclidean distance according to the Wilcoxon signed-rank test.
AB - The performance of software cost estimation based on analogy reasoning depends upon the measures that specify the similarity between software projects. This paper empirically investigates the use of probabilistic-based distance functions to improve the similarity measurement. The probabilistic-based distance functions are considerably more robust because they capture the implicit correlation between the occurrences of project feature attributes. This information gain enables the constructed estimation model to be more concise and comprehensible. The study compares 6 probabilistic-based distance functions against the commonly used Euclidean distance. We empirically evaluate the implemented cost estimation model using 5 real-world datasets collected from the PROMISE repository. The results show a significant improvement in terms of error reduction, which implies that estimation based on probabilistic-based distance functions achieves higher accuracy on average, and the peak performance significantly outperforms the Euclidean distance according to the Wilcoxon signed-rank test.
KW - Analogy
KW - Heterogeneous distance function
KW - Machine learning
KW - Software cost estimation
UR - http://www.scopus.com/inward/record.url?scp=84936889065&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84936889065&partnerID=8YFLogxK
U2 - 10.1109/APSEC.2013.78
DO - 10.1109/APSEC.2013.78
M3 - Conference contribution
AN - SCOPUS:84936889065
SN - 9780769549224
T3 - Proceedings - Asia-Pacific Software Engineering Conference, APSEC
SP - 541
EP - 546
BT - APSEC 2013 - Proceedings of the 20th Asia-Pacific Software Engineering Conference
A2 - Muenchaisri, Pornsiri
A2 - Rothermel, Gregg
PB - IEEE Computer Society
T2 - 20th Asia-Pacific Software Engineering Conference, APSEC 2013
Y2 - 2 December 2013 through 5 December 2013
ER -