An empirical evaluation of outlier deletion methods for analogy-based cost estimation

Masateru Tsunoda, Akito Monden, Takeshi Kakimoto, Ken Ichi Matsumoto

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

Background: Software project datasets sometimes include outliers, which affect the accuracy of effort estimation. Outlier deletion methods are often used to eliminate them. However, few case studies have applied outlier deletion methods to analogy-based estimation, so it is not clear which method is most suitable for it. Aim: To clarify the effects of existing outlier deletion methods (Cook's distance based deletion, LTS based deletion, k-means based deletion, Mantel's correlation based deletion, and EID based deletion) and of our method on analogy-based estimation. Method: In the experiment, the outlier deletion methods were applied to three datasets (the ISBSG, Kitchenham, and Desharnais datasets), and their estimation accuracy was evaluated based on BRE (Balanced Relative Error). Our method eliminates outliers from the neighborhood of a target project, that is, neighbor projects whose effort is extremely different from that of the other neighbors. Results: Deletion methods designed for analogy-based estimation (i.e., Mantel's correlation based deletion, EID based deletion, and our method) showed stable performance. In particular, only our method improved the average BRE by over 10% on two datasets. Conclusions: It is reasonable to apply deletion methods designed for analogy-based estimation, and preferable to apply our method to analogy-based estimation.
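The abstract does not give formulas, so the following is only a minimal sketch of the two ideas it names: scoring an estimate with BRE, and dropping neighbor projects whose effort differs extremely from the other neighbors before averaging. The BRE function follows the commonly cited definition (absolute error divided by the smaller of actual and estimated effort). The deletion rule, the function names `bre` and `estimate_effort`, and the `threshold` parameter are assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean, pstdev


def bre(actual, predicted):
    """Balanced Relative Error: absolute error over the smaller of the two values."""
    return abs(predicted - actual) / min(actual, predicted)


def estimate_effort(neighbor_efforts, threshold=1.5):
    """Average the efforts of the neighbor projects after dropping extreme ones.

    Assumption (not from the paper): a neighbor counts as "extremely
    different" when its effort lies more than `threshold` population
    standard deviations from the mean effort of the neighbors.
    """
    mu = mean(neighbor_efforts)
    sigma = pstdev(neighbor_efforts)
    if sigma == 0:
        # All neighbors agree, so there is nothing to delete.
        return mu
    kept = [e for e in neighbor_efforts if abs(e - mu) / sigma <= threshold]
    # Fall back to the plain mean if every neighbor was flagged.
    return mean(kept) if kept else mu


# Hypothetical efforts (person-hours) of five neighbors of a target project.
neighbors = [100, 110, 90, 105, 1000]
estimate = estimate_effort(neighbors)
print(estimate, bre(actual=120, predicted=estimate))
```

With these hypothetical values the neighbor with effort 1000 is dropped, so the estimate is the mean of the remaining four rather than being pulled upward by a single project.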

Original language: English
Title of host publication: PROMISE 2011 - 7th International Conference on Predictive Models in Software Engineering, Co-located with ESEM 2011
Publication status: Published - 2011
Externally published: Yes
Event: 7th International Conference on Predictive Models in Software Engineering, PROMISE 2011, Co-located with ESEM 2011 - Banff, AB, Canada
Duration: Sept 20 2011 → Sept 21 2011

Publication series

Name: ACM International Conference Proceeding Series

Other

Other: 7th International Conference on Predictive Models in Software Engineering, PROMISE 2011, Co-located with ESEM 2011
Country/Territory: Canada
City: Banff, AB
Period: 9/20/11 → 9/21/11

Keywords

  • Abnormal value
  • Case based reasoning
  • Effort prediction
  • Productivity
  • Project management

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
