An effect of data size on performance of effort estimation with missing data techniques

Koichi Tamura, Akito Monden, Ken Ichi Matsumoto

Research output: Contribution to journal › Article › peer-review

Abstract

Dealing with missing data in historical project data sets is an important issue in constructing effort estimation models. Past research has shown that similarity-based imputation achieves high estimation performance. However, it is unclear whether it remains effective for small data sets. In this paper, using multiple data sets with different numbers of project cases, each extracted from the ISBSG data set, we present an experimental evaluation of four methods: mean imputation, similarity-based imputation, row-column deletion, and pairwise deletion. The results showed that row-column deletion performed better than similarity-based imputation for data sets not exceeding 220 cases.
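
To make the contrast between two of the compared methods concrete, the following is a minimal Python sketch of mean imputation and a nearest-neighbour form of similarity-based imputation. It is an illustration under stated assumptions, not the authors' implementation: the function names (`mean_impute`, `similarity_impute`, `distance`), the unscaled Euclidean distance, the choice of `k`, and the toy `projects` table are all hypothetical and are not taken from the paper or from the ISBSG data set.

```python
from statistics import mean


def mean_impute(rows):
    """Replace each missing value (None) with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    col_means = [
        mean(r[j] for r in rows if r[j] is not None) for j in range(n_cols)
    ]
    return [
        [col_means[j] if v is None else v for j, v in enumerate(r)] for r in rows
    ]


def distance(a, b):
    """Root-mean-square difference over the columns observed in both rows (assumed metric)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("inf")
    return (sum((x - y) ** 2 for x, y in shared) / len(shared)) ** 0.5


def similarity_impute(rows, k=2):
    """Fill each missing value with the column mean over the k most similar rows."""
    result = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, v in enumerate(row):
            if v is None:
                # Candidate donors: other rows that have a value in column j.
                donors = [
                    r for idx, r in enumerate(rows)
                    if idx != i and r[j] is not None
                ]
                donors.sort(key=lambda r: distance(row, r))
                result[i][j] = mean(r[j] for r in donors[:k])
    return result


# Hypothetical toy data: each row is a project, columns are made-up numeric metrics.
projects = [
    [100.0, 5.0, 1200.0],
    [110.0, None, 1300.0],
    [400.0, 20.0, 5000.0],
    [420.0, 22.0, 5200.0],
]

mean_filled = mean_impute(projects)
knn_filled = similarity_impute(projects, k=1)
```

In this toy table, mean imputation fills the gap with the average over all projects, while the similarity-based variant borrows the value from the most similar project only. A real study would typically normalize the metrics before computing distances; that step is omitted here for brevity.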

Original language: English
Pages (from-to): 100-105
Number of pages: 6
Journal: Computer Software
Volume: 27
Issue number: 2
Publication status: Published - May 1 2010
Externally published: Yes

ASJC Scopus subject areas

  • Software
