FIT data selection based on project features for software effort estimation models

Koji Toda, Akito Monden, Ken Ichi Matsumoto

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

To construct a better multivariate regression model for software effort estimation, this paper proposes a method that automatically selects projects as fit data (the data set used for model construction) from a given project data set, based on the features of the estimation target. In an experimental evaluation using the ISBSG data set, one of the most commonly used project data sets in effort estimation studies, the proposed method showed better estimation performance than the conventional method of constructing a model from all project data. The median MRE (Magnitude of Relative Error) improved from 0.552 to 0.383, and the median MER (Magnitude of Error Relative to the estimate) improved from 0.457 to 0.381. Although regression models have often been constructed using all available project data, this paper shows the necessity of fit data selection and shows that the proposed method is an effective and systematic means of performing it.
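The two accuracy measures quoted in the abstract can be sketched as follows. This is a minimal illustration that assumes the definitions commonly used in the effort estimation literature (MRE divides the absolute error by the actual effort, MER divides it by the estimated effort); the abstract itself does not spell out the formulas. The function names and the sample effort values are hypothetical and are not taken from the paper or from the ISBSG data set. Only the four medians passed to `relative_improvement` come from the abstract.

```python
from statistics import median


def mre(actual, estimated):
    """Magnitude of Relative Error: absolute error relative to the actual effort (assumed definition)."""
    return abs(actual - estimated) / actual


def mer(actual, estimated):
    """Magnitude of Error Relative to the estimate: absolute error relative to the estimated effort (assumed definition)."""
    return abs(actual - estimated) / estimated


def median_errors(actuals, estimates):
    """Return (median MRE, median MER) over paired actual and estimated efforts."""
    pairs = list(zip(actuals, estimates))
    return (
        median(mre(a, e) for a, e in pairs),
        median(mer(a, e) for a, e in pairs),
    )


def relative_improvement(before, after):
    """Fractional reduction of an error measure (lower is better)."""
    return (before - after) / before


# Hypothetical efforts in person-hours, for illustration only.
actuals = [100.0, 200.0, 400.0]
estimates = [50.0, 250.0, 400.0]
med_mre, med_mer = median_errors(actuals, estimates)

# Medians reported in the abstract: all-data model vs. proposed fit data selection.
mre_gain = relative_improvement(0.552, 0.383)
mer_gain = relative_improvement(0.457, 0.381)
print(f"median MRE reduced by {mre_gain:.1%}, median MER reduced by {mer_gain:.1%}")
```

Under these assumed definitions, MRE penalizes underestimates less than MER does, which is one reason studies tend to report both measures rather than either alone.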

Original language: English
Title of host publication: Proceedings of the 6th IASTED International Conference on Advances in Computer Science and Engineering, ACSE 2010
Pages: 82-88
Number of pages: 7
Publication status: Published - 2010
Externally published: Yes
Event: 6th IASTED International Conference on Advances in Computer Science and Engineering, ACSE 2010 - Sharm El Sheikh, Egypt
Duration: Mar 15 2010 → Mar 17 2010

Other

Other: 6th IASTED International Conference on Advances in Computer Science and Engineering, ACSE 2010
Country: Egypt
City: Sharm El Sheikh
Period: 3/15/10 → 3/17/10

Keywords

  • Effort estimation
  • Fit data selection
  • Multivariate regression

ASJC Scopus subject areas

  • Computer Science (miscellaneous)

Cite this

Toda, K., Monden, A., & Matsumoto, K. I. (2010). FIT data selection based on project features for software effort estimation models. In Proceedings of the 6th IASTED International Conference on Advances in Computer Science and Engineering, ACSE 2010 (pp. 82-88)

@inproceedings{75fa741a4e8d45a9bbb77b1883c5a7f9,
title = "FIT data selection based on project features for software effort estimation models",
abstract = "To construct a better multivariate regression model for software effort estimation, this paper proposes a method to automatically select projects as fit data (a dataset for model construction) from a given project data set based on an estimation target's features. As a result of an experimental evaluation using the ISBSG data set, which is one of the most commonly used project data sets for effort estimation studies, the proposed method showed better estimation performance than the conventional method (of constructing a model using all project data). The median of MRE (Magnitude of Relative Error) was improved from 0.552 to 0.383, and also the median of MER (Magnitude of Error Relative) was improved from 0.457 to 0.381. While regression models were often constructed using all available project data, this paper showed the necessity of fit data selection, and showed that the proposed method is one of the effective and systematic means of doing the selection.",
keywords = "Effort estimation, Fit data selection, Multivariate regression",
author = "Koji Toda and Akito Monden and Matsumoto, {Ken Ichi}",
year = "2010",
language = "English",
isbn = "9780889868311",
pages = "82--88",
booktitle = "Proceedings of the 6th IASTED International Conference on Advances in Computer Science and Engineering, ACSE 2010",

}

TY - GEN

T1 - FIT data selection based on project features for software effort estimation models

AU - Toda, Koji

AU - Monden, Akito

AU - Matsumoto, Ken Ichi

PY - 2010

Y1 - 2010

AB - To construct a better multivariate regression model for software effort estimation, this paper proposes a method to automatically select projects as fit data (a dataset for model construction) from a given project data set based on an estimation target's features. As a result of an experimental evaluation using the ISBSG data set, which is one of the most commonly used project data sets for effort estimation studies, the proposed method showed better estimation performance than the conventional method (of constructing a model using all project data). The median of MRE (Magnitude of Relative Error) was improved from 0.552 to 0.383, and also the median of MER (Magnitude of Error Relative) was improved from 0.457 to 0.381. While regression models were often constructed using all available project data, this paper showed the necessity of fit data selection, and showed that the proposed method is one of the effective and systematic means of doing the selection.

KW - Effort estimation

KW - Fit data selection

KW - Multivariate regression

UR - http://www.scopus.com/inward/record.url?scp=84858823233&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84858823233&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84858823233

SN - 9780889868311

SP - 82

EP - 88

BT - Proceedings of the 6th IASTED International Conference on Advances in Computer Science and Engineering, ACSE 2010

ER -