In this study, we quantitatively compare how methods for handling outliers in training datasets affect eight software effort estimation models (including multiple linear regression, regression trees, random forests, and support vector regression), and we evaluate the effectiveness of the data smoothing method proposed by the authors. In our experiments, we compare three outlier removal methods (removal based on Cook's distance, TEAK, and Filter-INC) in addition to the data smoothing method. The experimental results show that combining the data smoothing method with outlier detection based on Cook's distance or Filter-INC builds models with good estimation performance.
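As a rough illustration of outlier removal based on Cook's distance, the following minimal sketch fits a least-squares line to a single predictor and drops training points whose Cook's distance exceeds a cutoff. It is not the implementation used in the study. The one-predictor setting, the function names, the 4/n cutoff (a common rule of thumb), and the project data are all assumptions made for illustration.

```python
from statistics import mean


def cooks_distances(x, y):
    """Cook's distance of each point for a one-predictor least-squares fit."""
    n = len(x)
    p = 2  # fitted parameters: intercept and slope
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    distances = []
    for xi, e in zip(x, residuals):
        # Leverage of a point in simple linear regression.
        leverage = 1 / n + (xi - x_bar) ** 2 / sxx
        distances.append(e ** 2 / (p * mse) * leverage / (1 - leverage) ** 2)
    return distances


def remove_outliers(x, y, threshold=None):
    """Keep only the points whose Cook's distance is at most the threshold."""
    if threshold is None:
        threshold = 4 / len(x)  # assumed cutoff, not taken from the study
    distances = cooks_distances(x, y)
    return [(xi, yi) for xi, yi, di in zip(x, y, distances) if di <= threshold]


# Hypothetical training data: project size versus effort.
size = [10, 20, 30, 40, 50, 60, 70, 80]
effort = [21, 39, 62, 80, 101, 119, 142, 400]
kept = remove_outliers(size, effort)
```

With this made-up data, the last project has effort far above the trend of the others and is the only point filtered out. An effort estimation model would then be trained on `kept` instead of the full dataset.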