Selecting best predictors from large software repositories for highly accurate software effort estimation

Tariq Sidra; Usman Muhammad; Fong Alvis C.M.



Publication year - 2020

Publication title - Journal of Software: Evolution and Process

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.371

H-Index - 29

eISSN - 2047-7481

pISSN - 2047-7473

DOI - 10.1002/smr.2271

Subject(s) - computer science, feature selection, machine learning, software, preprocessor, data pre-processing, data mining, artificial intelligence, software development, task (project management), scheduling (production processes), data science, systems engineering, engineering, operations management, economics, programming language

Accurate prediction of software effort is important for planning, scheduling, and allocating resources. However, software effort estimation remains a challenging task. Although numerous estimation models have been proposed, few come close to predicting software development effort accurately. To improve results, machine learning techniques have recently been applied to effort prediction using relatively large software repositories. Several issues nevertheless remain unresolved, and this paper aims to address four of them. First, feature selection methods have often neglected the information-rich variables present in the dataset. Second, important features have been selected through statistical methods, which lack domain knowledge. Third, missing values in the data, which significantly influence the prediction outcome, have not been handled efficiently. Fourth, the majority of the literature has neglected advanced evaluation measures, which thoroughly evaluate the ability of learning models to produce accurate results. To address these issues, this paper proposes a machine learning‐based model that not only allows effective preprocessing of data but also provides highly accurate predictions with a minimal error rate. The purpose is to identify the attributes (predictors) in large software repositories that are most influential in the estimation of effort. In addition, we apply MMRE (mean magnitude of relative error) for better performance analysis.
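The abstract names MMRE without defining it. As a minimal sketch, assuming the standard definition (the mean over all projects of |actual − predicted| / actual), it could be computed as shown below. The function name `mmre` and the effort values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import Sequence


def mmre(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean magnitude of relative error: mean of |actual - predicted| / actual."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one project is required")
    if any(a <= 0 for a in actual):
        # The relative error is undefined when the actual effort is zero.
        raise ValueError("actual effort values must be positive")
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical effort values (e.g. person-hours), for illustration only.
actual_effort = [100.0, 200.0, 400.0]
predicted_effort = [110.0, 180.0, 400.0]

score = mmre(actual_effort, predicted_effort)
print(f"MMRE = {score:.4f}")
```

Lower values indicate predictions that are closer to the actual effort in relative terms. Because each error is divided by the actual effort, a project whose actual effort is small can dominate the average.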
