
Effect of Dimensionality Reduction on Prediction Accuracy of Effort of Agile Projects Using Principal Component Analysis
Author(s) - Manju Vyas, Naveen Hemrajani
Publication year - 2021
Publication title - IOP Conference Series: Materials Science and Engineering
Language(s) - English
Resource type - Journals
eISSN - 1757-899X
pISSN - 1757-8981
DOI - 10.1088/1757-899x/1099/1/012008
Subject(s) - principal component analysis , python (programming language) , dimensionality reduction , agile software development , computer science , mean squared error , software , data mining , artificial intelligence , curse of dimensionality , machine learning , computation , feature (linguistics) , dimension (graph theory) , statistics , mathematics , algorithm , software engineering , linguistics , philosophy , pure mathematics , programming language , operating system
Agile frameworks for software development have gained wide recognition in the software industry in recent years because they focus on rapid incremental delivery, lower risk, and customer satisfaction. Effort must be predicted at the early stages of development so that the project can be completed within its time and cost constraints. Several researchers have studied this problem in recent years, and it has been observed that effort prediction suffers from high feature dimensionality. Prediction accuracy may therefore be improved by reducing the number of feature dimensions. In this paper, Principal Component Analysis (PCA) is used to reduce the feature dimensions for effort estimation. PCA reduces the dimensionality of the attribute set to identify the key attributes, namely those most highly correlated with effort. The methodology shows the effect of PCA on the original dataset, and results are obtained by applying various machine learning techniques before and after PCA. The comparison metrics are Mean Magnitude of Relative Error (MMRE), Root Mean Square Error (RMSE), and Prediction Accuracy (PRED(25)). The lower error values and higher prediction accuracy indicate that the models perform better when PCA is applied to the dataset. All computations and implementations in this paper were carried out in Python using the scikit-learn library.
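
The pre- and post-PCA comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the data is synthetic, and the choice of linear regression, the 95% explained-variance threshold, and the helper names mmre, rmse, pred, make_synthetic_projects, and run_comparison are all assumptions made for the example. MMRE and PRED(25) are computed using their usual definitions in the effort-estimation literature (mean of |actual - predicted| / actual, and the percentage of projects whose relative error is at most 25%).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def mmre(actual, predicted):
    """Mean Magnitude of Relative Error; assumes all actual values are non-zero."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted) / actual))


def rmse(actual, predicted):
    """Root Mean Square Error."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def pred(actual, predicted, level=25):
    """PRED(level): percentage of cases with relative error <= level percent."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    mre = np.abs(actual - predicted) / actual
    return float(np.mean(mre <= level / 100.0) * 100.0)


def make_synthetic_projects(n_projects=200, n_features=12, seed=0):
    """Hypothetical stand-in for a project dataset: 12 correlated features
    generated from 3 latent factors, plus a strictly positive effort value."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n_projects, 3))
    mixing = rng.normal(size=(3, n_features))
    X = latent @ mixing + 0.1 * rng.normal(size=(n_projects, n_features))
    effort = 100.0 + 10.0 * latent[:, 0] + 5.0 * latent[:, 1] + rng.normal(size=n_projects)
    return X, effort


def evaluate(model, X_train, X_test, y_train, y_test):
    """Fit the model and report the three comparison metrics on the test split."""
    model.fit(X_train, y_train)
    y_hat = model.predict(X_test)
    return {
        "MMRE": mmre(y_test, y_hat),
        "RMSE": rmse(y_test, y_hat),
        "PRED(25)": pred(y_test, y_hat),
    }


def run_comparison():
    X, y = make_synthetic_projects()
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0
    )
    # Same regressor with and without a PCA step (assumed: keep 95% of variance).
    baseline = make_pipeline(StandardScaler(), LinearRegression())
    with_pca = make_pipeline(
        StandardScaler(),
        PCA(n_components=0.95, svd_solver="full"),
        LinearRegression(),
    )
    return {
        "pre-PCA": evaluate(baseline, X_train, X_test, y_train, y_test),
        "post-PCA": evaluate(with_pca, X_train, X_test, y_train, y_test),
    }


results = run_comparison()
for name, scores in results.items():
    print(name, scores)
```

Placing the scaler and PCA inside a pipeline means both are fitted on the training split only, so the test split does not influence the projection. Whether the post-PCA scores improve depends on the dataset and the model, so the numbers printed for synthetic data say nothing about the results reported in the paper.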