
Learning Component Size Distributions for Software Cost Estimation: Models Based on Arithmetic and Shifted Geometric Means Rules
Author(s) - Shachi Sharma, Parag C. Pendharkar, Karmeshu
Publication year - 2021
Publication title - IEEE Transactions on Software Engineering
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.857
H-Index - 169
eISSN - 1939-3520
pISSN - 0098-5589
DOI - 10.1109/tse.2021.3139216
Subject(s) - computing and processing
Understanding the software size distribution is critical to software cost estimation with the COCOMO model and to the design of reliable production function models. This paper proposes and validates a theoretical framework, based on the maximization of Shannon entropy, for learning the component size distribution of software systems when only partial information about the moments is available. Three specifications of the moment constraints are considered: the shifted geometric mean, the arithmetic mean, or both the geometric and arithmetic means. The models are validated on 30 real datasets. The analysis reveals that software systems whose component sizes exhibit power-law behavior are governed by the shifted geometric mean, whereas systems whose component size distribution shows exponential behavior are described by the arithmetic mean. A third type of software system is also considered, in which the component sizes follow a gamma distribution; such systems are characterized by specifying both the arithmetic and geometric means. The study underlines that software written in modern object-oriented programming languages adheres to a power-law distribution, indicating the existence of team synergies that lead to substantial containment of software costs compared with the use of traditional procedural programming languages.
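The link between moment constraints and distribution families can be illustrated with a minimal sketch. It relies on two standard maximum-entropy results: on x >= 0, fixing only the arithmetic mean yields an exponential density, and fixing only the mean of ln(x + c) yields a shifted power-law (Lomax-type) density. The function names, the parameterization, and the assumption that the shift c is known in advance are illustrative choices made here, not the paper's notation or estimation procedure. The gamma case (both means fixed) is omitted because it requires a numerical solve for the shape parameter.

```python
import math
import random


def fit_exponential(sizes):
    """Rate of the max-entropy density under an arithmetic-mean constraint.

    Density form: f(x) = rate * exp(-rate * x), x >= 0, with rate = 1 / mean.
    """
    mean = sum(sizes) / len(sizes)
    return 1.0 / mean


def fit_shifted_power_law(sizes, shift):
    """Exponent of the max-entropy density under a shifted-geometric-mean constraint.

    Density form (assumed parameterization):
        f(x) = (alpha / shift) * (1 + x / shift) ** -(alpha + 1), x >= 0.
    Since ln(1 + x / shift) is exponentially distributed with rate alpha,
    E[ln(x + shift)] = ln(shift) + 1 / alpha, which is inverted below.
    The shift is treated as known (an assumption of this sketch).
    """
    mean_log = sum(math.log(x + shift) for x in sizes) / len(sizes)
    return 1.0 / (mean_log - math.log(shift))


def sample_shifted_power_law(alpha, shift, n, rng):
    """Hypothetical helper: draw synthetic sizes by inverse-CDF sampling."""
    return [shift * ((1.0 - rng.random()) ** (-1.0 / alpha) - 1.0) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)  # synthetic data only, not the paper's datasets
    exp_sizes = [rng.expovariate(1 / 50.0) for _ in range(20000)]
    pl_sizes = sample_shifted_power_law(1.5, 10.0, 20000, rng)
    print("estimated exponential rate:", fit_exponential(exp_sizes))
    print("estimated power-law exponent:", fit_shifted_power_law(pl_sizes, 10.0))
```

On synthetic data of this kind, each estimator should recover the generating parameter up to sampling error. Deciding which constraint set actually describes a real system would additionally need a goodness-of-fit comparison, which this sketch does not attempt.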