Impact of Task Clarity on Project Duration Prediction in Competitive Crowdsourced Software Development
Author(s) -
Kareem Ullah,
Imran Mumtaz,
Muhammad Azam Zia,
Muhammad Kashif
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3621482
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
Competitive crowdsourcing in software development (CCSD) is an emerging paradigm that delivers innovative, cost-effective, and high-quality software solutions within constrained timelines. Accurate prediction of project duration is essential for effective scheduling, resource allocation, and risk mitigation in such dynamic environments. This study addresses the challenge of estimating duration, motivated by the frequent inaccuracies observed on crowdsourcing platforms, which often result in suboptimal planning and delivery delays. We propose a model for predicting project duration using task clarity (PDTC), which uses semantic representations of task descriptions to improve prediction accuracy. Using the Topcoder dataset, task descriptions were preprocessed, annotated with clarity labels, and transformed into semantic embeddings using a BERT model. Multiple machine learning models, including SVM and Naïve Bayes, were evaluated and compared against the proposed PDTC model using both classification and regression metrics. Experimental results demonstrate that incorporating task clarity, particularly through BERT-based embeddings, significantly improves predictive performance, achieving an accuracy of 0.97 and an R² score of 0.97. The PDTC model consistently outperformed traditional ML techniques and baseline models such as Zero Rule and Random Prediction. These findings underscore the importance of well-articulated task descriptions and deep semantic modeling for robust project duration estimation in competitive crowdsourcing environments.
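The abstract names Zero Rule and Random Prediction as baselines and reports both accuracy and R². The following is a minimal sketch of how such baselines and metrics are commonly computed, not the authors' implementation. The duration classes ("short", "medium", "long"), the toy values, and all function names are hypothetical, and the paper's exact definition of Random Prediction may differ from the uniform draw assumed here.

```python
import random
from collections import Counter


def zero_rule_classify(train_labels, n_test):
    """Zero Rule: predict the most frequent training label for every test item."""
    majority = Counter(train_labels).most_common(1)[0][0]
    return [majority] * n_test


def random_classify(train_labels, n_test, seed=0):
    """Random baseline (assumed here to draw uniformly from the distinct labels)."""
    rng = random.Random(seed)
    classes = sorted(set(train_labels))
    return [rng.choice(classes) for _ in range(n_test)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy, made-up duration classes (not taken from the Topcoder dataset).
train_labels = ["short", "short", "medium", "long", "short"]
test_labels = ["short", "medium", "short", "long"]

zr_pred = zero_rule_classify(train_labels, len(test_labels))
zr_acc = accuracy(test_labels, zr_pred)

rnd_pred = random_classify(train_labels, len(test_labels))
rnd_acc = accuracy(test_labels, rnd_pred)

# Regression form of Zero Rule: always predict the mean training duration.
train_days = [4.0, 6.0, 8.0]  # made-up durations in days
test_days = [5.0, 6.0, 7.0]
mean_pred = [sum(train_days) / len(train_days)] * len(test_days)
zr_r2 = r2_score(test_days, mean_pred)

print(zr_acc, rnd_acc, zr_r2)
```

In this toy setup the mean-predicting baseline scores an R² of 0 because the training mean equals the test mean, which is the usual reference point against which a model's reported R² is read.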