Evaluating the Performance of Machine Learning Classifier Algorithms for Software Estimation in Software Development Projects
DOI: https://doi.org/10.21015/vtse.v12i1.1770
Abstract
The main aim of this research is to rank the best-performing features for classifying a software estimation dataset with SVM, Naïve Bayes, random forest, decision tree, and KNN classifiers, and to evaluate the accuracy of those classifiers. The classification process involves two steps: first, the dataset is analyzed with all attributes; second, information gain is used to rank the attributes, and only the highly ranked ones are used to build the classification model. Using several folds of cross-validation, we assess the accuracy ranking of the SVM, Naïve Bayes, decision tree, random forest, and KNN classifiers.
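The two-step process described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the abstract does not name the tooling or the dataset's attributes, so the toy rows, the attribute names (`complexity`, `team`, `lang`), the mismatch-count KNN, and the interleaved fold split are all assumptions made for illustration. Information gain is computed here as the entropy of the class labels minus the expected entropy after splitting on an attribute.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, labels, attr):
    """Label entropy minus the expected entropy after splitting on attr."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(rows, labels):
    """Return (attribute, gain) pairs, most informative first."""
    gains = {a: information_gain(rows, labels, a) for a in rows[0].keys()}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


def knn_predict(train_rows, train_labels, query, attrs, k=3):
    """Majority vote among the k rows with the fewest mismatched attributes."""
    dists = sorted(
        (sum(r[a] != query[a] for a in attrs), i) for i, r in enumerate(train_rows)
    )
    votes = Counter(train_labels[i] for _, i in dists[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(rows, labels, attrs, folds=4, k=3):
    """Mean accuracy over interleaved folds (row i is tested in fold i % folds)."""
    scores = []
    for f in range(folds):
        test_idx = [i for i in range(len(rows)) if i % folds == f]
        train_idx = [i for i in range(len(rows)) if i % folds != f]
        train_rows = [rows[i] for i in train_idx]
        train_labels = [labels[i] for i in train_idx]
        hits = sum(
            knn_predict(train_rows, train_labels, rows[i], attrs, k) == labels[i]
            for i in test_idx
        )
        scores.append(hits / len(test_idx))
    return sum(scores) / folds


# Hypothetical toy data: attribute names and values are invented for this sketch.
rows = [
    {"complexity": "high", "team": "large", "lang": "java"},
    {"complexity": "low", "team": "small", "lang": "java"},
    {"complexity": "high", "team": "large", "lang": "python"},
    {"complexity": "low", "team": "small", "lang": "python"},
    {"complexity": "high", "team": "small", "lang": "java"},
    {"complexity": "low", "team": "large", "lang": "java"},
    {"complexity": "high", "team": "large", "lang": "python"},
    {"complexity": "low", "team": "small", "lang": "python"},
]
labels = ["high", "low", "high", "low", "high", "low", "high", "low"]  # effort class

# Step 1: rank attributes by information gain.
ranking = rank_attributes(rows, labels)

# Step 2: keep only the top-ranked attribute and evaluate with cross-validation.
top_attrs = [name for name, _ in ranking[:1]]
accuracy = cross_val_accuracy(rows, labels, top_attrs, folds=4, k=3)

for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
print("top attributes:", top_attrs, "accuracy:", accuracy)
```

On this toy data the attribute that fully determines the effort class ranks first and the unrelated one ranks last, which is the behavior the ranking step relies on. With real, numeric project data the attributes would first need discretizing, and an established library would be a more sensible choice than this hand-rolled version.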
License
This work is licensed under a Creative Commons Attribution License CC BY