Evaluating the Performance of Machine Learning Classifier Algorithms for Software Estimation in Software Development Projects

Muhammad Adeel Mannan; Rohail Qamar; Iqbal Uddin Khan; Afzal Hussain; Saad Ahmed; Jahangir Khan

doi:10.21015/vtse.v12i1.1770

Authors

Muhammad Adeel Mannan Department of Computing, Faculty of Engineering Sciences & Technology, Hamdard University, Karachi, Pakistan https://orcid.org/0000-0002-0811-4753
Rohail Qamar Department of Computer Science & Information Technology, NED University of Engineering & Technology, Karachi, Pakistan https://orcid.org/0000-0001-8697-6706
Iqbal Uddin Khan Department of Computing, Faculty of Engineering Sciences & Technology, Hamdard University, Karachi, Pakistan https://orcid.org/0000-0002-8296-6735
Afzal Hussain Department of Computing, Faculty of Engineering Sciences & Technology, Hamdard University, Karachi, Pakistan https://orcid.org/0009-0001-8217-2839
Saad Ahmed Department of Computer Science, IQRA University, Karachi, Pakistan https://orcid.org/0000-0001-6121-8124
Jahangir Khan Faculty of Engineering Sciences & Technology, Hamdard University, Karachi, Pakistan https://orcid.org/0009-0004-9265-6839

DOI:

https://doi.org/10.21015/vtse.v12i1.1770

Abstract

The main aim of this research is to rank the best-performing features for classifying a software estimation dataset using SVM, Naïve Bayes, Random Forest, Decision Tree, and KNN classifiers, and to evaluate the accuracy of each. The classification process involves two steps: first, the dataset is analyzed with all attributes; second, information gain is used to rank the attributes, and only the highly ranked ones are used to build the classification model. Using several folds of cross-validation, we assess the accuracy ranking of the SVM, Naïve Bayes, Decision Tree, Random Forest, and KNN classifiers.
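The attribute-ranking step described in the abstract can be illustrated with a minimal sketch. It assumes discrete attribute values (continuous attributes would first need to be binned) and uses invented toy data; the attribute names, values, and the `rank_attributes` helper are hypothetical and are not taken from the paper, whose exact procedure the abstract does not specify.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(column, labels):
    """Label entropy minus the expected entropy after splitting on one discrete attribute."""
    groups = defaultdict(list)
    for value, label in zip(column, labels):
        groups[value].append(label)
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(rows, labels, names):
    """Return (name, gain) pairs sorted from most to least informative."""
    gains = [
        (name, information_gain([row[i] for row in rows], labels))
        for i, name in enumerate(names)
    ]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: names and values are invented for illustration only.
names = ["team_size", "language", "reuse"]
rows = [
    ("small", "java", "yes"),
    ("small", "python", "no"),
    ("large", "java", "yes"),
    ("large", "python", "no"),
]
labels = ["low", "low", "high", "high"]

ranking = rank_attributes(rows, labels, names)
# Keep only the highest-ranked attribute(s) for model building.
top_k = [name for name, _ in ranking[:1]]
```

In this toy case only `team_size` separates the two effort classes, so it ranks first. In a fuller pipeline, the retained attributes would then be passed to each classifier and scored with k-fold cross-validation.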

