Optimization of classifiers ensemble construction: case study of educational data mining


Choosing the best method for predicting educational outcomes is a major challenge of Educational Data Mining (EDM). This EDM paper compares student performance forecasts produced by individual binary classifiers (Naïve Bayes, Decision Tree, Multi-Layer Perceptron, Nearest Neighbors and Support Vector Machine algorithms) and by their ensembles, trained and tested on a dataset containing up to 38 input attributes (weekly attendance in mathematics, intensity of study, interim assessment) for 84 training and 36 test secondary school students from Nasiriyah, Iraq. A two-class outcome was predicted: passing or failing the final exam. The comparison was carried out in three stages. In the first stage, the dependence of the classifiers on the input attributes was investigated: forecast accuracy rose from 61.1-77.7% when all 38 attributes were used to 75.0-80.5% when each base classifier was trained on five attributes pre-selected by the Ranker Search method. In the second stage, the AdaBoost M1 procedure was applied to each base classifier, producing five homogeneous ensembles; only two of them showed a small accuracy gain of about 3% over the corresponding stand-alone classifier, and the overall maximal prediction accuracy remained 80.5%. Finally, the heterogeneous ensemble of the five base classifiers combined by simple voting reached 77.7%, whereas the heterogeneous meta-ensemble of the five AdaBoost homogeneous ensembles combined by simple voting reached 83.3%. We therefore conclude that improving the quality of the individual classifiers or of the homogeneous ensembles allows more powerful EDM prediction methods to be constructed.
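The three-stage comparison can be outlined in code. The sketch below is only an illustration with scikit-learn analogues: the study itself relied on Weka-style Ranker Search attribute selection and AdaBoost M1, so the mutual-information ranking, the AdaBoostClassifier and VotingClassifier calls, the hyperparameters, the top-5 attribute cutoff and the train/test handling here are assumptions rather than the authors' exact settings.

```python
# Illustrative sketch of the three comparison stages (not the original Weka workflow).
from sklearn.ensemble import AdaBoostClassifier, VotingClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def base_classifiers():
    """The five base learners compared in the paper (hyperparameters assumed)."""
    return {
        "NaiveBayes": GaussianNB(),
        "DecisionTree": DecisionTreeClassifier(random_state=0),
        "MLP": MLPClassifier(max_iter=2000, random_state=0),
        "kNN": KNeighborsClassifier(n_neighbors=5),
        "SVM": SVC(probability=True, random_state=0),
    }


def evaluate(model, X_train, y_train, X_test, y_test):
    """Fit on the training students, score on the held-out test students."""
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))


def run_comparison(X_train, y_train, X_test, y_test, k_best=5):
    scores = {}

    # Stage 1: rank the 38 attributes (information-gain analogue of Ranker
    # Search) and keep the top k_best before training each stand-alone learner.
    for name, clf in base_classifiers().items():
        pipe = make_pipeline(SelectKBest(mutual_info_classif, k=k_best), clf)
        scores[f"stage1/{name}"] = evaluate(pipe, X_train, y_train, X_test, y_test)

    # Stage 2: homogeneous boosted ensembles. scikit-learn's AdaBoost requires
    # learners that accept sample_weight, so only NB, the tree and the SVM are
    # boosted in this sketch; Weka's AdaBoost M1 can resample for the others.
    boosted = {}
    for name, clf in base_classifiers().items():
        if name in ("MLP", "kNN"):
            continue
        boosted[name] = make_pipeline(
            SelectKBest(mutual_info_classif, k=k_best),
            AdaBoostClassifier(clf, n_estimators=50, random_state=0),
        )
        scores[f"stage2/AdaBoost-{name}"] = evaluate(
            boosted[name], X_train, y_train, X_test, y_test
        )

    # Stage 3a: heterogeneous ensemble - simple (hard) voting over the
    # stand-alone base classifiers.
    voters = [
        (n, make_pipeline(SelectKBest(mutual_info_classif, k=k_best), c))
        for n, c in base_classifiers().items()
    ]
    scores["stage3/voting-base"] = evaluate(
        VotingClassifier(voters, voting="hard"), X_train, y_train, X_test, y_test
    )

    # Stage 3b: heterogeneous meta-ensemble - simple voting over the
    # homogeneous AdaBoost ensembles built in stage 2.
    scores["stage3/voting-boosted"] = evaluate(
        VotingClassifier(list(boosted.items()), voting="hard"),
        X_train, y_train, X_test, y_test,
    )
    return scores
```

Fed with a 38-attribute training matrix for the 84 students and the 36-student test set, `run_comparison` returns per-model accuracies that mirror the paper's three-stage table; the voting combiners, like the simple voting described in the abstract, weight every member equally.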


Base classifiers, educational data mining, Ranker Search attribute selection method, adaptive boosting (AdaBoost), heterogeneous ensembles

Short URL: https://sciup.org/147232279

IDR: 147232279   |   DOI: 10.14529/ctcr190414

Brief communication