Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
Author(s) -
Filipe Dwan Pereira,
Samuel C. Fonseca,
Elaine Oliveira,
David Braga Fernandes de Oliveira,
Alexandra I. Cristea,
Leandro Silva Galvão de Carvalho
Publication year - 2020
Publication title -
Revista Brasileira de Informática na Educação
Language(s) - English
Resource type - Journals
eISSN - 2317-6121
pISSN - 1414-5685
DOI - 10.5753/rbie.2020.28.0.723
Subject(s) - artificial intelligence , computer science , machine learning , learning analytics , dropout (neural networks) , notice , mathematics education , psychology , political science , law
Introductory programming can be complex for many students, and these courses have high failure and dropout rates. A potential way to tackle this problem is to predict student performance at an early stage, as this facilitates human-AI collaboration towards prescriptive analytics, where instructors and monitors are told how to intervene and support students when early intervention is crucial. However, the literature states that there is no reliable predictor yet for programming students’ performance, since even large-scale analyses of multiple features have resulted in only limited predictive power. Deep Learning (DL), in turn, can provide high-quality results on large amounts of data and complex problems. We therefore employed DL for early prediction of students’ performance, using data collected in the very first two weeks of introductory programming courses offered to a total of 2058 students over 6 semesters (longitudinal study). We compared our results with the state of the art, an Evolutionary Algorithm (EA) that automatically creates and optimises machine learning pipelines. Our DL model achieved an average accuracy of 82.5%, which is statistically superior to the model constructed and optimised by the EA (p-value << 0.05 even with Bonferroni correction). In addition, we adapted the DL model into a stacking ensemble for continuous prediction. The resulting regression model explained ~62% of the final grade variance. In closing, we provide results on the interpretation of our regression model to understand the leading factors of success and failure in introductory programming.
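The two statistics quoted in the abstract, a Bonferroni-corrected significance level and the share of grade variance explained by a regression model, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the number of comparisons, and the toy grades below are hypothetical and chosen only for illustration.

```python
def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance level under Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


def r_squared(y_true: list[float], y_pred: list[float]) -> float:
    """Coefficient of determination: fraction of the variance in y_true
    that is explained by the predictions y_pred."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: variance around the mean grade.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("y_true has zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical example: 6 pairwise comparisons at an overall alpha of 0.05.
threshold = bonferroni_threshold(0.05, 6)

# Hypothetical final grades and predictions for four students.
grades = [2.0, 5.0, 7.0, 9.0]
predicted = [3.0, 5.0, 6.0, 9.5]
explained = r_squared(grades, predicted)
```

Under this reading, a value of about 0.62 from `r_squared` would correspond to the ~62% of final grade variance reported above, and a result counts as significant after correction only if its p-value falls below `threshold`.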