
Modelling Student’s Performance Using Data Mining Techniques in a Higher Learning Environment in the Pacific
Author(s) -
Ravneil Nand,
Ashneel Chand
Publication year - 2020
Publication title -
International Journal of Neural Networks and Advanced Applications
Language(s) - English
Resource type - Journals
ISSN - 2313-0563
DOI - 10.46300/91016.2020.7.10
Subject(s) - decision tree, computer science, artificial neural network, machine learning, artificial intelligence, data mining, naive bayes classifier, set (abstract data type), relation (database), decision tree learning, data set, educational data mining, tree (set theory), mathematics, mathematical analysis, support vector machine, programming language
Student performance in higher education has become one of the most widely studied areas. Modelling student performance plays a pivotal role in forecasting it, and data mining applications are now among the most widely used techniques for this purpose. Various factors determine student performance. Eight attributes, considered the most influential in determining students’ performance in the Pacific, are used as input. Statistical analysis is carried out to see which attribute has the greatest influence on student performance. In this research, several algorithms, each using a different classification technique, are used to build the classification model. The classification techniques used include Artificial Neural Network, Decision Tree, Decision Table, and Naïve Bayes. The WEKA Explorer application and R software are used for correlation tests between different variables. The dataset used in this research is imbalanced and is later transformed into a balanced set through undersampling. Neural Network is one of the classification techniques that performs well on both the imbalanced and the balanced dataset. Another technique that performs well is Decision Tree. Statistical analysis shows that internal assessment has a weak positive relationship with student performance, while demographic data does not. Further observations on the two types of dataset, as applied to the different classification techniques, are reported in this research.
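
The abstract does not say which undersampling procedure was applied, so the following is only a minimal sketch of one common variant, random undersampling, in which every class is cut down to the size of the smallest class. The function name `random_undersample`, the `label_of` parameter, the seed, and the toy records are all hypothetical and are not taken from the paper or from WEKA.

```python
import random
from collections import defaultdict


def random_undersample(rows, label_of, seed=0):
    """Return a class-balanced subset by randomly dropping rows from larger classes.

    Illustrative only: this is an assumed procedure, not the authors' method.
    """
    rng = random.Random(seed)

    # Group the rows by their class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)

    # Every class is reduced to the size of the smallest class.
    target = min(len(group) for group in by_class.values())

    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))

    # Shuffle so the classes are not stored in contiguous blocks.
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy records: (internal_assessment_mark, outcome).
records = [(mark, "pass") for mark in range(50, 90)] + [
    (mark, "fail") for mark in range(20, 30)
]
balanced = random_undersample(records, label_of=lambda row: row[1])
```

In this toy set, 40 "pass" records and 10 "fail" records go in and 10 of each come out. The trade-off is that the discarded majority-class records are lost to the classifier, which is one reason results on the imbalanced and balanced sets can differ.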