TY - JOUR
T1 - Using machine learning to predict factors affecting academic performance
T2 - the case of college students on academic probation
AU - Al-Alawi, Lamees
AU - Al Shaqsi, Jamil
AU - Tarhini, Ali
AU - Al-Busaidi, Adil S.
N1 - Publisher Copyright: © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2023
Y1 - 2023
AB - This study employs supervised machine learning algorithms to examine factors that negatively impacted academic performance among college students on probation (underperforming students). We applied the Knowledge Discovery in Databases (KDD) methodology to a sample of N = 6514 college students spanning 11 years (2009 to 2019), provided by a major public university in Oman. We used the Information Gain (InfoGain) algorithm to select the most informative features and used ensemble methods, including Logit Boost, Vote, and Bagging, to compare accuracy across more robust algorithms. The algorithms were evaluated using performance metrics such as accuracy, precision, recall, F-measure, and the ROC curve, and were validated using 10-fold cross-validation. The study revealed that the main factors affecting student academic achievement are study duration at the university and prior performance in secondary school. In the experimental results, these features were consistently ranked as the top factors that negatively impacted academic performance. The study also indicated that gender, estimated graduation year, cohort, and academic specialization contributed significantly to whether a student was on probation. Domain experts and other students were involved in verifying some of the results. The theoretical and practical implications of this study are discussed.
KW - Academic under probation
KW - Data Mining
KW - Education Data Mining
KW - Higher education
KW - Oman
KW - Predictive models
KW - Student Academic performance
KW - Supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85149778890&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85149778890&partnerID=8YFLogxK
U2 - 10.1007/s10639-023-11700-0
DO - 10.1007/s10639-023-11700-0
M3 - Article
AN - SCOPUS:85149778890
SN - 1360-2357
JO - Education and Information Technologies
JF - Education and Information Technologies
ER -