A comparative analysis of ensemble autoML machine learning prediction accuracy of STEM student grade prediction
Published in Bioengineering & Biotechnology
Ensemble AutoML for Multiclass Student Grade Classification
In the early phases of research focused on predictive modeling, statisticians aim to explore the relationships between dependent and independent variables to improve classification outcomes. These relationships can exhibit either positive or negative correlations with target features, each carrying varying levels of reliability. This study addresses a key research gap by identifying an appropriate AutoML (Automated Machine Learning) model for multiclass classification of bachelor’s degree letter grades.
The primary objective is to evaluate the predictive accuracy of an ensemble AutoML approach for classifying student outcomes in science, technology, engineering, and management (STEM) disciplines. The classification is based on students’ academic history, including high school subject grades and internal assessments during their bachelor’s studies, with the goal of predicting final degree outcomes represented as letter grades in a modern multiclass grading system.
From a pool of 78 AutoML-recommended models, nine were selected for fine-tuning and cross-validation. These models were optimized for hyperparameters and evaluated based on performance metrics to determine their effectiveness in multiclass grade prediction. The models' classification accuracy, prediction error rates, and misclassification between training and predicted values were carefully analyzed.
Among the tested models, GBM_4_AutoML_1 and DRF achieved the lowest prediction error rate at 0.28 (28%), followed by XRT at 0.30 (30%), StackedEnsemble_BestOfFamily_5 at 0.31 (31%), GLM at 0.35 (35%), and DeepLearning_grid at 0.56 (56%). Notably, the confusion matrix of the optimized GBM model indicated a perfect (100%) match between predicted grades and actual student outcomes.
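To make the error rates above concrete, the following minimal sketch derives an overall and a per-class error rate from a multiclass confusion matrix. It assumes the error rate is the share of off-diagonal (misclassified) entries; the source does not spell out its exact definition. The function name `error_rates`, the three grade labels, and the counts are made up for illustration and are not the study's data.

```python
def error_rates(confusion, labels):
    """Return (overall_error, per_class_error) for a square confusion matrix.

    confusion[i][j] is the number of students whose actual grade is
    labels[i] and whose predicted grade is labels[j].
    """
    per_class = {}
    total = 0
    correct = 0
    for i, label in enumerate(labels):
        row_total = sum(confusion[i])
        row_correct = confusion[i][i]  # diagonal entry = correct predictions
        per_class[label] = (row_total - row_correct) / row_total if row_total else 0.0
        total += row_total
        correct += row_correct
    overall = (total - correct) / total if total else 0.0
    return overall, per_class


# Illustrative counts only (hypothetical, not taken from the study).
labels = ["A", "B", "C"]
confusion = [
    [8, 2, 0],  # actual A
    [1, 6, 3],  # actual B
    [0, 1, 4],  # actual C
]
overall, per_class = error_rates(confusion, labels)
print(round(overall, 2), per_class)
```

Under this definition, a matrix with every count on the diagonal has an error rate of 0. An error rate of 0.28 and a 100% match would therefore have to come from different evaluations (for example, different data splits), which the text does not specify.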
The scoring history of each model reflected the effectiveness of the tuned hyperparameters. Furthermore, a detailed feature importance analysis was conducted to understand how much each independent variable contributed to predicting the dependent variable. This allowed for a nuanced comparison between true and predicted values in the multiclass classification of STEM student grades, offering comprehensive insight into model performance.
This research aims to compare ensemble models utilizing AutoML for both classification and regression tasks, focusing on predicting student academic outcomes based on prior academic performance and background information.
2. Methods and Data Preprocessing
The primary phase of this research commenced with the determination of an appropriate sample size using the following formula for a finite population:
$$n = \frac{N Z^2 p(1-p)}{e^2 (N-1) + Z^2 p(1-p)}$$

Where:
- $n$ = required sample size
- $Z$ = Z-score (1.96 for 95% confidence level)
- $p$ = estimated population proportion
- $e$ = margin of error
- $N$ = total population size
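The finite-population sample size calculation can be sketched in a few lines of Python. This assumes the common form n = N·Z²·p(1−p) / (e²(N−1) + Z²·p(1−p)), rounded up to a whole student. The function name and the example inputs are hypothetical; the source does not report the population size, proportion, or margin of error it used.

```python
import math


def finite_population_sample_size(N, p=0.5, e=0.05, z=1.96):
    """Required sample size for a finite population of size N.

    Assumed formula: n = N*z^2*p*(1-p) / (e^2*(N-1) + z^2*p*(1-p)).
    p is the estimated population proportion, e the margin of error,
    and z the Z-score (1.96 corresponds to a 95% confidence level).
    """
    if N <= 0:
        raise ValueError("N must be positive")
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if e <= 0:
        raise ValueError("e must be positive")
    variance_term = z ** 2 * p * (1 - p)
    n = N * variance_term / (e ** 2 * (N - 1) + variance_term)
    return math.ceil(n)  # round up so the target precision is still met


# Hypothetical inputs, not the study's values.
example_n = finite_population_sample_size(N=10_000, p=0.5, e=0.05)
print(example_n)  # 370
```

Choosing p = 0.5 maximizes p(1−p) and so gives the most conservative (largest) sample size when the true proportion is unknown.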
Based on this formula, the final computed sample size for a finite population was approximately 491 students [30]. Accordingly, data were collected from bachelor-level students across Health, Engineering, Management, and Social Sciences disciplines. These students were enrolled in 14 distinct academic programs at constituent colleges of a public university, including: BBA, BBABI, BE Civil, BE Rural, BE Electronics, BE Computer, BE Software, B.Pharm, BSc Nursing, BMLT, BPH, Physiotherapy, BDevs, and BECS.
Collected data included prior academic performance such as high school grades, subject-wise scores (Physics, Chemistry, English, Accounts, Economics, etc.), average marks from the higher secondary level (+2), as well as demographic details like parental education and the type of school attended. Data were gathered through designated class representatives of each program.
In the second phase, internal assessment grades for the ongoing semester were collected. Final university examination results from 2022 were then matched with the internal records using VLOOKUP in Microsoft Excel, with the university roll number as the lookup key. To maintain ethical standards and ensure confidentiality, personally identifiable information such as student names and registration numbers was anonymized and excluded from the dataset.
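The matching step above was done with VLOOKUP in Excel. As a rough equivalent, the sketch below performs the same exact-match lookup on the roll number in plain Python and then drops identifying fields. All column names (`roll_no`, `name`, `registration_no`, `internal_grade`, `final_grade`) and the sample rows are hypothetical, since the source does not describe its spreadsheet layout.

```python
def match_records(internal_rows, final_rows, key="roll_no",
                  pii_fields=("name", "registration_no")):
    """Join internal assessments to final results on `key`, then drop PII.

    Returns (matched_rows, unmatched_keys). Unlike VLOOKUP, which silently
    returns the first hit, duplicate keys in `final_rows` raise an error.
    """
    final_by_key = {}
    for row in final_rows:
        if row[key] in final_by_key:
            raise ValueError(f"duplicate {key} in final results: {row[key]}")
        final_by_key[row[key]] = row

    matched, unmatched = [], []
    for row in internal_rows:
        final = final_by_key.get(row[key])
        if final is None:
            unmatched.append(row[key])  # no final result for this student
            continue
        merged = {**row, **final}
        for field in pii_fields:
            merged.pop(field, None)  # remove identifying columns
        matched.append(merged)
    return matched, unmatched


# Hypothetical sample rows for illustration only.
internal = [
    {"roll_no": 101, "name": "Student A", "registration_no": "R-1", "internal_grade": "B+"},
    {"roll_no": 102, "name": "Student B", "registration_no": "R-2", "internal_grade": "A"},
    {"roll_no": 103, "name": "Student C", "registration_no": "R-3", "internal_grade": "B"},
]
final = [
    {"roll_no": 101, "final_grade": "A-"},
    {"roll_no": 102, "final_grade": "A"},
]
matched, unmatched = match_records(internal, final)
print(matched, unmatched)
```

Returning the unmatched roll numbers makes lookup failures visible, which corresponds to checking for `#N/A` cells after a VLOOKUP. If the roll number should also be removed from the final dataset, it can be added to `pii_fields`.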
This study seeks to evaluate and validate the performance of various AutoML-driven ensemble models in predicting bachelor-level academic outcomes for students in Health, Engineering, and related domains.