The accuracy of machine learning models depends heavily on hyperparameter tuning
Published in Computational Sciences
Hyperparameter Tuning and Model Optimization
Hyperparameters play a pivotal role in determining the predictive performance of machine learning models. They help balance overfitting and underfitting by controlling the influence of the independent features, thereby preventing extreme model behavior. Both manual tuning and automated techniques are employed to identify the optimal combination of hyperparameters and enhance model accuracy.
This study investigates hyperparameter optimization of machine learning models (Logistic Regression and Random Forest) using several tuning methods: Randomized Search, Grid Search, a Genetic Algorithm, Bayesian Optimization, and Optuna. The primary goal was to identify the configuration with the highest predictive accuracy for student grade classification.
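For illustration, a minimal Optuna sketch of such a search is given below. The synthetic data, search ranges, and scoring choice are assumptions for the example, not the study's actual configuration.

```python
import optuna
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the student dataset.
X, y = make_classification(n_samples=711, n_features=10, random_state=42)

def objective(trial):
    # Search ranges are illustrative, not the study's configuration.
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 500),
        "criterion": trial.suggest_categorical("criterion", ["gini", "entropy"]),
        "max_features": trial.suggest_categorical("max_features", ["sqrt", "log2"]),
        "min_samples_leaf": trial.suggest_int("min_samples_leaf", 1, 20),
    }
    model = RandomForestClassifier(random_state=42, **params)
    return cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params, study.best_value)
```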
Model performance was evaluated using confusion matrices and the area under the Receiver Operating Characteristic curve (ROC-AUC). Among all tuning methods, the Genetic Algorithm achieved the highest classification accuracy (82.5%) and ROC-AUC score (90%). Manual tuning, using 300 estimators, entropy as the split criterion, the square root of the feature count for max features, and a minimum of 10 samples per leaf, yielded 81.1% accuracy, closely matching the performance of the Randomized Search cross-validation algorithm. The default Random Forest model recorded the lowest accuracy at 78%.
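Interpreted as scikit-learn parameters, the manual configuration above might look like the following sketch; the synthetic data and the train/test split are stand-ins for the study's dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 711-record student dataset.
X, y = make_classification(n_samples=711, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# The manually tuned configuration reported above.
rf = RandomForestClassifier(
    n_estimators=300,
    criterion="entropy",
    max_features="sqrt",
    min_samples_leaf=10,
    random_state=42,
)
rf.fit(X_train, y_train)

# Evaluation mirrors the study: confusion matrix and ROC-AUC.
print(confusion_matrix(y_test, rf.predict(X_test)))
print(roc_auc_score(y_test, rf.predict_proba(X_test)[:, 1]))
```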
Although Grid Search achieved high accuracy, it required significantly longer execution time (941.5 seconds) compared to manual tuning (3.66 seconds). These findings highlight the importance of selecting efficient hyperparameter tuning techniques for optimizing machine learning models in student grade prediction tasks.
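Continuing the sketch above, the timing contrast can be reproduced in outline with an illustrative grid (the study's actual grid is not specified here):

```python
import time

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Illustrative grid; the study's actual grid is not specified.
param_grid = {
    "n_estimators": [100, 300, 500],
    "criterion": ["gini", "entropy"],
    "max_features": ["sqrt", "log2"],
    "min_samples_leaf": [1, 5, 10],
}

start = time.perf_counter()
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
search.fit(X_train, y_train)  # X_train, y_train from the previous sketch
print(f"grid search: {time.perf_counter() - start:.1f} s")
print("best parameters:", search.best_params_)
```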
Methods and Data Preparation
The central aim of this research is to validate and compare the accuracy of machine learning models, beginning by separating the target (dependent) variable from the independent features. After receiving ethical approval from the Faculty of Science and Technology at IIS University, sample data were collected from Pokhara University, Nepal.
Sample size estimation was performed using Cochran’s formula (Cochran, 1977), which is suitable for large populations:
n = (Z² × p × q) / e²

Where:
- n is the required sample size,
- p is the estimated population proportion (0.5),
- q = 1 − p,
- Z is the z-score corresponding to the 95% confidence level (1.96),
- e is the desired level of precision (0.05).
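Plugging in the stated values gives a quick check of the formula; the unadjusted result is roughly 384, so the reported figure of 376 presumably reflects an additional adjustment, such as a finite population correction, not detailed here.

```python
# Cochran's formula with the stated values.
Z, p, e = 1.96, 0.5, 0.05
q = 1 - p
n = (Z**2 * p * q) / e**2
print(n)  # 384.16, i.e. about 384 before any adjustment
```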
Applying the formula, the calculated sample size was 376 for passed students. An equal number of failed student records were added, resulting in a total of 752 students from 14 academic programs, including health sciences, engineering, and management, using data from the fiscal year 2022. After removing missing and incomplete records, the final dataset consisted of 711 student records.
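A minimal pandas sketch of the record cleaning and the target/feature separation mentioned earlier follows; the file and column names are hypothetical.

```python
import pandas as pd

# Hypothetical file and column names, for illustration only.
df = pd.read_csv("student_records.csv")

# Remove missing and incomplete records (752 -> 711 in the study).
df = df.dropna()

# Separate the dependent (target) variable from the independent features.
y = df["result"]                  # pass/fail outcome (hypothetical name)
X = df.drop(columns=["result"])
```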
As a preliminary step, logistic regression analysis was performed to examine relationships between dependent and independent variables. The regression model provided confidence intervals, standard errors, t-statistics, and p-values for each feature, allowing interpretation of statistical significance.
The coefficient of determination (R²) was 0.3, indicating that 30% of the variability in student outcomes could be explained by the independent variables. The F-statistic was significant (p = 1.51e-47), suggesting that the overall model was statistically reliable. Coefficients measured the change in the dependent variable resulting from a one-unit change in each independent variable, assuming other variables remain constant. The t-statistics tested the null hypothesis that the coefficients are zero, and the p-values indicated the probability of observing such t-statistics under the null hypothesis. Smaller p-values pointed to stronger evidence against the null hypothesis.
The omnibus test further assessed the skewness and kurtosis of the model residuals to ensure the validity of the regression assumptions.
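The statistics reported here (coefficients with standard errors, t-values, p-values and confidence intervals, R², the F-statistic, and an omnibus test of residual skewness and kurtosis) match the summary output of an ordinary least squares fit in statsmodels; the sketch below works under that assumption, with synthetic data and hypothetical feature names.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Synthetic stand-in with hypothetical feature names.
rng = np.random.default_rng(42)
X = pd.DataFrame(
    rng.normal(size=(711, 3)),
    columns=["attendance", "internal_marks", "study_hours"],
)
y = (0.5 * X["internal_marks"] + rng.normal(size=711) > 0).astype(int)

# The summary reports coefficients, standard errors, t-values,
# p-values, confidence intervals, R², the F-statistic, and the
# omnibus test of residual skewness and kurtosis described above.
model = sm.OLS(y, sm.add_constant(X)).fit()
print(model.summary())
```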