Mathematics and Statistics Vol. 13(5), pp. 279 - 296
DOI: 10.13189/ms.2025.130503


Refining Wage Predictions with Machine Learning and Bayesian Optimization


Blerina Boçi 1, Aurora Simoni 2,*
1 Department of Mathematics, Faculty of Information Technology, University 'Aleksander Moisiu', Albania
2 Department of Applied Mathematics, Faculty of Natural Science, University of Tirana, Albania

ABSTRACT

Machine learning (ML) methods are central to predictive modeling, where they use historical data to build algorithms capable of forecasting future outcomes. Hyperparameter optimization is essential for selecting the best model configuration for a specific problem, with the aim of minimizing prediction error and improving performance. This research examined the performance of three machine learning regression models: support vector regression (SVR), extreme gradient boosting (XGBoost), and random forest (RF). Their effectiveness was measured using evaluation indicators, including mean squared error (MSE), mean absolute error (MAE), the coefficient of determination (R²), and adjusted R². Bayesian Optimization (BO) was applied to identify the optimal hyperparameters for the SVR, XGBoost, and RF models to enhance their predictive capabilities. The models were tested on a dataset from the Albanian Institute of Statistics (INSTAT), which included the average gross monthly wage per employee by occupational group. To ensure robustness and avoid temporal leakage, we used time-aware cross-validation (TimeSeriesSplit) for model validation, which preserves the chronological structure of the dataset and better reflects real-world forecasting scenarios. In addition, we applied bootstrapped confidence intervals to all evaluation metrics on the test set, offering a more reliable assessment of model performance. These methodological choices enhance the statistical credibility of the results. Among the evaluated models, the Bayesian-optimized SVR using the expected improvement (EI) acquisition function delivered the highest predictive accuracy, achieving an R² value of 0.9955, an adjusted R² of 0.9936, and low error metrics (MAE = 0.0416, MSE = 0.0042). With BO used for hyperparameter tuning, the SVR model predicted average gross monthly wages with high accuracy on this dataset.
These findings suggest that SVR, when optimized using BO, is a powerful tool for wage prediction tasks. Despite the limited number of features, this study demonstrates how Bayesian Optimization can still offer valuable improvements in model accuracy, especially in constrained real-world settings. However, further research is needed to determine whether these results generalize to other datasets and domains.
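
The evaluation protocol described in the abstract (chronological splits, the four metrics, and bootstrapped confidence intervals) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the function names, the percentile bootstrap, the default of 1000 resamples, and the toy numbers are all assumptions made for illustration. The split function mimics the expanding-window scheme used by scikit-learn's `TimeSeriesSplit` without depending on the library.

```python
import random
from statistics import mean


def mse(y_true, y_pred):
    """Mean squared error."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def adjusted_r2(r2_value, n_samples, n_features):
    """Adjusted R^2, penalising the number of predictors."""
    return 1 - (1 - r2_value) * (n_samples - 1) / (n_samples - n_features - 1)


def expanding_window_splits(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs in chronological order.

    Hypothetical helper: each training set contains only observations
    that precede the test block, so no future data leaks into training.
    """
    test_size = n_samples // (n_splits + 1)
    for k in range(n_splits):
        test_start = n_samples - (n_splits - k) * test_size
        train_idx = list(range(test_start))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx


def bootstrap_ci(y_true, y_pred, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a test-set metric.

    Assumption: (y_true, y_pred) pairs are resampled with replacement;
    the study may have used a different bootstrap variant.
    """
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx],
                            [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Made-up numbers for illustration only, not INSTAT data.
    y_true = [1.0, 1.2, 1.1, 1.4, 1.5, 1.7, 1.6, 1.9, 2.0, 2.2]
    y_pred = [1.1, 1.1, 1.2, 1.3, 1.6, 1.6, 1.7, 1.8, 2.1, 2.1]
    for train_idx, test_idx in expanding_window_splits(len(y_true), 3):
        print("train:", train_idx, "test:", test_idx)
    print("MAE:", mae(y_true, y_pred))
    print("MAE 95% CI:", bootstrap_ci(y_true, y_pred, mae))
    print("adjusted R^2:", adjusted_r2(r2(y_true, y_pred), len(y_true), 2))
```

In practice the model would be refit on each training fold and scored on the following test block, and the hyperparameter search would use the average fold score as its objective.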

KEYWORDS
Machine Learning, BO, SVR, XGBoost, RF, Hyperparameter

Cite This Paper in IEEE or APA Citation Styles
(a). IEEE Format:
[1] Blerina Boçi, Aurora Simoni, "Refining Wage Predictions with Machine Learning and Bayesian Optimization," Mathematics and Statistics, Vol. 13, No. 5, pp. 279 - 296, 2025. DOI: 10.13189/ms.2025.130503.

(b). APA Format:
Blerina Boçi, Aurora Simoni (2025). Refining Wage Predictions with Machine Learning and Bayesian Optimization. Mathematics and Statistics, 13(5), 279 - 296. DOI: 10.13189/ms.2025.130503.