Performance Evaluation of Machine Learning and Traditional Statistical Approaches in Bias Correction of CMIP6 Precipitation Data

Document Type: Research Paper

Authors

1 PhD Student in Water Resources Engineering, Water Science and Engineering Department, Imam Khomeini International University (IKIU), Qazvin, Iran

2 Associate Professor, Water Science and Engineering Department, Imam Khomeini International University (IKIU), Qazvin, Iran

3 Professor, Water Science and Engineering Department, Imam Khomeini International University (IKIU), Qazvin, Iran

10.22059/ijswr.2025.402248.670007

Abstract

Global climate models (GCMs) are key tools for analyzing and projecting future climate trends, but they often exhibit systematic errors (biases) in simulating climatic variables. This study evaluated two traditional statistical bias correction methods, Linear Scaling (LS) and Quantile Mapping (QM), and two machine learning methods, Extreme Gradient Boosting (XGBoost) and Long Short-Term Memory (LSTM) neural networks, for improving precipitation simulated by CMIP6 models from the NEX-GDDP dataset. The evaluation covered the historical period (1961–2014) and the future period (2025–2100) in the Poldokhtar watershed, an area highly sensitive to hydrological variability and prone to frequent destructive floods. The results indicated that raw climate model outputs exhibit significant bias and are unsuitable for direct hydrological applications. The LS method moderately reduced errors and improved performance indices, whereas QM increased RMSE (to as much as 10.3 mm) and decreased NSE (to as low as -2.5). Among the machine learning methods, XGBoost achieved the highest accuracy, raising r to as much as 0.67 and NSE to as much as 0.44 and increasing KGE by more than 0.4, whereas LSTM effectively corrected systematic errors but was limited in reproducing variability and temporal correlation. These findings provide a foundation for future climate change analyses and water resource management in the Poldokhtar basin.

Introduction

Rising greenhouse gas concentrations and the resulting climate change significantly affect water resources and hydrological processes, and extreme events such as floods and droughts impose substantial economic and social costs (Almazroui et al., 2020; Kim et al., 2021; Aryal et al., 2019; UNISDR, 2009; Jongman et al., 2012; Hirabayashi et al., 2021; Houshmand Kouchi et al., 2019). General Circulation Models (GCMs) are the primary tools for projecting climate change, but their coarse spatial resolution and systematic biases mean that their outputs must be downscaled and bias-corrected before use in local and regional studies (Sachindra et al., 2018; Taylor et al., 2012). The CMIP6 project, whose scenarios pair Shared Socioeconomic Pathways (SSPs) with the forcing levels of the earlier Representative Concentration Pathways (RCPs), enables more accurate simulations of future climate change (Eyring et al., 2016; O’Neill et al., 2016). Traditional bias correction methods such as Linear Scaling (LS) and Quantile Mapping (QM) effectively reduce systematic errors but are limited in capturing nonlinear relationships and temporal dependencies (Shiru and Park, 2020; Jaiswal et al., 2022; Heshmati et al., 2025). More recently, machine learning algorithms, including XGBoost and LSTM, have shown improved performance in modeling complex precipitation patterns and have yielded more accurate bias corrections (Li et al., 2023; Tanimu et al., 2024). This study evaluates traditional and machine learning-based bias correction methods applied to CMIP6 precipitation outputs under SSP scenarios in the Poldokhtar watershed, with the aim of reducing uncertainty and improving hydrological predictions and water resources management.

Method

This study focuses on the Poldokhtar watershed, a sub-basin of the Karkheh River in western Iran that covers 2,073 km² and has the Kashkan River as its main stream. The data comprise daily simulations from General Circulation Models (GCMs) in the NEX-GDDP-CMIP6 dataset for the historical period and for the future period under the SSP126, SSP245, SSP370, and SSP585 scenarios. Observed precipitation data from three meteorological stations within the watershed were used to correct biases in the GCM outputs. Four bias correction methods were evaluated: Linear Scaling (LS), Quantile Mapping (QM), the machine learning algorithm XGBoost, and a Long Short-Term Memory (LSTM) recurrent neural network, the last applied to capture long-term temporal dependencies in the precipitation series. Model performance before and after bias correction was assessed using MAE, RMSE, the Pearson correlation coefficient (r), NSE, and KGE. Based on these metrics, the GCMs were ranked at each station, and the cumulative ranks across the three stations were used to select the best-performing model for the watershed. This approach supports more accurate precipitation projections and the analysis of climate change impacts on water resources.
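The two statistical methods named above can be illustrated with a minimal Python sketch. The summary does not state which LS and QM variants were used, so the sketch assumes multiplicative LS (a single ratio of observed to simulated mean) and empirical QM on a fixed set of quantile nodes. The function names and the `n_nodes` parameter are illustrative and do not come from the study.

```python
from bisect import bisect_right
from statistics import mean


def linear_scaling(model_hist, obs_hist, model_target):
    """Multiplicative linear scaling (assumed variant for precipitation).

    The target series is multiplied by the ratio of the observed mean to
    the simulated mean over the calibration period.
    """
    model_mean = mean(model_hist)
    if model_mean == 0:
        raise ValueError("simulated mean is zero; scaling factor undefined")
    factor = mean(obs_hist) / model_mean
    return [p * factor for p in model_target]


def empirical_quantile(sorted_values, q):
    """Linearly interpolated quantile of an ascending list, 0 <= q <= 1."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def quantile_mapping(model_hist, obs_hist, model_target, n_nodes=101):
    """Empirical quantile mapping (assumed variant).

    Simulated and observed quantiles are computed at the same probability
    levels; each target value is mapped from the simulated quantile curve
    onto the observed one by piecewise-linear interpolation.
    """
    levels = [k / (n_nodes - 1) for k in range(n_nodes)]
    model_sorted, obs_sorted = sorted(model_hist), sorted(obs_hist)
    mq = [empirical_quantile(model_sorted, q) for q in levels]
    oq = [empirical_quantile(obs_sorted, q) for q in levels]
    corrected = []
    for p in model_target:
        i = bisect_right(mq, p)
        if i == 0:
            corrected.append(oq[0])   # below the calibrated range
        elif i == n_nodes:
            corrected.append(oq[-1])  # at or above the calibrated maximum
        else:
            # bisect_right guarantees mq[i - 1] <= p < mq[i]
            frac = (p - mq[i - 1]) / (mq[i] - mq[i - 1])
            corrected.append(oq[i - 1] + frac * (oq[i] - oq[i - 1]))
    return corrected
```

In practice both corrections are often fitted separately for each calendar month. This sketch also clamps values above the calibrated maximum to the largest observed quantile, which is a simplification when the method is applied to future extremes.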

Results

The performance of 23 CMIP6 climate models in simulating precipitation was evaluated at three stations: Poldokhtar, Afrineh, and Doab Veisian. The raw model outputs exhibited considerable bias, with high RMSE and MAE values and low correlation and NSE values, performing even worse than the simple observational mean (that is, NSE below zero). Among the bias correction methods, linear scaling (LS) only slightly reduced errors and provided limited improvements in performance metrics, while quantile mapping (QM) generally worsened model performance across most indices. The XGBoost machine learning algorithm significantly reduced systematic errors and improved RMSE, MAE, NSE, and KGE, and CanESM5, MIROC6, and GISS-E2-1-G were among the models selected on this basis. The LSTM approach also reduced mean errors but failed to reproduce the variance and temporal correlation structure of the observations. Future precipitation projections from XGBoost-corrected CanESM5 indicated increases in winter and spring precipitation and notable shifts in the monthly and seasonal distribution of precipitation, particularly under the higher-emission SSP370 and SSP585 scenarios, while summer precipitation remained largely unchanged. These findings indicate that combining climate models with advanced machine learning techniques substantially improves the reconstruction of historical precipitation and the assessment of climate change impacts.
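The remark that raw outputs perform worse than the observational mean follows from the definition of NSE, which is zero when the observed mean is used as the predictor and negative for anything worse. The Python sketch below illustrates this with the standard NSE formula (Nash and Sutcliffe, 1970) and assumes the original KGE formulation of Gupta et al. (2009). The function names are illustrative, and the summary does not state which KGE variant the study used.

```python
from math import sqrt
from statistics import mean, pstdev


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    obs_mean = mean(obs)
    num = sum((s - o) ** 2 for o, s in zip(obs, sim))
    den = sum((o - obs_mean) ** 2 for o in obs)
    return 1.0 - num / den


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated series."""
    mo, ms = mean(obs), mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / sqrt(var_o * var_s)


def kge(obs, sim):
    """Kling-Gupta efficiency, 2009 form (assumed variant).

    Combines correlation (r), variability ratio (alpha) and bias ratio
    (beta) as a Euclidean distance from the ideal point (1, 1, 1).
    """
    r = pearson_r(obs, sim)
    alpha = pstdev(sim) / pstdev(obs)
    beta = mean(sim) / mean(obs)
    return 1.0 - sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


obs = [1.0, 2.0, 3.0, 4.0]
mean_predictor = [mean(obs)] * len(obs)   # constant series at the observed mean
reversed_series = [4.0, 3.0, 2.0, 1.0]    # anti-correlated with the observations

nse_mean = nse(obs, mean_predictor)       # equals 0 by construction
nse_reversed = nse(obs, reversed_series)  # negative: worse than the mean
```

Because KGE separates correlation, variability, and bias, it can show which of the three a correction method improves, which is relevant to the LSTM result reported above: mean errors fall while variance and temporal correlation are not reproduced.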

Conclusions

This study evaluated the performance of 23 CMIP6 climate models in simulating precipitation at three stations: Poldokhtar, Afrineh, and Doab Veisian. Raw model outputs exhibited significant errors and low correlation with observations. The LS method provided only slight improvements, while QM degraded performance. Machine learning approaches, particularly XGBoost, substantially enhanced model accuracy by reducing MAE and RMSE and improving NSE and KGE. The XGBoost-corrected CanESM5 model showed the best performance and was used for future scenario projections. The projections indicate increased precipitation in spring and autumn and slight decreases in winter, with direct relevance for water resources management and climate change planning.

Author Contributions

Conceptualization, M.F.K. and A.A.; methodology, M.F.K. and A.A.; software, M.F.K.; validation, M.F.K. and A.A.; formal analysis, M.F.K., A.A. and H.R.E.; investigation, M.F.K., A.A. and H.R.E.; resources, X.X.; data curation, M.F.K. and A.A.; writing, original draft preparation, M.F.K.; writing, review and editing, M.F.K. and A.A.; visualization, M.F.K. and A.A.; supervision, A.A. and H.R.E.; project administration, A.A.; funding acquisition, A.A. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement

Data available on request from the authors.

Acknowledgements

We sincerely thank all individuals who contributed to the completion of this research.

Ethical Considerations

The authors avoided data fabrication, falsification, plagiarism, and misconduct.

Conflict of Interest

The authors declare no conflict of interest.

References

Ahmed, K., Sachindra, D. A., Shahid, S., Demirel, M. C., & Chung, E. S. (2019). Selection of multi-model ensemble of general circulation models for the simulation of precipitation and maximum and minimum temperature based on spatial assessment metrics. Hydrology and Earth System Sciences, 23(11), 4803-4824.
Almazroui, M., Saeed, F., Saeed, S., Islam, M. N., Ismail, M., Klutse, N. A. B., & Siddiqui, M. H. (2020). Projected change in temperature and precipitation over Africa from CMIP6. Earth Systems and Environment, 4(3), 455-475.
Aryal, A., Shrestha, S., & Babel, M. S. (2019). Quantifying the sources of uncertainty in an ensemble of hydrological climate-impact projections. Theoretical and Applied Climatology, 135(1/2), 193-209.
Avazpour, F., Hadian, M. R., Talebi, A., & Haghighi, A. T. (2025). Impact of climate change on river flow, using a hybrid model of LARS_WG and LSTM: A case study in the Kashkan Basin. Results in Engineering, 104956.
Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In KDD ’16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 785-794). Association for Computing Machinery, New York, NY, USA.
Eyring, V., Bony, S., Meehl, G. A., et al. (2016). Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization. Geoscientific Model Development, 9, 1937-1958.
Fang, G., Yang, J., Chen, Y. N., & Zammit, C. (2015). Comparing bias correction methods in downscaling meteorological variables for a hydrologic impact study in an arid area in China. Hydrology and Earth System Sciences, 19(6), 2547-2559.
Gupta, H. V., Kling, H., Yilmaz, K. K., & Martinez, G. F. (2009). Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling. Journal of Hydrology, 377(1-2), 80-91.
He, A., Wang, C., Xu, L., Wang, P., Wang, W., Chen, N., & Chen, Z. (2024). An evaluation of statistical and deep learning-based correction of monthly precipitation over the Yangtze River basin in China based on CMIP6 GCMs. Environment, Development and Sustainability, 1-21.
Heshamati, S., Nazari, B., & Nikoo, M. R. (2025). Enhancing accuracy in streamflow prediction under climate change scenarios based on an integrated machine learning–metaheuristic optimization approach. Journal of Water and Climate Change, jwc2025499.
Hirabayashi, Y., Tanoue, M., Sasaki, O., Zhou, X., & Yamazaki, D. (2021). Global exposure to flooding from the new CMIP6 climate model projections. Scientific Reports, 11(1), 3740.
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
Houshmand Kouchi, D., Esmaili, K., Faridhosseini, A., Sanaei Nejad, S. H., & Khalili, D. (2019). Simulation of climate change impacts using fifth assessment report models under RCP scenarios on water resources in the upper basin of Salman Farsi Dam. Iranian Journal of Irrigation and Drainage, 2(13), 243-258. (In Persian)
Hyndman, R. J., & Koehler, A. B. (2006). Another look at measures of forecast accuracy. International Journal of Forecasting, 22(4), 679-688.
Jaiswal, R., Mall, R. K., Singh, N., Lakshmi Kumar, T. V., & Niyogi, D. (2022). Evaluation of bias correction methods for regional climate models: Downscaled rainfall analysis over diverse agroclimatic zones of India. Earth and Space Science, 9(2), e2021EA001981.
Javan, K., & Azizzadeh, M. R. (2024). Evaluation of different bias correction methods and projection of future precipitation changes using GFDL-ESM4 model in Lake Urmia basin. Journal of Geography and Planning, 28(88), 397-415. (In Persian)
Jongman, B., Ward, P. J., & Aerts, J. C. (2012). Global exposure to river and coastal flooding: Long term trends and changes. Global Environmental Change, 22(4), 823-835.
Kim, J. H., Sung, J. H., Chung, E. S., Kim, S. U., Son, M., & Shiru, M. S. (2021). Comparison of projection in meteorological and hydrological droughts in the Cheongmicheon Watershed for RCP4.5 and SSP2-4.5. Sustainability, 13(4), 2066.
Lenderink, G., Buishand, A., & Van Deursen, W. (2007). Estimates of future discharges of the river Rhine using two scenario methodologies: direct versus delta approach. Hydrology and Earth System Sciences, 11(3), 1145-1159.
Li, H., Zhang, Y., Lei, H., & Hao, X. (2023). Machine learning-based bias correction of precipitation measurements at high altitude. Remote Sensing, 15(8), 2180.
Maraun, D., Wetterhall, F., Ireson, A. M., Chandler, R. E., Kendon, E. J., Widmann, M., Brienen, S., Rust, H. W., Sauter, T., Themeßl, M., et al. (2010). Precipitation downscaling under climate change: Recent developments to bridge the gap between dynamical models and the end user. Reviews of Geophysics, 48(3), 3003.
Nash, J. E., & Sutcliffe, J. V. (1970). River flow forecasting through conceptual models part I - A discussion of principles. Journal of Hydrology, 10(3), 282-290.
Nurdiati, S., Sopaheluwakan, A., Pratama, Y. A., & Najib, M. K. (2021). Statistical bias correction on the climate model for El Nino index prediction. Al-Jabar: Jurnal Pendidikan Matematika, 12(2), 273-282.
O’Neill, B. C., Tebaldi, C., van Vuuren, D. P., et al. (2016). The Scenario Model Intercomparison Project (ScenarioMIP) for CMIP6. Geoscientific Model Development, 9, 3461-3482.
Panofsky, H. A., & Brier, G. W. (1958). Some applications of statistics to meteorology. Mineral Industries Extension Services, College of Mineral Industries, Pennsylvania State University.
Pearson, K. (1897). Mathematical contributions to the theory of evolution. On a form of spurious correlation which may arise when indices are used in the measurement of organs. Proceedings of the Royal Society of London, 60(359-367), 489-498.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., ... & Duchesnay, É. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825-2830.
Sachindra, D. A., Ahmed, K., Rashid, M., Shahid, S., & Perera, B. J. C. (2018). Statistical downscaling of precipitation using machine learning techniques. Atmospheric Research, 212, 240-258.
Seo, G. Y., & Ahn, J. B. (2023). Comparison of bias correction methods for summertime daily rainfall in South Korea using quantile mapping and machine learning model. Atmosphere, 14(7), 1057.
Shiru, M. S., & Park, I. (2020). Comparison of ensembles projections of rainfall from four bias correction methods over Nigeria. Water, 12(11), 3044.
Tanimu, B., Bello, A. A. D., Abdullahi, S. A., Ajibike, M. A., Yaseen, Z. M., Kamruzzaman, M., ... & Shahid, S. (2024). Comparison of conventional and machine learning methods for bias correcting CMIP6 rainfall and temperature in Nigeria. Theoretical and Applied Climatology, 155(6), 4423-4452.
Taylor, K. E., Stouffer, R. J., & Meehl, G. A. (2012). An overview of CMIP5 and the experiment design. Bulletin of the American Meteorological Society, 93(4), 485-498.
Thrasher, B., Wang, W., Michaelis, A., Melton, F., Lee, T., & Nemani, R. (2022). NASA global daily downscaled projections, CMIP6. Scientific Data, 9(1), 262.
UNISDR. (2009). UNISDR terminology on disaster risk reduction. United Nations International Strategy for Disaster Reduction (UNISDR), Geneva, Switzerland, 35 pp.
Wang, J., Hu, L., Li, D., & Ren, M. (2020). Potential impacts of projected climate change under CMIP5 RCP scenarios on streamflow in the Wabash River Basin. Advances in Meteorology, 2020, 9698423.
Wilby, R., & Wigley, T. (1997). Downscaling general circulation model output: A review of methods and limitations. Progress in Physical Geography: Earth and Environment, 21(4), 530-548.
Wood, A. W., Maurer, E. P., Kumar, A., & Lettenmaier, D. P. (2002). Long-range experimental hydrologic forecasting for the eastern United States. Journal of Geophysical Research: Atmospheres, 107(D20), ACL-6.
Zhang, L., Xue, B., Yan, Y., Wang, G., Sun, W., Li, Z., Shi, H. (2019). Model uncertainty analysis methods for semi-arid watersheds with different characteristics: a comparative SWAT case study. Water, 11(6), 1177.