Combining Outlier Robust Extreme Learning Machine (ORELM) with seasonal autoregressive moving average linear model (SARIMA) to improve the accuracy of runoff modeling

Nourmohammadi Dehbalaei, Fereshteh; Azari, Arash; Akhtari, Ali Akbar

doi:10.22059/ijswr.2024.382136.669788

Document Type : Research Paper

Authors

¹ Department of Information Science, Faculty of Management, University of Tehran, Tehran, Iran

² Assistant Professor, Department of Water Engineering, Razi University, Kermanshah, Iran

³ Department of Civil Engineering, Razi University, Kermanshah, Iran

Abstract

Accurate and reliable runoff forecasting plays an important role in water resources management, but the complex nature of runoff makes it difficult to develop suitable forecasting models. In this study, two hybrid models, each combining a simple linear model with a simple nonlinear model, are proposed for runoff modeling at hydrological station 02PL005 in the St. Lawrence River basin, Canada. The seasonal autoregressive integrated moving average (SARIMA) linear model captures the linear and seasonal characteristics of runoff, while the artificial neural network (ANN, here a multilayer perceptron, MLP) and Outlier Robust Extreme Learning Machine (ORELM) models capture the nonlinear characteristics of the data through machine learning and pattern recognition. To increase modeling accuracy, the normality and stationarity of the data were examined first, and appropriate preprocessing prepared the data for the linear part. Different sub-scenarios were then defined and modeled with the linear model, and the best linear model was selected using several statistical criteria: MAE, RMSE, R and AIC. In the final stage, the residuals of the linear model were modeled with the two nonlinear models, ANN and ORELM. A comparison of the proposed hybrid models showed that the SARIMA-ORELM hybrid, with AIC=249.29, R=0.71, MAE=11.2 and RMSE=14.33, outperforms the SARIMA-MLP model on every criterion. The results of the hybrid models were also compared with those of the standalone MLP, ORELM and SARIMA models.

Main Subjects

Hydrology

EXTENDED ABSTRACT

Introduction

Runoff is an important component of a hydrological system and is influenced by factors such as geographic location, topography, and climate. Runoff forecasting plays an essential role in reducing the effects of floods and droughts and in controlling erosion and sedimentation in the basin. Various hydrological models, including empirical, physical and data-driven models, have been developed for runoff modeling. Data-driven methods have become more popular because they require less knowledge of the physical behavior of the phenomenon.

Materials and Methods

First, the data were divided into a training set (70% of the measured data) and a testing set (30% of the measured data). The Hurst coefficient of the data was 0.63, which indicates that the length of the time series is sufficient for modeling. The normality and stationarity tests showed that the data are non-normally distributed and non-stationary. The data were therefore normalized with normalizing functions, and deterministic terms were removed from the time series by seasonal differencing, which made the data normal and stationary. Two scenarios (without preprocessing and with preprocessing) were defined, several models were fitted under each, and the best linear model was selected. The residuals of the linear model were then calculated and their independence was checked with the Ljung–Box test, after which they were modeled with the outlier robust extreme learning machine (ORELM) and multilayer perceptron (MLP) models. Finally, the output of the nonlinear model was added to the output of the linear model.
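The preprocessing, residual check and hybrid combination described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the z-score form of standardization, the use of training-only statistics, and the default seasonal period of 12 (which assumes monthly data) are assumptions made for illustration, and the SARIMA, ORELM and MLP fits themselves are left out.

```python
from statistics import mean, pstdev


def chronological_split(series, train_frac=0.7):
    """Split a time series in order: first 70% for training, rest for testing."""
    cut = round(len(series) * train_frac)
    return series[:cut], series[cut:]


def standardize(series, mu, sigma):
    """Z-score standardization with a given mean and standard deviation."""
    return [(x - mu) / sigma for x in series]


def seasonal_difference(series, period=12):
    """Return x[t] - x[t - period]; period=12 is an assumption (monthly data)."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_k rho_k^2 / (n - k)."""
    n = len(residuals)
    m = mean(residuals)
    denom = sum((r - m) ** 2 for r in residuals)
    total = 0.0
    for k in range(1, max_lag + 1):
        num = sum((residuals[t] - m) * (residuals[t - k] - m) for t in range(k, n))
        rho_k = num / denom  # sample autocorrelation at lag k
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total


def hybrid_forecast(linear_pred, residual_pred):
    """Add the nonlinear model's residual forecast to the linear forecast."""
    return [lin + res for lin, res in zip(linear_pred, residual_pred)]


# Toy series with a repeating 4-step pattern (made-up numbers, not the study data).
toy = [5.0, 9.0, 14.0, 8.0] * 5
train, test = chronological_split(toy)
mu, sigma = mean(train), pstdev(train)  # statistics from the training part only
train_std = standardize(train, mu, sigma)
deseasonalized = seasonal_difference(toy, period=4)  # all zeros for this toy series
```

In practice the Ljung–Box statistic is compared with a chi-square quantile; a large value indicates that the residuals are still autocorrelated.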

Results and Discussion

Two scenarios were defined for linear modeling with the SARIMA model. In the first scenario (seasonal parameters, no preprocessing), the best linear model gave MAE=13.28, RMSE=17.23, R=0.62 and AIC=267.54. In the second scenario, four sub-scenarios were implemented; sub-scenario 4, which preprocesses the data by standardization, gave MAE=12.76, RMSE=13.11, R=0.57 and AIC=264.41 and performed better than the other sub-scenarios. Among the nonlinear models, model 6, with MAE=10.25, RMSE=13.48 and R=0.7, had the lowest error and the highest correlation. Among the SARIMA-MLP models, model 4, with MAE=11.35, RMSE=14.67, AIC=254.41 and R=0.65, had the lowest error, the highest correlation and the least complexity. Among the SARIMA-ORELM models, model 6, with AIC=249.29, R=0.71, MAE=11.2 and RMSE=14.33, performed best in terms of both accuracy and complexity. The best SARIMA-ORELM and SARIMA-MLP models were selected by comparing these statistical indicators. Comparing the linear models across the two scenarios showed that preprocessing by standardization increases the accuracy of the model and reduces its complexity.
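For readers who want to reproduce this kind of comparison, the error and correlation criteria quoted above can be computed as in the following sketch. The function names and the toy numbers are illustrative assumptions, R is taken here to be the Pearson correlation coefficient, and AIC is omitted because the text does not state which form of the criterion was used.

```python
from math import sqrt


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def pearson_r(obs, pred):
    """Pearson correlation between observed and predicted values."""
    mo = sum(obs) / len(obs)
    mp = sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    so = sqrt(sum((o - mo) ** 2 for o in obs))
    sp = sqrt(sum((p - mp) ** 2 for p in pred))
    return cov / (so * sp)


# Made-up observed and predicted values, not the study data.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [2.0, 2.0, 4.0, 4.0]
scores = {"MAE": mae(obs, pred), "RMSE": rmse(obs, pred), "R": pearson_r(obs, pred)}
```

A quick sanity check when reading such tables is that RMSE can never be smaller than MAE for the same set of errors.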

Conclusion

The comparison of the hybrid models with the standalone SARIMA and MLP models can be summarized as follows:

Comparing the model predictions through statistical indicators shows that the SARIMA-ORELM model performs better than the SARIMA-MLP model on all criteria.

The SARIMA-MLP and SARIMA-ORELM models reduced model complexity (as measured by AIC) by 4.9% and 6.8%, respectively, compared with linear modeling without preprocessing.

Among the six models selected for runoff modeling, the SARIMA model without preprocessing performs worst in terms of both error and complexity criteria.
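The complexity reductions quoted above can be checked against the AIC values reported in the Results section. The sketch below assumes that complexity is measured by AIC and that the baseline is the SARIMA model without preprocessing (AIC=267.54), which is the reading that reproduces the stated percentages; the function and variable names are illustrative.

```python
def percent_reduction(baseline, value):
    """Relative decrease of value with respect to baseline, in percent."""
    return 100.0 * (baseline - value) / baseline


# AIC values taken from the Results and Discussion section.
aic_sarima_raw = 267.54    # SARIMA, no preprocessing
aic_sarima_mlp = 254.41    # best SARIMA-MLP model
aic_sarima_orelm = 249.29  # best SARIMA-ORELM model

mlp_drop = percent_reduction(aic_sarima_raw, aic_sarima_mlp)
orelm_drop = percent_reduction(aic_sarima_raw, aic_sarima_orelm)
print(f"{mlp_drop:.1f}% {orelm_drop:.1f}%")  # prints "4.9% 6.8%"
```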

Author Contributions

F.N.D.: Writing – original draft, Formal analysis, Conceptualization, Data curation, Methodology, Validation, Writing – review & editing. A.A.: Writing – original draft, Formal analysis, Conceptualization, Data curation, Methodology, Validation, Writing – review & editing. A.A.A.: Conceptualization, Data curation, Writing – review & editing.

Data Availability Statement

Data are available from the authors on reasonable request.

Acknowledgements

The authors would like to thank all participants of the present study.

Ethical considerations

The authors avoided data fabrication, falsification, plagiarism, and misconduct.

Conflict of interest

The authors declare no conflict of interest.

Iranian Journal of Soil and Water Research

Volume 56, Issue 2
April 2025
Pages 331-349
