Evaluation of the performance of machine learning methods for estimating the maximum scour depth around the Bandal-like spur-dike

Document Type : Research Paper

Authors

1 Department of Hydraulic Structures, Faculty of Water and Environmental Engineering, Shahid Chamran University of Ahvaz, Ahvaz, Iran.

2 Department of Hydrology and Water Resources, Faculty of Water and Environmental Engineering, Shahid Chamran University of Ahvaz, Ahvaz, Iran.

Abstract

This study evaluates the performance of machine learning methods for predicting the maximum scour depth around a Bandal-like spur-dike. Three methods were used: Random Forest (RF), Support Vector Machine (SVM), and Gene Expression Programming (GEP). To train and test the models, 108 data series (87 for training and 21 for testing) were extracted from the results of an experimental study. The models were evaluated with combinations of one to four inputs (Fr: flow Froude number, S/L: ratio of the distance to the spur-dike length, θ: spur-dike installation angle relative to the bank, and α: porosity of the permeable structure). The results showed that, in the one-input mode, α had the most impact and S/L the least impact for all methods. In the SVM model, the average MAE roughly doubled when the number of inputs increased from one to two. In the GEP model, the average MAE increased about 3.5-fold when the number of inputs increased from three to four. In the RF method, however, increasing the number of inputs improved model accuracy: the average MAE decreased by 83% in the four-input mode compared with the three-input mode. Overall, the RF method performed much better (MAE = 0.006 and RMSE = 0.009) than the other methods in estimating the scour depth around the Bandal-like spur-dike, and it showed less error spread for the same inputs.

EXTENDED ABSTRACT

Introduction

Various methods have been proposed to control riverbank erosion, and one of the most common is the spur-dike. A key issue with these structures is scouring around their foundations, caused by changes in the flow pattern. Scouring around the foundation can make the structure unstable and ultimately destroy it. To reduce the scour depth around the foundation, a new type of spur-dike, the Bandal-like spur-dike, has been introduced. This structure consists of a permeable part at the bottom and an impermeable part on top. Empirical relationships have been established for estimating the maximum scour depth around the foundation of impermeable spur-dikes so that it can be accounted for in design. However, few such relationships are available for permeable structures. The aim of the current research is therefore to evaluate the performance of machine learning methods in estimating the maximum scour depth around the Bandal-like structure.

Methodology

This research used the results of a laboratory experiment. The dimensionless variables that influence scouring around the Bandal-like spur-dike were taken as the input variables: the Froude number (Fr), the installation angle of the spur-dike relative to the flume wall (θ), the permeability percentage of the porous section (α), and the ratio of the distance to the length of the spur-dike (S/L). These variables were determined by dimensional analysis. The ratio of the maximum scour depth to the flow depth (hs/h) was taken as the output variable. For each variable, a total of 108 data series were extracted. Of these, 80% (86 series) were used for model training, and the remaining 20% (22 series) were used for model evaluation. The GEP, SVM, and RF methods were used in this research.
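
The input combinations and the 80/20 split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the variable labels, the random seed, and the use of random shuffling are assumptions, and the order of the generated subsets does not reproduce the study's scenario numbering (S1 to S15).

```python
import random
from itertools import combinations

# Dimensionless inputs named in the text; the string labels are illustrative.
INPUTS = ["Fr", "theta", "alpha", "S/L"]


def input_scenarios(variables):
    """Return every non-empty subset of the inputs, smallest subsets first."""
    scenarios = []
    for size in range(1, len(variables) + 1):
        scenarios.extend(combinations(variables, size))
    return scenarios


def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into training and test parts.

    The seed and the random shuffling are assumptions; the source does not
    say how the series were assigned to each part.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(train_fraction * n_samples)  # fractional part is dropped
    return indices[:n_train], indices[n_train:]


scenarios = input_scenarios(INPUTS)
train_idx, test_idx = split_indices(108)

print(len(scenarios))                 # 15  (4 + 6 + 4 + 1 subsets)
print(len(train_idx), len(test_idx))  # 86 22
```

Four inputs give 15 non-empty subsets, which matches the fifteen scenarios referred to in the results, and truncating 80% of 108 gives the 86 and 22 series quoted above.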

Results and Discussion

For all models, the best single-variable scenario was S2, with input parameter α; its MAE for the GEP, SVM, and RF models was 0.077, 0.067, and 0.073, respectively. The poorest performance was obtained for scenario S3, with input parameter S/L. Among the two-variable scenarios, the best input combination for all models was S5, with variables α and Fr, for which the RMSE of the GEP, SVM, and RF models was 0.015, 0.015, and 0.009, respectively. Moving to two input variables reduced the average MAE of the GEP and RF models by 12% and 24%, respectively, indicating that these two models became more accurate with more inputs. The average MAE of the SVM model, however, increased by 12%, indicating that its accuracy decreased. Among the three-variable scenarios, the GEP model was most accurate for S14, with parameters α, Fr, and θ (RMSE, MAE, and CC of 0.013, 0.017, and 0.990, respectively). For the SVM and RF models, the best three-variable scenario was S11, with parameters α, Fr, and S/L, with RMSE values of 0.009 and 0.008, respectively. The RF model, like the GEP model, became more accurate as the number of input variables increased (a 41% decrease in the average MAE), whereas the accuracy of the SVM model did not change significantly. In the four-variable scenario (S15), the accuracy of the SVM model dropped markedly, with an increase of approximately 140% in the average MAE. The GEP and RF models, by contrast, became more accurate in scenario S15.
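
The indices reported above (MAE, RMSE, and CC) are not defined in this section. The sketch below assumes the standard definitions: mean absolute error, root mean square error, and the Pearson correlation coefficient between observed and predicted hs/h. The sample values are made up for illustration and are not data from the study.

```python
import math


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def cc(observed, predicted):
    """Pearson correlation coefficient (assumed meaning of CC)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical hs/h values, for illustration only.
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.39]

print(f"MAE  = {mae(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"CC   = {cc(observed, predicted):.3f}")
```

Under these definitions, RMSE is never smaller than MAE for the same set of errors, which gives a quick sanity check on tabulated values.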

Conclusion

In the single-variable input mode, the parameters with the most and least impact for all methods were α and S/L, respectively. In the SVM model, increasing the number of inputs from one to two variables resulted in an almost 2-fold increase in the average MAE. In the GEP model, increasing the number of inputs from three to four variables led to an approximately 3.5-fold increase in the average MAE. In the RF method, however, increasing the number of inputs improved model accuracy: the average MAE in the four-variable mode was 83% lower than in the three-variable mode. Overall, the RF method performed much better than the other methods in estimating the scour depth around the Bandal-like spur-dike.
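
The changes in average MAE are quoted both as percentages and as fold changes. The helper below is a small sketch for converting between the two, assuming that an "n-fold increase" means the new value is n times the old one. The function names are hypothetical.

```python
def percent_to_fold(percent_change):
    """Convert a percentage change (e.g. +140 or -83) to a new/old ratio."""
    return 1.0 + percent_change / 100.0


def fold_to_percent(fold):
    """Convert a new/old ratio (e.g. 3.5) to a percentage change."""
    return (fold - 1.0) * 100.0


print(f"{percent_to_fold(140):.2f}")  # 2.40: a 140% increase is a 2.4-fold value
print(f"{percent_to_fold(-83):.2f}")  # 0.17: an 83% decrease leaves 17% of the value
print(f"{fold_to_percent(3.5):.0f}")  # 250: a 3.5-fold value is a 250% increase
```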
