Article type: Research article
Authors
1 Department of Irrigation and Reclamation Engineering, Faculty of Agriculture, University of Tehran, Tehran, Iran
2 Irrigation and Reclamation, Faculty of Agriculture, University of Tehran, Karaj, Iran
Abstract
Accurate prediction of groundwater levels is essential for water resource management, especially in arid regions. This study examines two machine learning algorithms, Support Vector Machine (SVM) and Random Forest (RF), as replacements for traditional models in predicting groundwater levels. The models were built on 20 years of data on precipitation, air temperature, evaporation, groundwater extraction, and groundwater level. After the data were checked for normality and correlation, 70% were used for training and 30% for testing. In terms of R², RMSE, MAE, and MSE, the SVM with an RBF kernel scored 0.57, 1.05, 0.61, and 1.11 in the training phase and 0.74, 0.84, 0.61, and 0.71 in the testing phase, respectively. The RF algorithm, using all hyperparameters, scored 0.85, 0.61, 0.44, and 0.37 in the training phase and 0.71, 0.93, 0.66, and 0.86 in the testing phase, and showed better performance owing to its high accuracy and resistance to multicollinearity. In addition, the Permutation Feature Importance method was used to reduce the number of input variables for the RF model from six to one. Without a significant loss of accuracy, the reduced model scored 0.83, 0.66, 0.45, and 0.43 in the training phase and 0.71, 0.93, 0.66, and 0.86 in the testing phase. The findings indicate that machine learning models, particularly Random Forest, can be a suitable alternative to traditional methods for predicting groundwater levels and managing water resources sustainably.
Groundwater is one of the most vital sources of drinking and agricultural water, especially in arid and semi-arid regions such as Iran, where it accounts for nearly 55 percent of annual water consumption. However, factors such as population growth and hydrogeological and meteorological changes have upset the balance between aquifer extraction and recharge, lowering groundwater levels and intensifying related crises. These crises include reduced natural recharge, resource pollution, and ecosystem instability, all of which threaten food security and economic development. Sustainable management of groundwater resources therefore requires tools that can accurately predict changes in water level and support planning and decision-making.
Artificial intelligence algorithms such as Support Vector Machines (SVM) and Random Forests (RF) have proven highly effective in predicting groundwater levels because they can capture complex patterns and nonlinear relationships. These models are more flexible than traditional numerical methods and perform well even with limited data. Various studies, including the research of Yoon et al., Gupta et al., and Cordoi Milan in Iran, demonstrate the high accuracy of these algorithms in simulating groundwater level fluctuations. The RF model in particular has shown a strong capability for analyzing large datasets and reducing overfitting, and it has been identified as one of the most effective tools for predicting groundwater levels.
This study is designed to identify the most accurate regression model for predicting groundwater levels and specifically compares the performance of two advanced machine learning algorithms, Random Forest (RF) and Support Vector Machine (SVM). Using monthly historical data, this research aims to analyze the complex patterns of groundwater level fluctuations and provide a model with higher predictive accuracy. The importance of this issue is particularly pronounced in arid and semi-arid regions, where water crises and climate change exert greater pressure on groundwater resources. The results of this comparison can not only improve the accuracy of predictions but also aid in developing sustainable management strategies and informed decision-making for the protection and optimal utilization of groundwater resources.
Parts of Ardabil Province in northwest Iran, including the 4,704 square kilometer Ardabil Plain, are the subject of this study. Covering around 90% of the 969 square kilometer Ardabil Plain, the aquifer has an average depth of 53 meters. Meteorological data, groundwater levels, and well information were gathered for the period from October 1998 to September 2020. These records comprise precipitation, evaporation, temperature, groundwater extraction, and groundwater levels over several time periods. Support Vector Machine (SVM) and Random Forest (RF) models were applied to simulate the groundwater level, with the data randomly split into training (70%) and testing (30%) sets. Model performance was assessed with the coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), and mean square error (MSE).
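The random 70/30 split and the four evaluation measures described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the record count of 264 (one record per month over the stated period), the seed, and the sample values are hypothetical, and R² is computed here as the coefficient of determination, which may differ from the exact definition used in the study.

```python
import math
import random


def train_test_split_random(rows, train_fraction=0.7, seed=0):
    """Shuffle rows and split them into training and testing subsets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def regression_metrics(observed, predicted):
    """Return R2, RMSE, MAE and MSE for paired observed/predicted values."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - (mse * n) / ss_tot  # 1 - SS_res / SS_tot
    return {"R2": r2, "RMSE": math.sqrt(mse), "MAE": mae, "MSE": mse}


# Assumed record count: 264 monthly records (indices only), split 70/30.
train, test = train_test_split_random(range(264), train_fraction=0.7, seed=42)

# Made-up observed and predicted levels, for illustration only.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
metrics = regression_metrics(observed, predicted)
```

Because RMSE is the square root of MSE, the two measures always rank models identically; reporting both mainly helps readers who prefer errors in the original units (RMSE) or in squared units (MSE).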
In this study, two machine learning algorithms, SVM and RF, were used to predict the groundwater level in the Ardabil Plain. The results showed that the RF model performed better than the SVM model: during the testing phase, the performance indicators of RF (R², RMSE, MAE, and MSE) were better than those of SVM, giving higher prediction accuracy. The RF model outperformed the SVM model because it is resistant to multicollinearity and can handle complex data, whereas the SVM was sensitive to multicollinearity. Additionally, RF was able to enhance prediction accuracy while the number of inputs was reduced from 5 to 2 variables. These results underline the effectiveness of the Random Forest algorithm for predicting groundwater levels and managing groundwater resources, and they suggest that spatial data and more complex models would help future research reach higher accuracy.
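Reducing the number of inputs depends on ranking the variables by how much the model relies on each one. A minimal sketch of permutation feature importance follows, assuming mean squared error as the score. The `toy_model` function is a hypothetical stand-in for a fitted regressor and the data are synthetic; neither comes from the study.

```python
import random


def mse(model, X, y):
    """Mean squared error of model predictions on rows X against targets y."""
    return sum((model(row) - target) ** 2 for row, target in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Average increase in MSE when each input column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for col in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle one column to break its link with the target.
            column = [row[col] for row in X]
            rng.shuffle(column)
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical stand-in for a fitted model: it relies heavily on column 0,
# weakly on column 1, and ignores column 2.
def toy_model(row):
    return 2.0 * row[0] + 0.1 * row[1]


rng = random.Random(1)
X = [[rng.uniform(0, 1) for _ in range(3)] for _ in range(200)]
y = [toy_model(row) for row in X]

scores = permutation_importance(toy_model, X, y)
# Column indices ordered from most to least important.
ranked = sorted(range(3), key=lambda c: scores[c], reverse=True)
```

A variable whose shuffling barely changes the error, like column 2 here, is a candidate for removal. In practice the scores would normally be computed on held-out data with the fitted model, and the reduced model would be retrained and re-evaluated.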
All authors contributed equally to the conceptualization of the article and the writing of the original and subsequent drafts.
The authors avoided data fabrication, falsification, plagiarism, and misconduct.
The authors declare no conflict of interest.