Document Type : Research Paper
Authors
Department of Environmental Engineering, Faculty of Water and Environmental Engineering, Shahid Chamran University of Ahvaz, Ahvaz, Iran.
EXTENDED ABSTRACT
Water resources are a vital basis for human life, environmental protection and economic development, and have therefore attracted considerable attention. With population growth, climate change and human pressures, these resources face many challenges and threats, especially in dry areas. These challenges include declining water quality and quantity, the destruction of water resources, and serious problems for freshwater consumption. Investigating water resources and managing them sustainably is therefore of particular importance. In this regard, artificial intelligence methods, especially machine learning, are increasingly used to predict and model water quality and to manage water resources. Because they can detect patterns and complex relationships in water quality data, these methods are considered effective tools for improving water quality management and maintenance.
The present study examines the water quality of the Maroon River, one of the most important rivers in Iran, which plays an important role in the development of urban and rural areas. The data include parameters such as temperature, biochemical oxygen demand, phosphate... collected over 10 years from different stations. In the first step, these data were used as inputs for the forecasting models. Then, dimension reduction methods such as factor analysis were used to extract important features. In the next step, different machine learning algorithms such as Linear Regression, Random Forest, Extra Trees and Light Gradient Boosting Machine were used to predict the water quality index, and the performance of the algorithms was evaluated using criteria such as root mean square error and coefficient of determination.
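The two evaluation criteria named above can be sketched in a few lines of Python. This is a minimal illustration using their standard definitions, not the authors' code; the function names and the sample water quality index values are hypothetical and were invented purely for the example.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    if ss_total == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_residual / ss_total


# Hypothetical index values (assumption: not taken from the study).
observed_wqi = [60.0, 70.0, 80.0, 90.0]
predicted_wqi = [62.0, 68.0, 82.0, 88.0]

print("RMSE:", rmse(observed_wqi, predicted_wqi))
print("R^2 :", r_squared(observed_wqi, predicted_wqi))
```

A lower root mean square error and a coefficient of determination closer to 1 both indicate closer agreement between predicted and observed values.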
The p-value of Bartlett's test in this research was close to zero, so it can be concluded that there is significant correlation between the variables and that the data are suitable for factor analysis and dimension reduction. The variance inflation factor values for the water quality parameters used in this research showed that the total coliform and phosphate variables have little collinearity with the other independent variables. Predicting the water quality index using the 8 studied parameters as input showed that the random forest and linear regression algorithms had the highest and lowest agreement with the real data, respectively. This is because the regression algorithm uses a straight line to predict the dependent variable's values from the independent variables, and so performs poorly in complex problems with non-linear interactions. The results also showed that nitrate is the most important input parameter and acidity is less important for the three studied algorithms.
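For reference, the two diagnostics mentioned above have standard textbook forms, given here on the assumption that the test used is Bartlett's test of sphericity. The symbols are not from the original text: n is the number of observations, p the number of variables, R the sample correlation matrix, and R_j^2 the coefficient of determination obtained by regressing variable j on the other independent variables.

```latex
% Bartlett's test of sphericity (assumed form); H0: R is the identity matrix
\chi^2 = -\left(n - 1 - \frac{2p + 5}{6}\right)\ln\lvert\mathbf{R}\rvert,
\qquad
\mathrm{df} = \frac{p(p-1)}{2}

% Variance inflation factor of the j-th independent variable
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
```

A p-value close to zero rejects the hypothesis that the variables are uncorrelated, which supports the use of factor analysis. A variance inflation factor close to 1 means the variable is nearly uncorrelated with the other independent variables.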
By combining the insights obtained from factor analysis and feature importance analysis, researchers can better understand the complex relationships between water quality parameters and create more effective strategies for water management and pollution control.
Fereshteh Sayahi: Design, Analysis and Interpretation of data, Writing - Original draft preparation, Visualization. Laleh Divband Hafshejani: Conceptualization, Methodology, Design, Revision of the manuscript and Editing. Parvaneh Tishehzan: Design, Revision of the manuscript and Editing. Hamid Abdolabadi: Analysis and Interpretation of data.
Data are available from the corresponding author by email upon request.
We are grateful to the Research Council of Shahid Chamran University of Ahvaz for financial support (GN SCU.WE1402.47794).
The authors avoided data fabrication, falsification, plagiarism, and misconduct.