Estimating the potassium grade of saline underground water using Sentinel satellite images and random forest algorithm (case study of Khoor and Biabank playa, Isfahan province)

Document Type : Research Paper

Authors

1 Department of Soil Science and Engineering, Faculty of Water and Soil Engineering, Gorgan University of Agricultural Sciences and Natural Resources, Gorgan, Iran.

2 Department of Desert Management, Faculty of Pasture and Watershed Management, Gorgan University of Agricultural Sciences and Natural Resources, Gorgan, Iran.

3 Department of Archaeology, Faculty of Humanities, Higher Education Institute of Architecture and Arts, Tehran, Iran.

Abstract

Potassium is a widely used element that plays an important role in sustainable agricultural production. The potassium in the surface soil of a playa originates from the potassium present in the underground saline water, so surface soil potassium is correlated with the potassium grade of the groundwater. The aim of this research is to combine the random forest (RF) algorithm with satellite imagery to establish the relationship between surface soil potassium and remote sensing indices, and thereby to predict the potassium grade of the underground saline water in the Khoor and Biabank playa in Isfahan province. For this purpose, 60 soil samples were taken from the 0-5 cm layer to measure surface-layer potassium (the dependent variable). Sampling coordinates were determined with the Latin hypercube method. Twelve boreholes were drilled to extract the underground saline water and measure its potassium grade. The 12 bands of the Sentinel-2 satellite, combined through the four basic arithmetic operations, were used to define indices (the independent variables) to model the potassium content of the surface soil layer and ultimately to estimate the potassium grade of the underground saline water. The data were split into two groups: 70% for calibration (training) and 30% for validation (testing). The RF model was implemented in Python in the Google Colab environment. On the validation data, the model achieved R2, MSE, RMSE and MAE values of 0.51, 0.0179, 0.1338 and 0.1130, respectively. The results of this research confirm the effectiveness of remote sensing data and machine learning algorithms in predicting the potassium grade of saline groundwater.
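The index construction mentioned above (arithmetic operations between Sentinel-2 bands) can be illustrated with a minimal sketch. It is not the study's code: the data are synthetic, only two-band expressions are enumerated, and ranking candidates by absolute Pearson correlation with potassium is an assumption made here for illustration. The labels `b1` to `b12` are positional placeholders, not official Sentinel-2 band names.

```python
import itertools
import math
import operator
import random

# Hypothetical stand-in data: 12 band columns x 60 samples of reflectance,
# plus one synthetic potassium value per sample. Not the study's data.
random.seed(0)
N_SAMPLES, N_BANDS = 60, 12
bands = [[random.uniform(0.05, 0.6) for _ in range(N_SAMPLES)]
         for _ in range(N_BANDS)]
# The synthetic target follows a band ratio so the search has something to find.
potassium = [bands[10][i] / bands[3][i] + random.gauss(0, 0.05)
             for i in range(N_SAMPLES)]

# The four basic arithmetic operations named in the text.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def search_two_band_indices(band_columns, target):
    """Score every two-band expression by |r| against the target."""
    results = []
    for i, j in itertools.permutations(range(len(band_columns)), 2):
        for symbol, op in OPS.items():
            # Addition and multiplication are commutative: keep one ordering.
            if symbol in ("+", "*") and i > j:
                continue
            values = [op(a, b) for a, b in zip(band_columns[i], band_columns[j])]
            score = abs(pearson(values, target))
            results.append((score, f"b{i + 1} {symbol} b{j + 1}"))
    return sorted(results, reverse=True)


ranked = search_two_band_indices(bands, potassium)
for score, expression in ranked[:5]:
    print(f"{expression}: |r| = {score:.3f}")
```

With 12 bands this enumerates only 396 two-band candidates; allowing longer expressions makes the search space grow quickly, which fits the very large number of runs reported in the extended abstract.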


EXTENDED ABSTRACT

Introduction:

In recent decades, population growth and the rising demand for food have meant that more than 90% of potassium production is now used as fertilizer. One of the main sources of potassium fertilizers is underground saline water, and potassium is one of the main mineral elements in the underground saline water found in playas. Because of the environmental conditions of the playa, evaporation is intense and leads to the precipitation of dissolved salts on the surface. By examining these deposits, it is possible to identify areas with high potassium grades for extracting underground saline water in the playa, but the harsh climatic conditions of the playa make field measurements of the grade difficult. One of the newer methods for estimating mineral resources is the combined use of machine learning algorithms and remote sensing.

Objective:

The main purpose of this research is to use remote sensing and the random forest algorithm to estimate the surface potassium of playa soil and to evaluate the relationship between potassium and satellite image indices in order to estimate the potassium grade of the underground saline water, which is the innovation of this research compared with previous studies.

Materials and methods:

In this research, remote sensing and the random forest algorithm were used to estimate the surface potassium of playa soil and to evaluate the relationship between potassium and satellite image indices in order to estimate the potassium grade of the underground saline water. For this purpose, 60 samples were taken from the 0-5 cm layer at locations selected with the Latin hypercube method, and their potassium content (the dependent variable) was measured. In addition, the potassium grade of the saline water was measured in 12 boreholes in December 1400 (Solar Hijri calendar; December 2021). Because no existing satellite index was highly correlated with soil surface potassium, new indices were produced by applying the four basic arithmetic operations (addition, subtraction, multiplication and division) to the Sentinel-2 image bands; code written specifically for this study was executed 61 million times with different combinations. The Sentinel-2 image and the indices derived from it (the independent variables) were used to predict surface-layer potassium, and a regression equation relating surface-layer potassium to the potassium grade of the underground saline water was then used to convert the predicted values into the potassium grade of the underground saline water. The Permutation Feature Importance (PFI) method was used with the RF model to prioritize and select parameters for modeling. The data were divided into two groups, 70% for calibration (training) and 30% for validation (testing), and the random forest model was implemented in Python.
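The modeling steps described above (a 70/30 split, a random forest regressor and permutation feature importance) could look roughly like the following sketch. It assumes scikit-learn, which the text does not name explicitly. The data are synthetic, and the 15 index columns, the hyperparameters and the random seeds are all hypothetical choices for illustration, not the study's settings.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 60 samples x 15 index columns. Not the study's data.
rng = np.random.default_rng(42)
n_samples, n_indices = 60, 15
X = rng.uniform(0.0, 1.0, size=(n_samples, n_indices))
# Synthetic surface potassium driven mainly by columns 2 and 1 (zero-based).
y = 0.6 * X[:, 2] + 0.3 * X[:, 1] + rng.normal(0.0, 0.05, size=n_samples)

# 70% calibration (training) and 30% validation (testing), as in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

# Assumed hyperparameters; the source does not report the ones it used.
model = RandomForestRegressor(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Permutation feature importance: shuffle one column at a time on the
# held-out data and record how much the model's score drops.
pfi = permutation_importance(
    model, X_test, y_test, n_repeats=20, random_state=42
)
order = np.argsort(pfi.importances_mean)[::-1]
for idx in order[:5]:
    print(f"Index {idx + 1}: mean importance {pfi.importances_mean[idx]:.4f}")
```

Computing the importance on the held-out 30% rather than on the training data is a design choice made in this sketch; the source does not state which subset was used.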

Results and discussion:

The measured and predicted values of surface potassium from the RF model were compared using statistical indicators for evaluating ML models: R2, MSE, RMSE and MAE. For the calibration data, R2 was 0.88 and MSE, RMSE and MAE were 0.0039, 0.0624 and 0.0460, respectively; for the validation data, R2, MSE, RMSE and MAE were 0.51, 0.0179, 0.1338 and 0.1130, respectively. The results show that Index 3, Index 2, Index 4 and Index 5 have the greatest effect on the estimation of soil surface potassium and the potassium grade of saline water, and that Index 15, Index 14, Index 11 and Index 12 have the least effect.
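For reference, the four evaluation statistics can be computed as in the sketch below. The observed and predicted values are hypothetical and serve only to exercise the function. The final loop checks that the reported RMSE values agree with the square roots of the reported MSE values.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, MSE, RMSE and MAE for paired observed/predicted values."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - (mse * n) / ss_tot  # 1 - SS_res / SS_tot
    return {"R2": r2, "MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae}


# Hypothetical example values, not taken from the study.
observed = [0.40, 0.55, 0.30, 0.70, 0.50]
predicted = [0.45, 0.50, 0.35, 0.60, 0.50]
metrics = regression_metrics(observed, predicted)
print({name: round(value, 4) for name, value in metrics.items()})

# Consistency check on the reported (MSE, RMSE) pairs: RMSE = sqrt(MSE).
reported = {"calibration": (0.0039, 0.0624), "validation": (0.0179, 0.1338)}
for subset, (mse, rmse) in reported.items():
    print(subset, round(math.sqrt(mse), 4), rmse)
```

To four decimal places, the square roots of the reported MSE values match the reported RMSE values for both the calibration and validation data.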

Conclusion:

By combining remote sensing with the prioritization of effective indices, the random forest algorithm finds meaningful relationships between variables and identifies the important parameters. It is an efficient tool for mapping large areas in cases where predicting an important variable in the traditional way is difficult and expensive because of spatial variability; it makes it much easier to determine the parameters and to prepare maps in a short time and at much lower cost. Considering that many playas in the country hold potassium resources, determining the most important parameters with machine learning and remote sensing is a useful tool for managers deciding whether to invest in drilling in promising areas for underground saline water extraction. Since conditions in the playas are not very different, the results of this research may be generalizable to other desert playas, in which case the potassium grade of their saline water could be estimated from satellite images.
