Training and Testing the Random Forest Algorithm for Monitoring Soil Salinity Changes in Pistachio Orchards

Article type: Research article

Authors

1 Research Assistant Professor, Khorasan Razavi Agricultural and Natural Resources Research and Education Center

2 Faculty member, National Salinity Research Center, Agricultural Research, Education and Extension Organization, Yazd, Iran

3 Faculty member, National Salinity Research Center, Agricultural Research, Education and Extension Organization, Yazd, Iran

4 MSc graduate in Horticulture, University of California, Davis; Yazd, Iran

5 National Salinity Research Center, Agricultural Research, Education and Extension Organization, Yazd, Iran

Abstract

A total of 720 soil samples, collected from 240 sampling locations within pistachio orchards in Yazd and Khorasan Razavi provinces, were used to train and test the Random Forest algorithm. The auxiliary variables used for this modeling were the median values of 32 surface reflectance variables and spectral indices derived from Sentinel-2 satellite images between March 1 and October 1 of the sampling year, extracted for the sampling points with the Google Earth Engine platform. The Random Forest model was developed and optimized through code written in the R environment. Leave-one-out cross-validation was used to validate the model. After outliers were identified and removed, 191 points remained for retraining and retesting the Random Forest algorithm. The RMSE was 1.1 dS/m in the training set and 2.6 dS/m in the test set. R² was 93 percent in both sets. The algorithm was used to predict soil salinity changes in the studied regions and years. According to these results, in the Tanour Lahour drainage project the area of Class 3 land is decreasing and the area of Class 2 land is increasing. In Mortazieh farm, Class 3 salinity has decreased, but the Class 4 area has grown in return. In Rezaei farm, Class 4 land is increasing sharply. In Dadyar farm in Khorasan Razavi province, Class 4 salinity land has also increased relatively. The results of this study showed that the Random Forest algorithm can successfully predict soil salinity changes in the study areas and similar regions.

Article title [English]

Random Forest Algorithm Training and Test for Monitoring Soil Salinity Changes in Pistachio Orchards

Authors [English]

  • Yousef Hasheminejhad 1
  • Farhad Dehghany 2
  • Hossein Beyrami 3
  • Morad Mortaz 4
  • Mehdi Shiran Tafti 5
1 Research Assistant Professor, Khorasan Razavi Agricultural and Natural Resources Research and Education Center
2 Faculty member, National Salinity Research Center (NSRC), Agricultural Research, Education and Extension Organization (AREEO), Yazd, Iran
3 Faculty member, National Salinity Research Center (NSRC), Agricultural Research, Education and Extension Organization (AREEO), Yazd, Iran
4 MSc in Horticulture, UC Davis; Yazd, Iran
5 National Salinity Research Center, Agricultural Research, Education and Extension Organization, Yazd, Iran

Abstract [English]

A total of 720 soil samples, collected from 240 sampling locations within pistachio orchards in Yazd and Khorasan Razavi provinces, were used to train and test a Random Forest algorithm. The auxiliary variables were the median values of 32 surface reflectance variables and spectral indices derived from Sentinel-2 satellite images acquired between March 1 and October 1 of the sampling year, extracted at the sampling points with the Google Earth Engine platform. The Random Forest model was developed and optimized in R and validated by leave-one-out cross-validation (LOOCV). After outliers were identified and removed, 191 points remained for retraining and retesting the algorithm. The RMSE was 1.1 dS/m for the training set and 2.6 dS/m for the test set, and R² was 0.93 for both. The algorithm was then used to predict soil salinity changes across the study areas and years. In the Tanour Lahour drainage project, the area of Class 3 salinity land is decreasing while the area of Class 2 land is increasing. In Mortazieh farm, the Class 3 area has decreased, but the land has shifted to Class 4. In Rezaei farm, the Class 4 area is increasing significantly. In Dadyar farm in Khorasan Razavi province, the Class 4 area has also increased relatively. These results indicate that the Random Forest algorithm can successfully predict soil salinity changes within the study areas and in similar regions.

Keywords [English]

  • Google Earth Engine (GEE)
  • Sentinel-2
  • Soil Salinity
  • Spectral Indices
  • Validation

Introduction

Given the limitations of traditional soil salinity assessment methods, alternative techniques such as remote sensing (RS) have been used to predict soil salinity and sodicity in unsampled areas. Vegetation indices (VIs) are indirect indicators: they capture salinity through its negative effects on crop growth and the resulting plant stress. Salinity indices, in contrast, are direct indicators that respond to the spectral reflectance of salt crusts on the soil surface, especially in the visible and near-infrared (NIR) regions of the electromagnetic spectrum. The scientific literature reports many successful estimations of soil properties using various soil sensing technologies. Despite these advances, soil salinity prediction models have rarely been developed for pistachio orchards, which typically lack full vegetation cover. This study therefore aimed to develop a Random Forest model to predict soil salinity from auxiliary variables derived from satellite imagery.
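The difference between indirect (vegetation) and direct (salinity) indices can be made concrete with a few band combinations. The sketch below is illustrative only: this summary does not list which indices the authors used, so NDVI, NDSI, and the brightness-type salinity index SI are assumed examples taken from the general remote sensing literature, and the reflectance values are invented. For Sentinel-2, green, red, and NIR would correspond to bands B3, B4, and B8.

```python
import math


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index: high over dense, healthy canopy."""
    return (nir - red) / (nir + red)


def ndsi(nir: float, red: float) -> float:
    """Normalized Difference Salinity Index: NDVI with the sign reversed."""
    return (red - nir) / (red + nir)


def salinity_index(green: float, red: float) -> float:
    """One commonly cited brightness-type salinity index: sqrt(green * red)."""
    return math.sqrt(green * red)


# Hypothetical surface reflectance values on a 0-1 scale (not data from the study).
vegetated = {"green": 0.08, "red": 0.05, "nir": 0.45}
salt_crust = {"green": 0.30, "red": 0.35, "nir": 0.40}

for label, px in (("vegetated", vegetated), ("salt crust", salt_crust)):
    print(
        label,
        round(ndvi(px["nir"], px["red"]), 3),
        round(ndsi(px["nir"], px["red"]), 3),
        round(salinity_index(px["green"], px["red"]), 3),
    )
```

In this toy case the bright salt crust scores higher on the salinity index and lower on NDVI than the vegetated pixel, which is the contrast the two index families are meant to capture.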

Materials and Methods

For this study, 669 soil samples collected from 223 sampling locations within four large pistachio-growing areas in Yazd and Khorasan Razavi provinces were used to train and test the Random Forest algorithm. At each location, soil was sampled to a depth of 90 cm in 30 cm increments, giving three samples per location. The ground-truth data correspond to the years 2018, 2022, and 2023. Surface reflectance values and spectral indices derived from Sentinel-2 satellite images served as the auxiliary variables; they were extracted for the sampling points using the Google Earth Engine (GEE) platform. The Random Forest algorithm was trained, tested, and validated using code written in R. The spectral indices used in this study are commonly used as auxiliary inputs in soil salinity modeling.
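The abstract states that each auxiliary variable was summarized as its median between March 1 and October 1 of the sampling year. That reduction can be sketched without Google Earth Engine. The example below assumes a hypothetical list of per-date reflectance observations at a single sampling point; the band names, dates, and values are placeholders, and treating the end date as exclusive is an assumption, not something the source specifies.

```python
from datetime import date
from statistics import median


def seasonal_median(observations, year, bands):
    """Median of each band over [March 1, October 1) of the given year."""
    start, end = date(year, 3, 1), date(year, 10, 1)
    # Keep only acquisitions inside the window (end date assumed exclusive).
    in_window = [obs for obs in observations if start <= obs["date"] < end]
    if not in_window:
        raise ValueError("no observations inside the seasonal window")
    return {band: median(obs[band] for obs in in_window) for band in bands}


# Hypothetical time series at one sampling point (not data from the study).
observations = [
    {"date": date(2022, 2, 10), "B4": 0.30, "B8": 0.32},  # before the window
    {"date": date(2022, 4, 5), "B4": 0.20, "B8": 0.35},
    {"date": date(2022, 6, 20), "B4": 0.18, "B8": 0.40},
    {"date": date(2022, 8, 15), "B4": 0.22, "B8": 0.38},
    {"date": date(2022, 11, 1), "B4": 0.50, "B8": 0.30},  # after the window
]

medians = seasonal_median(observations, 2022, ["B4", "B8"])
print(medians)
```

A median is less sensitive than a mean to single contaminated acquisitions, such as a cloudy or hazy scene, which is a common reason for choosing it when compositing a season of imagery.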

Results and Discussion

According to the correlation matrix, no individual auxiliary variable was strongly correlated with the target variable (soil salinity), but several auxiliary variables were positively or negatively correlated with one another. Because of this multicollinearity, stepwise regression could not be used to relate soil salinity to the auxiliary variables. The Random Forest model was trained with 500 decision trees. Its RMSE was 1.10 dS/m for the training set and 2.59 dS/m for the testing set, and R² was 0.932 for both sets. The spectral indices used in this study are related to vegetation stress and can therefore also reflect plant performance under stress. The algorithm was then used to predict changes in soil salinity across the study regions and years. In the Tanour Lahour drainage project, the area of Class 3 salinity land is decreasing while the area of Class 2 land is increasing. In Mortazieh farm, Class 3 salinity has declined, but the land has shifted toward Class 4. In Rezaei farm, Class 4 salinity has increased significantly. Similarly, in Dadyar farm in Khorasan Razavi province, the Class 4 salinity area has relatively increased. These findings indicate that the Random Forest algorithm can successfully predict soil salinity changes in the study areas and similar regions.
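The reported error statistics can be illustrated with a short sketch of leave-one-out cross-validation together with RMSE and R². Several assumptions apply: the authors' model was a Random Forest written in R, whereas the predictor below is a deliberately simple nearest-neighbour stand-in chosen to keep the example dependency-free; R² is computed as the coefficient of determination, although the source does not say which definition was used; and the covariates and salinity values are invented.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error, in the units of the target (e.g. dS/m)."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def nearest_neighbour_predict(train_x, train_y, query):
    """Stand-in model (NOT a random forest): target of the closest training point."""
    distances = [math.dist(x, query) for x in train_x]
    return train_y[distances.index(min(distances))]


def loocv_predictions(features, targets, predict):
    """Predict each point from a model that sees every other point but not itself."""
    predictions = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = targets[:i] + targets[i + 1:]
        predictions.append(predict(train_x, train_y, features[i]))
    return predictions


# Hypothetical covariates and salinity values (not data from the study).
features = [(0.10, 0.60), (0.15, 0.55), (0.30, 0.40),
            (0.35, 0.35), (0.50, 0.20), (0.55, 0.15)]
salinity = [2.0, 3.0, 8.0, 9.0, 16.0, 18.0]

held_out = loocv_predictions(features, salinity, nearest_neighbour_predict)
print(held_out, rmse(salinity, held_out), r_squared(salinity, held_out))
```

Because each prediction comes from a model that never saw the held-out point, the resulting RMSE behaves like a test error, which is why it is normally larger than the RMSE computed on the training data.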

Conclusion

The results of this study demonstrated that the Random Forest model, utilizing variables extracted from Sentinel-2 satellite imagery, has a high capability for predicting soil salinity changes in pistachio orchards in Yazd and Khorasan Razavi provinces. The spatial analysis of salinity changes also showed that in certain areas, such as the Tanour Lahour drainage project, salinity has decreased, indicating the positive effects of management practices. In contrast, in areas like Rezaei farm, the increasing trend of salinity is alarming. These results highlight the importance of continuous monitoring and the use of advanced tools such as the integration of remote sensing data and machine learning algorithms for effective soil salinity management. This approach can also be applied in other saline agricultural regions to prevent further soil degradation and yield reduction.

Author Contributions

Conceptualization, Y.H. and F.D.; methodology, Y.H. and H.B.; software, Y.H.; validation, H.B., M.S. and M.M.; formal analysis, Y.H.; investigation, Y.H., F.D. and H.B.; resources, F.D. and M.M.; data curation, Y.H.; writing (original draft preparation), Y.H.; writing (review and editing), Y.H., F.D. and H.B.; visualization, Y.H.; supervision, Y.H.; project administration, Y.H. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement

Not applicable.

Acknowledgements

The authors thank the National Salinity Research Center for supporting this study.

Ethical considerations

The authors avoided data fabrication, falsification, plagiarism, and misconduct.

Conflict of interest

The authors declare no conflict of interest.
