Digital modeling and prediction of soil subgroup classes using deep learning approach in a part of arid and semi-arid lands of Qazvin Plain

Document Type : Research Paper


1 Department of Soil Science, College of Agriculture and Natural Resource, University of Tehran, Karaj, Iran

2 Department of Soil Science, Faculty of Agricultural Engineering and Technology, University of Tehran

3 Department of Remote Sensing and Photogrammetry, Geospatial and Surveying Faculty, College of Engineering, University of Tehran, Tehran, Iran.


Soil class maps contain useful information that helps stakeholders understand how soils respond to different management programs. Their numerical prediction, in turn, depends on the appropriate scale of the environmental variables. The current research therefore uses a deep learning approach, the convolutional neural network (CNN), together with spatial information from geomorphometric attributes and Sentinel-1/2 satellite images, along with band ratios, to predict soil subgroup classes and the associated uncertainty map. In addition, comparisons between the CNN and random forest (RF) models for predicting soil classes from different environmental variables have not been well documented.
Material and Methods
The CNN model was run in the Google Colaboratory online environment, and the RF model was fitted with the "rf" method of the "caret" package in the RStudio environment. The models were calibrated with 80% of the data set at six different window sizes and validated on the remaining 20% using two indices: overall accuracy (OA) and F1-Score.
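The calibration setup described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names `extract_window` and `split_80_20`, the edge-clamping rule, the simple random (non-stratified) split, and the fixed seed are all assumptions. The abstract does not list all six window sizes, so none are hard-coded here.

```python
import random

def extract_window(raster, row, col, size):
    """Return a size x size patch of a 2-D covariate grid centred on (row, col).

    Cells that fall outside the grid are filled by clamping to the nearest
    edge cell (an assumption; the source does not describe edge handling).
    """
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    half = size // 2
    n_rows, n_cols = len(raster), len(raster[0])
    patch = []
    for r in range(row - half, row + half + 1):
        rr = min(max(r, 0), n_rows - 1)  # clamp the row index to the grid
        patch.append([
            raster[rr][min(max(c, 0), n_cols - 1)]  # clamp the column index
            for c in range(col - half, col + half + 1)
        ])
    return patch

def split_80_20(samples, seed=0):
    """Shuffle a copy of the samples and split it into 80% / 20% parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# Usage with a toy 5 x 5 grid and hypothetical profile identifiers.
grid = [[r * 5 + c for c in range(5)] for r in range(5)]
centre_patch = extract_window(grid, 2, 2, 3)
calibration, validation = split_80_20(range(278))
```

Larger windows give the model more landscape context around each profile, at the cost of more overlap between neighbouring patches, which is why window size is treated as a tuning choice.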
Results and Discussion
Six covariates (DEM, SWI, WE, SH, MRVBF, and DIFF) were selected as the most effective variables from among 33 geomorphometric attributes, 12 individual bands, and the indices of Sentinel-1/2. In total, 13 soil subgroups were recognized in the study area: nine Aridisols, three Inceptisols, and one Entisols subgroup. At the 15×15 window size, the overall accuracy of the two models differed only slightly, by 7 percentage points: 43% for the CNN and 50% for the RF model. The CNN model showed three patterns across window sizes (an increasing-decreasing trend, a small optimal window size, and a large optimal window size), and the same patterns were observed in the scaled RF model. The OA was zero at all window sizes for the Sodic Xeric Calcigypsids subgroup in the CNN model and for the Xeric Calcigypsids and Typic Xerorthents subgroups in the RF model. In addition, the Xeric Haplocalcids and Xeric Haplogypsids were predicted only by the RF model at the 3×3 and 5×5 window sizes, respectively. As the window size increased from 3 to 9 and then to 15, the F1-Score of Typic Calcixerepts rose mildly to a peak and then declined mildly. The F1-Score for Typic Calcixerepts was 69% in the CNN model and 77% in the RF model. The F1-Score values of Gypsic Aquisalids and Xeric Haplogypsids increased by 30% and 17%, respectively, as the window size increased from 3 to 5 and then dropped sharply, which indicates that a small window size is appropriate for predicting these subgroups.
In general, despite the limited number of observation profiles (n=278), the CNN model provides an acceptable prediction in mapping the soil subgroup classes. Although its overall accuracy differs slightly from that of the RF model, the CNN yields a lower-uncertainty map than the RF model. In future studies, this model and its procedure can be used to predict soil class maps in other arid and semi-arid regions.
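The two validation indices reported above can be computed as in the following sketch, which uses the standard definitions: OA is the share of correctly classified samples, and the per-class F1-Score is 2TP / (2TP + FP + FN). The function names and the toy labels "A", "B", and "C" are hypothetical and do not come from the study's data.

```python
from collections import Counter

def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the observed class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def f1_per_class(y_true, y_pred):
    """Per-class F1 = 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # observed class gains a false negative
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores[c] = 2 * tp[c] / denom if denom else 0.0
    return scores

# Toy validation set with three hypothetical classes.
observed = ["A", "A", "B", "B", "C"]
predicted = ["A", "B", "B", "B", "A"]
oa = overall_accuracy(observed, predicted)
f1 = f1_per_class(observed, predicted)
```

A class that is never predicted correctly, like "C" in this toy set, scores zero regardless of how well the other classes are predicted, which is why per-class scores are reported alongside OA for rare subgroups.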
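The abstract does not state how the uncertainty map was derived. One common option for classification maps, shown here purely as an assumption, is the normalized Shannon entropy of the predicted class probabilities at each pixel. The function name `normalized_entropy` is hypothetical.

```python
import math

def normalized_entropy(probs):
    """Shannon entropy of a class-probability vector, scaled to [0, 1].

    0 means the model assigns all probability to one class;
    1 means all classes are equally likely.
    """
    k = len(probs)
    if k < 2:
        raise ValueError("need at least two classes")
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(k)

# A confident pixel versus a maximally uncertain pixel (toy values).
confident = normalized_entropy([1.0, 0.0, 0.0])
uncertain = normalized_entropy([0.25, 0.25, 0.25, 0.25])
```

Under this assumed measure, a "lower uncertainty map" would correspond to lower per-pixel entropy values on average.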

