Comparison of Different Data Mining Methods for Digital Mapping of Soil Particle-size Fractions in Lands of Semnan Plain

Document Type: Research Paper


1 Department of Management of Arid Areas, Faculty of Desertification, University of Semnan, Semnan, Iran

2 Department of Desertification, Faculty of Desertification, University of Semnan, Semnan, Iran

3 Ph.D. Student, Department of Management of Arid Areas, Faculty of Desertification, University of Semnan, Semnan, Iran

4 Department of Statistics, Faculty of Basic Sciences, Semnan University


Knowledge of the spatial distribution of soil particle-size fractions is required for a range of land and resource management, modeling, and monitoring applications. In recent years, with advances in data mining methods and the availability of low-cost satellite imagery, digital soil mapping methods have been developed to predict the spatial distribution of primary soil particles. The objective of this study was to predict the spatial distribution of the particle-size fractions (clay, sand, and silt) using digital soil mapping in the agricultural lands of Semnan. To this end, a total of 84 soil samples were collected from the 0 to 20 cm surface layer. Environmental variables were derived from Landsat OLI satellite imagery and related to the soil particle-size fractions. One linear model, Partial Least Squares (PLS), and two non-linear models, Random Forest (RF) and Stochastic Gradient Boosting Machine (GBM), were used for spatial prediction of the particle-size fractions. The models were calibrated and validated using 10-fold cross-validation. Three statistics, Root Mean Squared Error (RMSE), coefficient of determination (R2), and Mean Absolute Error (MAE), were used to evaluate model performance. For the RF model, the (RMSE, R2, MAE) values were (15.6, 0.35, 12.62) for sand, (11.49, 0.33, 9.34) for silt, and (8.42, 0.28, 5.9) for clay. The results indicated that RF was the most accurate of the three models for predicting particle-size fractions. The results also showed that the most important environmental covariates for predicting particle-size fractions were band 10 (B10), band 5 (B5), and the gypsum index (GI). This indicates that variables carrying near-infrared and thermal infrared information contributed most to explaining the spatial patterns of particle-size fractions.
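The validation scheme described in the abstract (10-fold cross-validation scored with RMSE, R2, and MAE) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' workflow: the data are synthetic, a one-variable least-squares line stands in for PLS, RF, and GBM, R2 is computed as 1 - SS_res/SS_tot (some packages report the squared correlation instead), and held-out predictions are pooled across folds before scoring rather than averaged per fold. All function names are hypothetical.

```python
import math
import random


def rmse(observed, predicted):
    """Root Mean Squared Error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean Absolute Error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def r2(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_simple_linear(x, y):
    """Least-squares line y = a + b*x (a stand-in for PLS, RF, or GBM)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda xi: intercept + slope * xi


def cross_validate(x, y, k=10, seed=0):
    """Fit on k-1 folds, predict the held-out fold, score pooled predictions."""
    observed, predicted = [], []
    for held_out in k_fold_indices(len(x), k, seed):
        held = set(held_out)
        train = [i for i in range(len(x)) if i not in held]
        model = fit_simple_linear([x[i] for i in train], [y[i] for i in train])
        for i in held_out:
            observed.append(y[i])
            predicted.append(model(x[i]))
    return {
        "RMSE": rmse(observed, predicted),
        "R2": r2(observed, predicted),
        "MAE": mae(observed, predicted),
    }


if __name__ == "__main__":
    # Synthetic data only: 84 made-up samples of one covariate and a
    # sand-like response. These are not the study's measurements.
    rng = random.Random(42)
    covariate = [rng.uniform(0.0, 1.0) for _ in range(84)]
    response = [20.0 + 30.0 * c + rng.gauss(0.0, 5.0) for c in covariate]
    print(cross_validate(covariate, response, k=10))
```

Swapping `fit_simple_linear` for another fitting routine with the same fit-then-predict shape would allow several models to be compared on identical folds, which is the kind of comparison the abstract reports.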

