Development of a strategic wheat crop prediction toolkit using machine learning algorithms to reduce food security risks (case study: Alborz province)

Document Type: Research Paper


Department of Irrigation and Reclamation Engineering, University of Tehran, Karaj, Iran.


Wheat, the staple food of the country, is of particular importance. It is not only an important agricultural commodity in the world economy but also a powerful lever in political and international relations. The analysis and forecasting of its production in the country have therefore always received attention. The purpose of this study is to predict wheat yield (X) on an annual time scale in Alborz province using artificial intelligence. To this end, annual cultivated area and production data were used to investigate wheat yield in the six cities of Nazarabad, Savojbalagh, Karaj, Eshtehard, Fardis and Taleghan over a 40-year period (1981-2020). After the yield (tons per hectare) was calculated and an annual time series was formed, four artificial intelligence methods, namely the K-nearest neighbor algorithm (KNN), support vector machine (SVM), gene expression programming (GEP) and Bayesian network (BN), were used to predict the wheat yield for the following year. The results indicated higher precision in yield prediction in years with higher production. For the BN, GEP, SVM and KNN models, the correlation coefficient between the observed and predicted wheat yield values was 0.84, 0.89, 0.89 and 0.92, respectively. Karaj and Taleghan have the highest and lowest wheat production, respectively. Across the four models, the values of R, RMSE and MAE ranged from 0.84 to 0.92, 0.21 to 0.24 and 0.11 to 0.18, respectively, and the KNN method had the best accuracy. Overall, a comparison of the proposed methods shows that KNN was the most accurate and BN the least accurate in predicting wheat yield in Alborz province. The results of this study can be useful for planning and managing food security in the study area.
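The workflow summarized in the abstract (yield as production divided by cultivated area, an annual time series, a nearest-neighbor forecast for the following year, and evaluation with R, RMSE and MAE) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the study's implementation: the abstract does not specify the model inputs, so the use of the last `lags` yields as predictors, the values of `lags` and `k`, the Euclidean distance, and all function names are hypothetical choices made here.

```python
import math


def yield_series(production_t, area_ha):
    """Annual yield in tons per hectare: production divided by cultivated area."""
    return [p / a for p, a in zip(production_t, area_ha)]


def knn_forecast(series, lags=3, k=3):
    """One-step-ahead KNN forecast (assumed setup, not taken from the paper).

    The last `lags` values form the query window. Every earlier window of the
    same length is compared with it by Euclidean distance, and the forecast is
    the mean of the values that followed the k closest windows.
    """
    query = series[-lags:]
    candidates = []
    for i in range(len(series) - lags):
        window = series[i:i + lags]
        successor = series[i + lags]
        candidates.append((math.dist(window, query), successor))
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(value for _, value in nearest) / len(nearest)


def pearson_r(observed, predicted):
    """Correlation coefficient R between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not data from the study.
    production = [300.0, 320.0, 280.0, 350.0, 330.0, 360.0]  # tons
    area = [100.0, 100.0, 95.0, 110.0, 105.0, 110.0]         # hectares
    yields = yield_series(production, area)
    print("forecast for the following year:", knn_forecast(yields, lags=2, k=2))
```

In practice the forecast would be produced for each held-out year in turn, and the resulting predictions compared with the observed yields using `pearson_r`, `rmse` and `mae`.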

