Evaluation of three data mining methods to estimate reference evapotranspiration in Zanjan province

Document Type : Research Paper


1 Department of Soil Science, Faculty of Agriculture, University of Guilan, Rasht, Iran

2 Associate Professor of Irrigation and Soil Physics, Soil and Water Research Institute, Agricultural Research and Education Organization, Karaj, Iran

3 Assistant Professor, Department of Irrigation and Soil Physics, Soil and Water Research Institute, Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran

4 Researcher, Department of Irrigation and Soil Physics, Soil and Water Research Institute, Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran


Reference evapotranspiration (ET0), a complex hydrological variable affecting crop water requirements and irrigation scheduling, is governed by a number of climatic factors that influence water and energy balances. Conventional methods for calculating ET0 include a variety of empirical approaches that rely on accurate climatic data. However, in many locations the climatic information required for ET0 estimation is not fully available.
The objective of this study is to evaluate different data mining methods for estimating ET0 with limited meteorological data. The study aims to answer the question: can reference evapotranspiration be estimated without loss of accuracy when not all variables are available? The accuracy of the data mining methods in estimating ET0 was evaluated against the plant water demand system (the standard FAO Penman-Monteith method).
Materials and methods:
Data on sunshine hours, air temperature, wind speed, and relative humidity were collected from thirteen climatology stations in Zanjan province over a ten-year period (2010-2021). ET0 was calculated on a daily time scale using the FAO56 Penman-Monteith method (as the reference method), and the values estimated by three data mining methods (Artificial Neural Networks (ANNs), Random Forest (RF), and Support Vector Machine (SVM)) were evaluated against it. The data from each station were divided into two sets, training (two-thirds of the data) and testing (one-third of the data), in order to calibrate and validate the proposed methods. Finally, the generalizability of these methods for estimating ET0 was examined based on the NRMSE, RMSE, MBE, and EF criteria.
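For orientation, the reference calculation can be sketched as below. This is a minimal illustration of the daily FAO56 Penman-Monteith equation, not the authors' implementation. The function and argument names are hypothetical. As a simplification, saturation vapour pressure is taken at the mean air temperature, and net radiation is passed in directly rather than derived from sunshine hours.

```python
import math


def saturation_vapour_pressure(t_c):
    """Saturation vapour pressure e°(T) in kPa for air temperature in °C."""
    return 0.6108 * math.exp(17.27 * t_c / (t_c + 237.3))


def fao56_pm_et0(t_mean, rh_mean, u2, rn, g=0.0, elevation=0.0):
    """Daily reference evapotranspiration (mm/day), FAO56 Penman-Monteith form.

    t_mean    mean air temperature at 2 m (°C)
    rh_mean   mean relative humidity (%)
    u2        wind speed at 2 m (m/s)
    rn        net radiation (MJ m^-2 day^-1); assumed to be supplied by the caller
    g         soil heat flux (MJ m^-2 day^-1), close to 0 at daily steps
    elevation station elevation (m), used for the atmospheric pressure estimate
    """
    # Assumption: vapour pressures use the mean temperature, not Tmax/Tmin.
    es = saturation_vapour_pressure(t_mean)
    ea = es * rh_mean / 100.0
    # Slope of the saturation vapour pressure curve (kPa/°C).
    delta = 4098.0 * es / (t_mean + 237.3) ** 2
    # Atmospheric pressure (kPa) and psychrometric constant (kPa/°C).
    pressure = 101.3 * ((293.0 - 0.0065 * elevation) / 293.0) ** 5.26
    gamma = 0.000665 * pressure
    numerator = (0.408 * delta * (rn - g)
                 + gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea))
    denominator = delta + gamma * (1.0 + 0.34 * u2)
    return numerator / denominator


if __name__ == "__main__":
    # Illustrative inputs only, not data from the study.
    print(round(fao56_pm_et0(t_mean=20.0, rh_mean=50.0, u2=2.0, rn=12.0), 2))
```

The data mining models in the study are trained to reproduce the output of this kind of reference calculation from the four measured inputs.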
Results and discussion:
According to the results, ANNs performed better than the SVM and RF methods. The mean values of the RMSE, EF, and NRMSE criteria for the ANNs method in both the training and testing steps were 0.49, 0.94, and 0.14, respectively. The mean values of these criteria for the RF method were 0.49, 0.94, and 0.14 in the training step and 0.52, 0.94, and 0.15 in the testing step, respectively. The mean values of these criteria for the SVM method in both the training and testing steps were 0.52, 0.94, and 0.15, respectively.
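The criteria reported above can be computed along the following lines. This is a sketch under stated assumptions: the abstract does not give the formulas, so NRMSE is taken here as RMSE divided by the mean of the reference values, EF as the Nash-Sutcliffe modelling efficiency, and MBE as the mean of (estimated minus reference). The helper `split_two_thirds` is hypothetical and assumes an ordered split, since the abstract does not say whether the split was chronological or random.

```python
import math


def split_two_thirds(records):
    """Split a sequence into training (first two-thirds) and testing (rest).

    Assumption: an ordered split; the source does not specify the scheme.
    """
    k = (2 * len(records)) // 3
    return records[:k], records[k:]


def evaluation_criteria(reference, estimated):
    """Return RMSE, NRMSE, MBE and EF for paired ET0 values."""
    n = len(reference)
    if n == 0 or n != len(estimated):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_ref = sum(reference) / n
    errors = [e - r for r, e in zip(reference, estimated)]
    sse = sum(err * err for err in errors)
    rmse = math.sqrt(sse / n)
    # Assumed normalisation: RMSE relative to the mean reference value.
    nrmse = rmse / mean_ref
    mbe = sum(errors) / n
    # Nash-Sutcliffe efficiency: 1 means a perfect match with the reference.
    ef = 1.0 - sse / sum((r - mean_ref) ** 2 for r in reference)
    return {"RMSE": rmse, "NRMSE": nrmse, "MBE": mbe, "EF": ef}


if __name__ == "__main__":
    # Toy numbers for illustration, not results from the study.
    ref = [1.0, 2.0, 3.0]
    est = [2.0, 3.0, 4.0]
    print(evaluation_criteria(ref, est))
```

Under these definitions, an EF of 0.94 means that the squared estimation error is about 6 percent of the variance of the reference ET0 series.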
Average air temperature is the most influential parameter for estimating ET0, according to more than 92 percent (12 stations) of the results obtained from the ANNs and RF methods. Sunshine hours are the second most important input for estimating ET0, according to more than 84 percent (11 stations) of the results. Excellent performance can therefore be achieved using four meteorological variables as inputs: average air temperature, average relative humidity, wind speed, and sunshine hours. The NRMSE values obtained from ET0 estimation did not vary in a regular way with the average values of the parameters (temperature, humidity, wind speed, sunshine hours, slope percentage).
Conclusion: Both the sensitivity analysis of the ANNs method and the predictor importance of the RF method showed that average air temperature was the most important parameter. In the current study, the Pari and Zanjan stations outperformed the other stations in Zanjan province, probably because of their flatter, plain terrain. The results will help to estimate ET0 in semi-arid climates, where ET0 is critical for agricultural water resource management.


