Comparing machine learning algorithms for estimating PM10 particle concentration using AOD and selected meteorological parameters

Document Type: Research Paper

Authors

1 Urban Planning Expert, Mollasani, Iran.

2 Department of Soil Sciences, Faculty of Agriculture, University of Khuzestan Agricultural Sciences and Natural Resources, Mollasani, Iran.

3 Department of Soil Science, Shahid Chamran University of Ahvaz, Ahvaz, Khuzestan, Iran

4 Department of Software Engineering, Islamic Azad University, Ahvaz Branch, Khuzestan, Iran

Abstract

Monitoring and controlling dust levels and sources, and developing suitable predictive approaches, are crucial under climate change because dust directly affects the environment and human health. This study aims to estimate the concentration of PM10 in the city of Ahvaz using various machine learning models. Climate variables and the Aerosol Optical Depth (AOD) index, derived from the MODIS sensor at a wavelength of 476 nanometers, were used as predictors of PM10 concentration in three scenarios: AOD alone (scenario 1), climate variables alone (scenario 2), and climate variables together with AOD (scenario 3). PM10 concentration was estimated in each scenario using six machine learning algorithms, namely Random Forest Regression (RFR), Gradient Boosting Regression (GBR), Artificial Neural Networks (ANN), AdaBoostR with DTR, Support Vector Regression (SVR), and Decision Tree Regression (DTR), and the estimates were compared using accuracy and precision coefficients. The most influential variables in estimating PM10 concentration were sunshine hours, minimum visibility, maximum wind speed, and the AOD index. The GBR model, with R2, MAE, RMSE, and IOA values of 0.76, 0.31, 0.49, and 0.93, respectively, was the most suitable model for estimating PM10 concentration in scenario 3. The results showed that incorporating the AOD index alongside climate variables improved model performance in estimating PM10 concentration. The proposed final model can be used for daily estimation of PM10 particles.

EXTENDED ABSTRACT

Introduction

Monitoring and controlling the levels and sources of dust, which are influenced by climate change, and developing appropriate predictive approaches are of great importance because dust directly affects the environment and human health. Dust storms are one of the main causes of the dispersion of airborne particles with an aerodynamic diameter of less than 10 micrometers (PM10) in dry regions worldwide, including the dry and desert regions of Iran. These storms now occur more severely and with higher concentrations than in the past, leading to adverse environmental effects. One negative consequence of increased particulate matter concentration is the health risk posed to residents of these areas. Within Khuzestan Province, the southern and southeastern regions of Ahvaz are recognized as containing the largest dust-storm source areas. This study aimed to estimate PM10 concentrations in the city of Ahvaz using various machine learning models.

Methodology

In this research, climate variables and the Aerosol Optical Depth (AOD) index, derived from the MODIS sensor at a wavelength of 476 nanometers, were used as predictors of PM10 concentration in three scenarios: AOD alone (scenario 1), climate variables alone (scenario 2), and climate variables together with AOD (scenario 3). PM10 concentration was estimated in each scenario using six machine learning algorithms, namely Random Forest Regression (RFR), Gradient Boosting Regression (GBR), Artificial Neural Networks (ANN), AdaBoostR with DTR, Support Vector Regression (SVR), and Decision Tree Regression (DTR), and the estimates were compared using accuracy and precision coefficients.
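The scenario-by-algorithm comparison described above could be organized roughly as in the following sketch. It is an illustration only, not the authors' code: it assumes scikit-learn, uses synthetic placeholder data in place of the Ahvaz measurements, and the column names, hyperparameters, and 80/20 split are all hypothetical choices that the study does not specify.

```python
import numpy as np
from sklearn.ensemble import (AdaBoostRegressor, GradientBoostingRegressor,
                              RandomForestRegressor)
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Synthetic placeholder data (NOT the study's data): one row per day.
rng = np.random.default_rng(0)
n = 300
features = {
    "aod": rng.uniform(0.1, 2.0, n),
    "sunshine_hours": rng.uniform(0.0, 13.0, n),
    "min_visibility": rng.uniform(0.5, 10.0, n),
    "max_wind_speed": rng.uniform(0.0, 20.0, n),
}
# Hypothetical target: an arbitrary function of the inputs plus noise.
pm10 = (1.0 * features["aod"] - 0.2 * features["min_visibility"]
        + 0.05 * features["max_wind_speed"] + rng.normal(0.0, 0.3, n))

# The three input scenarios: AOD only, climate only, climate plus AOD.
climate = ["sunshine_hours", "min_visibility", "max_wind_speed"]
scenarios = {
    "scenario 1 (AOD)": ["aod"],
    "scenario 2 (climate)": climate,
    "scenario 3 (climate + AOD)": climate + ["aod"],
}


def make_models():
    """Return fresh, untuned instances of the six regressors."""
    return {
        "RFR": RandomForestRegressor(n_estimators=50, random_state=0),
        "GBR": GradientBoostingRegressor(random_state=0),
        # ANN and SVR are scale-sensitive, so inputs are standardized first.
        "ANN": make_pipeline(
            StandardScaler(),
            MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000,
                         random_state=0)),
        # AdaBoost with a decision-tree base learner ("AdaBoostR with DTR").
        "AdaBoostR-DTR": AdaBoostRegressor(
            DecisionTreeRegressor(max_depth=4), n_estimators=50,
            random_state=0),
        "SVR": make_pipeline(StandardScaler(), SVR()),
        "DTR": DecisionTreeRegressor(max_depth=4, random_state=0),
    }


results = {}
for scenario, columns in scenarios.items():
    X = np.column_stack([features[c] for c in columns])
    X_train, X_test, y_train, y_test = train_test_split(
        X, pm10, test_size=0.2, random_state=0)
    for name, model in make_models().items():
        model.fit(X_train, y_train)
        results[(scenario, name)] = r2_score(y_test, model.predict(X_test))

for (scenario, name), score in results.items():
    print(f"{scenario:28s} {name:14s} R2 = {score:.2f}")
```

Because the data here are synthetic, the printed scores say nothing about the study's findings; the sketch only shows how one held-out evaluation per scenario and algorithm might be laid out.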

Results and Discussion

The most influential variables in estimating PM10 concentration were sunshine hours, minimum visibility, maximum wind speed, and the AOD index. Combining the AOD index with meteorological parameters as inputs to the GBR algorithm gave the best performance, with the best accuracy and precision coefficients (MAE = 0.31, RMSE = 0.49, IOA = 0.93, R2 = 0.76). The approach using only the AOD index showed the worst performance in estimating PM10 levels (MAE = 0.40, RMSE = 0.64, IOA = 0.82, R2 = 0.59). The approach using only meteorological parameters showed intermediate performance (MAE = 0.37, RMSE = 0.61, IOA = 0.88, R2 = 0.62). The MODIS imagery used has a pixel size of one kilometer, so each pixel covers a much larger area than the point sampled by a ground-based PM10 monitoring station. This difference in area creates uncertainty in the results of models that rely solely on satellite image data. Using influential meteorological variables alongside the AOD index to model PM10 concentrations gave the best performance of the three scenarios. In machine learning studies, comparing multiple models before selecting one is crucial. Improving these models yields more accurate predictions of complex processes, which can ultimately support more sustainable and cost-effective management. The proposed final model can be used for daily estimation of PM10 particles.
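For readers unfamiliar with the four coefficients reported above, the following sketch shows one common way to compute them. It rests on assumptions the text does not confirm: R2 is taken as the coefficient of determination, and IOA as Willmott's original index of agreement (the refined 2012 index is defined differently). The sample values are made up for illustration and are not the study's data.

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r2(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def ioa(obs, pred):
    """Willmott's original index of agreement (assumed form), in [0, 1]."""
    mean_obs = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
              for o, p in zip(obs, pred))
    return 1.0 - num / den


# Made-up observed and predicted values, purely for illustration.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print(f"MAE  = {mae(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
print(f"R2   = {r2(observed, predicted):.3f}")
print(f"IOA  = {ioa(observed, predicted):.3f}")
```

Lower MAE and RMSE and higher R2 and IOA indicate better agreement, which is the sense in which scenario 3 outperforms scenarios 1 and 2 in the reported results.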

Author Contributions

Conceptualization, F.Kh. and E.Kh.; methodology, F.Kh. and E.Kh.; software, E.Kh.; validation, E.Kh.; formal analysis, F.Kh. and E.Kh.; investigation, F.Kh. and E.Kh.; resources, F.Kh. and E.Kh.; data curation, F.Kh. and E.Kh.; writing (original draft preparation), F.Kh. and E.Kh.; writing (review and editing), M.R.A. and S.H.; visualization, M.R.A., S.H. and E.Kh.; supervision, M.R.A., S.H. and E.Kh.; project administration, M.R.A., S.H. and E.Kh.; funding acquisition, M.R.A.

All authors have read and agreed to the published version of the manuscript.

Data Availability Statement

Data is available on reasonable request from the authors.

Acknowledgements

The authors would like to thank the Agricultural Sciences and Natural Resources University of Khuzestan for providing all the needed facilities.

Ethical considerations

The authors avoided data fabrication, falsification, plagiarism, and misconduct.

Conflict of interest

The authors declare no conflict of interest.
