Model Prediction of PM2.5 and PM10 Using Machine Learning Approach
Date
2021-07-01
Authors
Hamid, Norfarhanah
Publisher
Universiti Sains Malaysia
Abstract
This study developed multi-input-single-output (MISO) and multi-input-multi-output
(MIMO) models, using artificial neural networks in MATLAB, to predict the
concentrations of PM2.5 and PM10 based on meteorological parameters. The historical
dataset used as the case study was obtained from the Beijing Municipal Environmental
Monitoring Centre. The model was developed for generic use: data pre-processing with
two separate feature selection methods, the correlation coefficient and variable
importance in projection (VIP) scores, selected the inputs most significant to the
output for model development. Both feature selection methods produced similar
results, with the gaseous pollutants carbon monoxide (CO), nitrogen dioxide (NO2)
and sulfur dioxide (SO2) showing the highest correlation with the output target.
Based on the feature selection, models were built with and without input selection
using the Nonlinear Autoregressive with Exogenous Input (NARX) neural network, with
10 hidden neurons and 2 delays, and Levenberg-Marquardt as the training algorithm.
Model performance was evaluated using the Mean Square Error (MSE), Root Mean Square
Error (RMSE), regression value (R) and Coefficient of Determination (R2). Models
developed with and without input selection were compared. MISO Model 1 without
input selection obtained the best performance, with MSE, RMSE, R and R2 values of
0.0594, 0.2437, 0.9704 and 0.9417 respectively for testing. With input selection,
the corresponding values were 0.0589, 0.2428, 0.9709 and 0.9427. Removing the
irrelevant variables therefore neither increases precision significantly nor reduces
performance substantially. Instead, knowing the key parameters most strongly related
to PM2.5 and PM10 would allow a better prediction of their concentrations.
Prediction of PM2.5 and PM10 concentrations using machine learning was achieved and
is useful not only for improving public awareness but also for air quality
management in Malaysia and other parts of the world.
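
As a minimal sketch of the four evaluation metrics named in the abstract, the Python function below computes MSE, RMSE, R and R2 from paired observed and predicted values. It is an illustration only: the study itself used MATLAB, the function name `regression_metrics` and the sample values are hypothetical, and R2 is assumed here to be the square of R, which is consistent with the reported testing figures (0.9704 squared is approximately 0.9417).

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, R and R2 for paired observations and predictions.

    Assumption: R is the Pearson correlation between observed and
    predicted values, and R2 is taken as R squared.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    # Mean of the squared prediction errors, and its square root.
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(mse)

    # Pearson correlation between observed and predicted values.
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("R is undefined when either series is constant")
    r = cov / math.sqrt(var_t * var_p)

    return {"mse": mse, "rmse": rmse, "r": r, "r2": r ** 2}


# Hypothetical values for illustration; not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(regression_metrics(observed, predicted))
```

Because RMSE is the square root of MSE, the two always move together; R and R2 instead describe how closely the predictions track the observations, independent of the error scale.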