Feature selection and model prediction of air quality using PM2.5
Date
2018-06
Authors
Sharon Ding, Tiew Kui
Abstract
This study developed a feed-forward artificial neural network (FANN) model to
predict air quality in terms of PM2.5 concentration. Malaysia currently has no
prediction model for the concentration of PM2.5. With the model developed here,
the concentration of PM2.5 in air can be predicted from meteorological
variables. The main parameter investigated in this study was the number of
neurons in the hidden layer. The performance of the prediction model was
evaluated using the mean square error (MSE) and the coefficient of
determination (R²). As the number of neurons in the hidden layer increased, MSE
decreased and R² increased. A hidden layer of 10 neurons gave the best
performance among the neuron counts investigated. Because the performance of
the prediction model was low, feature selection was introduced to remove
irrelevant variables from the data set. A random forest (RF) of 200 regression
trees was grown to estimate the importance of the predictors, and the less
important predictors were removed. With the irrelevant variables removed, the
precision of the prediction model increased along with its performance. The
complexity of the prediction model was also reduced, with a corresponding
decrease in training time.
The predictors removed by feature selection in this study were pressure, dew point,
hourly precipitation and cumulated precipitation. Thus, the prediction model with
feature selection clearly performed better than the prediction model without
feature selection.
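
The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard definitions of MSE and the coefficient of determination; the function names and the sample PM2.5 values are hypothetical, chosen only for illustration, and are not taken from the study.

```python
from typing import Sequence


def mean_squared_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of the squared differences between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination, computed as 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left over after prediction.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical PM2.5 concentrations, invented for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print(f"MSE = {mean_squared_error(observed, predicted):.3f}")  # MSE = 6.500
print(f"R2  = {r_squared(observed, predicted):.3f}")           # R2  = 0.948
```

A lower MSE and an R² closer to 1 both indicate a better fit, which is why the abstract reports the two moving in opposite directions as the model improves.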