Development Of Water Quality Index Prediction Model For Penang Rivers Using Artificial Neural Network
Date
2021-07-01
Authors
Mohd Hamdan, Eleena Yasmeen
Publisher
Universiti Sains Malaysia
Abstract
As a consequence of industrialization and urbanization, industries may
discharge untreated effluent locally into rivers, lakes and the ocean. Pulau Pinang
is a highly cosmopolitan city, which raises concern about water quality
problems. To address this issue, the water quality index (WQI)
formulation developed by the Department of the Environment (DOE) can
assist the water authorities, as it is useful for assessing river water quality
conditions. In this study, a WQI prediction model for Penang rivers was
developed using an Artificial Neural Network (ANN) architecture in MATLAB.
A total of 1000 samples covering 30 water quality parameters were obtained from
DOE for model implementation. To reduce redundancy among the input
parameters, principal component analysis (PCA) was introduced. Subsequently,
the performance of the proposed model was validated using unseen data. To achieve
those objectives, three main phases were implemented in the network
development framework: first, feature extraction using multiway
principal component analysis (MPCA); second, ANN model development and
architecture selection for biochemical oxygen demand (BOD) and chemical
oxygen demand (COD) analysis; and third, ANN architecture
selection for the WQI prediction model. When MPCA was applied to feature
extraction for BOD and COD, only 4 inputs were required to explain at least
99.999% of the variability in both analyses. For BOD, the Bayesian Regularization (BR) algorithm with 60%
training and 12 hidden nodes gave R = 0.7825, whereas for COD, the BR algorithm with
70% training and 10 hidden nodes gave R = 0.6716. For ANN prediction model
development, four train-validate-test scenarios were created to minimize model
overfitting, each with 15 hidden nodes and based on three built-in algorithms, namely
Levenberg-Marquardt (LM), Bayesian Regularization (BR) and Scaled Conjugate Gradient (SCG).
The BR algorithm was chosen for both BOD and COD analysis
because it can generate a network that generalizes well by minimizing a
combination of errors and weights. Following the architecture selection phase, three
sub-scenarios based on the number of hidden neurons (15, 30 and 45 nodes) were
introduced into the previous train-validate-test scenarios with the chosen BR algorithm. Overall,
the BR algorithm with 60% training and 30 hidden nodes was successfully
developed for BOD analysis, and with 70% training for COD analysis, giving
regression values of 0.9978 and 0.9976, respectively. For the development of the
ANN-based WQI prediction model, the BR algorithm was chosen with two-, three-, four-,
five- and six-neuron architectures for 60% and 70% training. The 60% training scenario
with five hidden nodes demonstrated the best performance, with an R value of 0.827 and
an MSE of 52.283. The ANN-based models could serve as reliable and useful tools
for estimating the WQI of the river.
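
The abstract reports model quality as a regression value (R) and a mean squared error (MSE), and selects the number of MPCA inputs by the share of variability explained. The following is a minimal Python sketch of how such quantities are commonly computed. It is not the thesis's MATLAB implementation. The function names (mse, pearson_r, components_needed) are hypothetical, and it assumes R is the Pearson correlation between observed and predicted values.

```python
import math


def mse(observed, predicted):
    """Mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def pearson_r(observed, predicted):
    """Pearson correlation coefficient (assumed here to be the reported R)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_o * var_p)


def components_needed(eigenvalues, threshold=0.99999):
    """Smallest number of principal components whose cumulative share of
    total variance reaches the threshold (hypothetical helper)."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive value")
    cumulative = 0.0
    # Take components in order of decreasing variance.
    for k, ev in enumerate(sorted(eigenvalues, reverse=True), start=1):
        cumulative += ev
        if cumulative / total >= threshold:
            return k
    return len(eigenvalues)
```

For example, calling pearson_r and mse on the observed and predicted WQI values of a held-out test set would give figures comparable in kind to the R and MSE values quoted above, provided the assumption about R holds.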