Publication: Development of water quality index prediction model for Penang rivers using artificial neural network
Date
2021-07-01
Authors
Mohd Hamdan, Eleena Yasmeen
Abstract
As a consequence of industrialization and urbanization, industries may discharge untreated effluent into local rivers, lakes and the ocean. Pulau Pinang is a highly cosmopolitan city, which indirectly raises concern about water quality problems. Given this issue, the water quality index (WQI) formulation developed by the Department of the Environment (DOE) may assist the water authorities, as it is useful for assessing the condition of river water. In this study, a WQI prediction model for Penang rivers was therefore developed using an Artificial Neural Network (ANN) architecture in MATLAB. Thirty water quality parameters with a total of 1000 samples were obtained from DOE for model implementation. To reduce redundancy among the input parameters, principal component analysis (PCA) was introduced. The performance of the proposed model was then validated using unseen data. To achieve these objectives, three main phases were implemented in the network development framework: first, feature extraction using multiway principal component analysis (MPCA); second, ANN model development and architecture selection for biochemical oxygen demand (BOD) and chemical oxygen demand (COD) analysis; and third, ANN architecture selection for the WQI prediction model. With MPCA applied to feature extraction for BOD and COD, only 4 inputs were required to explain at least 99.999% of the variability in both analyses. For BOD, the Bayesian Regularization (BR) algorithm with 60% training and 12 hidden nodes gave R = 0.7825, whereas for COD, the BR algorithm with 70% training and 10 hidden nodes gave R = 0.6716. For ANN prediction model development, four train-validate-test scenarios were created to minimize model overfitting, each with 15 hidden nodes and based on three built-in algorithms: Levenberg-Marquardt (LM), BR and Scaled Conjugate Gradient (SCG). The BR algorithm was chosen for both BOD and COD analysis because it produces a network that generalizes well by minimizing a combination of errors and weights. Following the architecture selection phase, three sub-scenarios based on the number of hidden neurons (15, 30 and 45 nodes) were introduced into the previous train-validate-test scenarios with the chosen BR algorithm. Overall, the BR algorithm with 30 hidden nodes was successfully developed with 60% training for BOD analysis and 70% training for COD analysis, with regression values of 0.9978 and 0.9976, respectively. For the ANN-based WQI prediction model, the BR algorithm was chosen and applied with two-, three-, four-, five- and six-neuron architectures at 60% and 70% training. Of these, 60% training with five hidden nodes demonstrated the best performance, with an R value of 0.827 and a mean squared error (MSE) of 52.283. The ANN-based models could serve as reliable and useful tools for estimating the WQI of the river.
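The abstract relies on two kinds of calculation: choosing how many principal components are needed to reach a variance threshold (here 99.999%), and scoring predictions by R and MSE. The following is a minimal Python sketch of both, not the study's MATLAB implementation. The function names and the eigenvalues in the usage note are hypothetical and chosen only for illustration; R is assumed to be the Pearson correlation between observed and predicted values.

```python
import math
from typing import Sequence


def n_components_for_variance(eigenvalues: Sequence[float], threshold: float) -> int:
    """Return the smallest number of leading components whose cumulative
    explained-variance ratio reaches `threshold` (a fraction in (0, 1])."""
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    total = sum(ordered)
    if total <= 0.0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= threshold:
            return k
    return len(ordered)


def mse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def pearson_r(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Pearson correlation coefficient R (assumed definition of R here)."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0.0 or var_p == 0.0:
        raise ValueError("R is undefined when either input is constant")
    return cov / math.sqrt(var_o * var_p)
```

For example, with the illustrative eigenvalues `[5.0, 3.0, 1.5, 0.5, 0.00001]`, calling `n_components_for_variance(..., 0.99999)` selects four components, because the fifth contributes less than 0.001% of the total variance.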