The 4th Science and Technology Conference, SEMREGG 2018
MANAGEMENT MODEL OF AIR QUALITY AND FORECAST OF SO2 AND NOX CONCENTRATION OF NHON TRACH INDUSTRIAL ZONE COMBINING NEURAL NETWORK AND GEOSTATISTICS TECHNOLOGY
Le Thi Thu Thao, Pham Hoang Thu Na, Pham Van Tat*
Department of Environmental Engineering, Hoa Sen University, Ho Chi Minh city
* Email: vantat@gmail.com
ABSTRACT
Management modelling of the emitted gases SO2 and NOx at the Nhon Trach industrial zone was used to assess air quality by means of nonlinear multivariate regression, a neural network and the Kriging technique. This industrial zone developed quickly over the years 2011 to 2018. Nonlinear multivariate models and a three-layer neural network with architecture I(3)-HL(8)-O(2) were constructed to predict the concentrations of SO2 and NOx. The values predicted by these models were compared with the concentrations monitored at various locations in Nhon Trach. The nonlinear models were established with the statistical parameters R2fit = 0.9334, RMSE = 0.0499, SSE = 0.6217, Fstat = 64775.9226 for SO2 and R2fit = 0.9704, RMSE = 0.03813, SSE = 0.7918, Fstat = 68385.4330 for NOx. The neural network model I(3)-HL(8)-O(2) was built with RMSE = 0.0344, R2train = 0.9777 for SO2 and RMSE = 0.0263, R2train = 0.9986 for NOx. Geostatistical techniques and the Kriging interpolation model were used to find the trend of air pollution at the industrial zones of Nhon Trach. The results indicate that the neural network gives better predictions, with a lower root mean square error, than the nonlinear multivariate models. The obtained models can effectively support the management of air quality in the industrial areas of Nhon Trach, Dong Nai.
Keywords: Air pollution, geostatistical analysis, interpolation method, neural network,
multivariate analysis
1 INTRODUCTION
In recent years air pollution in the industrial area of Nhon Trach district has reached an alarming level, because rapid population growth has been followed by the development of industrial areas. Recognizing the level of air pollution caused by industrialization, an air monitoring network with 46 positions has been located in residential, traffic and industrial zones since 1999. In addition, control solutions continued to be offered, and several monitoring locations were to be added to industrial clusters by 2010 [1]. So far, Nhon Trach district has 10 industrial zones, including Nhon Trach 1, 2, 2-D2D, 2-Loc Khang, 2-Nhon Phu, 3, 5, 6, Nhon Trach and Ong Keo. This number will continue to increase in the future [2,3].

Current studies focus on the dispersion of pollutants in the atmospheric environment. Various prediction techniques, including Gaussian and numerical models, are generally used. The inputs to dispersion models include emissions, meteorological data and monitoring locations. The output of these models is the predicted concentration at specified monitoring locations [4,5]. The models are mainly based on the mathematical formulation of the physics and chemistry of the atmosphere. In particular, under normal conditions, the horizontal dispersion of pollutants caused by changes in wind direction over a one-hour period cannot be well represented by a Gaussian distribution [6]. Dispersion models also need physical parameters and detailed information about the pollutant sources, and some of these parameters are generally not known. To overcome this, statistical models are also employed to simplify the prediction of pollutant concentrations [7,8]. Statistical models can describe the relationships between the variables. However, such models require information about the data distribution [6]. Recently, neural network models have also been applied to predict pollutant concentrations. A neural network model can be a better alternative to statistical models, and it can handle data of high dimensionality [9,10]. Furthermore, geographic information system techniques have been widely used in several fields of environmental management to analyze and manage the factors affecting the environment more clearly and more accurately.
In this work we report the construction of nonlinear multivariate models and neural networks to predict the concentrations of gases SO2 and NOx at the industrial zone of Nhon Trach. These models are built on the relationship between the influencing factors humidity (%), temperature (°C) and wind speed (km/h) and the concentrations of SO2 and NOx. The geostatistics technique is used to identify the areas contaminated by air pollutants and the trend at neighboring locations in the industrial areas of Nhon Trach. The output of the models is compared with the monitoring data. The Kriging interpolation technique is used to find the trend of pollutants in the industrial areas of Nhon Trach.
2 MATERIALS AND METHODS
2.1 Dataset
The air quality monitoring network in Nhon Trach consists of 11 monitoring positions. To assess the air quality we calculated the AQI for SO2 and NOx. The monitoring stations are located close to the industrial areas in Nhon Trach, such as Ong Keo, Nhon Trach, Dai Phuoc, Long Phuoc and the industrial center Phu Thanh Vinh Thanh. The monitoring data are used to build the various models and to chart the trend of the pollution level of each gas over the years 2011 to 2018. The program ArcGIS 10.2 is used to map the Nhon Trach area of Dong Nai and the monitoring locations, as well as to provide information on the characteristics of the GIS map. The dataset, which includes spatial and attribute data such as the x-y coordinates of the monitoring stations, the concentrations of gases SO2 and NOx and the meteorological data, is used to establish the shapefile of the study area at Nhon Trach, Dong Nai province. The monitoring stations with their sample notation are given in Table 1. The map of the study area at Nhon Trach is presented in Figure 1.
Table 1. Location of the monitoring stations with sample notation in the industrial zone
No.    Monitoring station    Coordinate    Sample notation
The map layout of the study area and the air-monitoring stations at Nhon Trach was produced with the program ArcGIS.
Figure 1. Map of the Nhon Trach study area and locations of the air-monitoring stations
2.2 Methods
2.2.1 Nonlinear multivariable model
A nonlinear multivariate model was developed from the relationship between the meteorological factors and the gas concentrations, so that its performance could be compared with that of the neural network. The data were first checked for suitability for nonlinear regression analysis: all the variables were examined for autocorrelations among themselves, and the noise in the concentration data was removed using a log transformation. With wind speed, temperature and relative humidity as input variables and the logarithm of the concentrations of gases SO2 and NOx as output variables, the nonlinear models were tested as proposed by [9,10].
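The two preprocessing checks mentioned above can be sketched in a few lines. This is only an illustration under assumptions: the natural logarithm is assumed for the log transformation (the base is not stated), the autocorrelation is the usual sample estimate at a given lag, and the helper names and the toy series are hypothetical.

```python
import math


def log_transform(concentrations):
    """Natural-log transform of positive concentrations (the base is an assumption)."""
    return [math.log(c) for c in concentrations]


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    numerator = sum((series[t] - mean) * (series[t - lag] - mean)
                    for t in range(lag, n))
    denominator = sum((x - mean) ** 2 for x in series)
    return numerator / denominator


# Hypothetical toy series, not monitoring data from the study.
toy_series = [1.0, -1.0] * 4
lag1 = autocorrelation(toy_series, 1)
logged = log_transform([1.0, math.e])
```

A lag-1 value close to zero would suggest little serial dependence; strongly positive or negative values indicate structure that the model should account for.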
The nonlinear model was developed using Ljung-Box statistics [5,11] to examine the adequacy of the model. The suitable model equation is

$$y_{n+1} = \frac{b_1}{y_n} H_n + b_2 (b_3 H_n + b_4)^{b_5} + b_6 (b_7 T_n + b_8 R_n + b_9)^{-b_{10}} \tag{1}$$

where $y_n$ is the log of the concentration (mg/m3) of gases SO2 and NOx; $R_n$ is the wind speed (km/h); $T_n$ is the temperature (°C); $H_n$ is the relative humidity (%); the suffix $n$ denotes the year; and $b_1, \ldots, b_{10}$ are coefficients that can be determined by the least squares technique using an Advanced Differential Evolution algorithm with a population size of 20, a mutation rate of 0.85 and a crossover rate of 0.7.
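To make the least squares search concrete, the sketch below implements the classic differential evolution scheme (DE/rand/1/bin) with the population size, mutation rate and crossover rate quoted above. The exact "Advanced" variant used in the study is not described, so this classic scheme is a stand-in. The function name, the toy straight-line data, the bounds, the number of generations and the seed are all hypothetical.

```python
import random


def differential_evolution(cost, bounds, pop_size=20, mutation=0.85,
                           crossover=0.7, generations=200, seed=0):
    """Classic DE/rand/1/bin minimiser; a stand-in, not the study's exact variant."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    costs = [cost(ind) for ind in pop]
    history = [min(costs)]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct members other than the current one.
            r1, r2, r3 = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one mutated component
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < crossover:
                    value = pop[r1][d] + mutation * (pop[r2][d] - pop[r3][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the search box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_cost = cost(trial)
            # Greedy selection: keep the trial only if it is not worse.
            if trial_cost <= costs[i]:
                pop[i], costs[i] = trial, trial_cost
        history.append(min(costs))
    best = min(range(pop_size), key=costs.__getitem__)
    return pop[best], costs[best], history


# Hypothetical toy problem: least squares fit of y = p0 * x + p1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def sum_of_squares(p):
    return sum((p[0] * x + p[1] - y) ** 2 for x, y in zip(xs, ys))


toy_bounds = [(-10.0, 10.0), (-10.0, 10.0)]
best, best_cost, history = differential_evolution(sum_of_squares, toy_bounds)
```

For equation (1) the cost function would be the sum of squared residuals over the monitoring records, with ten coefficients instead of two. The search bounds would then need to keep the bracketed terms positive so that the fractional powers stay real.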
2.2.2 Neural network model
A neural network is a biologically motivated structure whose ith neuron has an input value x_i and an output value y_i = f(x_i); its connections with the other neurons are described by weights w_ij. A three-layer network I(k)-HL(m)-O(n) with one hidden layer is given in Fig. 4. A brief description of neural networks is given in [12,13]. Generalization and error tolerance are the main features of neural networks [14].

The neural network I(k)-HL(m)-O(n) was trained and tested using the program JMP Pro 13. The three-layer neural network consists of an input layer, one hidden layer and an output layer. A typical Elman network is presented in Fig. 4. Each layer has a number of nodes called neurons. The nodes in the input layer distribute the input signals to the network. The nodes in the output layer represent the air quality characteristics. Each node in the hidden and output layers has an activation function, which transfers the node input to an output signal. The output is a function of the inputs to the first layer. The sigmoid transfer function, ranging from -1 to 1, is given as
$$y = \frac{2}{1 + e^{-x}} - 1 \tag{2}$$

where y is the node output and x is the total node input.
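As a small illustration of a transfer function with the stated range of -1 to 1, the sketch below assumes the common bipolar sigmoid y = 2/(1 + e^(-x)) - 1. The choice of this particular form is an assumption, and the function names are hypothetical. The same function can be written as tanh(x/2), which avoids overflow for large negative inputs.

```python
import math


def bipolar_sigmoid(x):
    """Bipolar sigmoid 2 / (1 + exp(-x)) - 1 (assumed form), output in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0


def bipolar_sigmoid_stable(x):
    """Algebraically identical form, tanh(x / 2), safe for large |x|."""
    return math.tanh(x / 2.0)


sample_inputs = [-5.0, -1.0, 0.0, 1.0, 5.0]
sample_outputs = [bipolar_sigmoid(x) for x in sample_inputs]
```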
2.2.3 Geostatistics method
Geostatistics is a class of statistics used to analyze and predict the values associated with spatial or spatiotemporal phenomena. It incorporates the spatial coordinates of the data within the analyses [15]. The geostatistical tools were developed as a practical means to describe spatial patterns and interpolate values for locations where samples were not taken. Those tools and methods have since evolved to provide not only interpolated values, but also measures of uncertainty for those values [16,17]. Geostatistical analysis has also evolved from uni- to multivariate and offers mechanisms to incorporate secondary datasets that complement a primary variable of interest, thus allowing the construction of more accurate interpolation and uncertainty models [15].
The Kriging model assumes that at least some of the spatial variation observed in natural phenomena can be modeled by random processes with spatial autocorrelation, and requires that the spatial autocorrelation be explicitly modeled. Kriging techniques can be used to describe and model spatial patterns, predict values at unmeasured locations, and assess the uncertainty. The procedure consists of the following steps:
- Collecting the spatial vector and attribute data: the air pollution parameters observed in Nhon Trach district over the years 2011 to 2018
- Building the base map of the Dong Nai provincial boundary, rivers and lakes, roads, …
- Calculating the air quality index (AQI) for gases SO2 and NOx in the study area
- Interpolating the AQI using the Kriging interpolation method
- Validating the accuracy and the standard deviation of the interpolation results
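The interpolation step can be illustrated with a minimal ordinary kriging sketch. The study performs this step in ArcGIS, so the code only shows the underlying idea. It assumes an exponential variogram with a hypothetical sill and range (the variogram model used in the study is not stated), and the station coordinates and AQI values are invented placeholders.

```python
import math


def variogram(h, sill=1.0, range_=5.0):
    """Exponential variogram without nugget (model and parameters are assumptions)."""
    return sill * (1.0 - math.exp(-h / range_))


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ordinary_kriging(points, values, target):
    """Return (estimate, kriging variance, weights) at the target location."""
    n = len(points)

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Kriging system: variogram block plus the unbiasedness constraint.
    a = [[variogram(dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(dist(p, target)) for p in points] + [1.0]
    solution = solve(a, b)
    weights, lagrange = solution[:n], solution[n]
    estimate = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + lagrange
    return estimate, variance, weights


# Hypothetical station coordinates (km) and AQI values, not data from the study.
stations = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (5.0, 4.0)]
aqi_values = [35.0, 48.0, 41.0, 60.0]
estimate, variance, weights = ordinary_kriging(stations, aqi_values, (2.0, 2.0))
at_station, _, _ = ordinary_kriging(stations, aqi_values, stations[0])
```

The weights sum to one by construction, and without a nugget the interpolator reproduces the observed value at a station. The kriging variance supplies the uncertainty measure referred to in the validation step.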
3 RESULTS AND DISCUSSION
3.1 Dataset of air monitoring quality
The air quality index (AQI) is calculated separately from the data of each automatic air monitoring station for the ambient air environment, and an AQI value is calculated for each monitored parameter of environmental quality. Each environmental parameter determines a specific AQI value, and the final AQI value is the maximum of the AQI values of the individual parameters. The AQI scale is divided into ranges; when the AQI value falls within a given range, the warning message for that range is issued to the community [18].
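The aggregation rule described above (one AQI value per parameter, final AQI equal to the maximum) can be sketched as follows. The per-parameter formula used here, concentration divided by a limit value and multiplied by 100, is an assumption made for illustration. The limit values and concentrations are placeholders and should be taken from the handbook [18] and the monitoring data in practice.

```python
def sub_index(concentration, limit):
    """AQI of one parameter as a percentage of its limit (assumed formula)."""
    return concentration / limit * 100.0


def overall_aqi(concentrations, limits):
    """Final AQI: the maximum of the per-parameter AQI values."""
    return max(sub_index(concentrations[name], limits[name])
               for name in concentrations)


# Placeholder numbers in the same unit for concentration and limit; not study data.
example_concentrations = {"SO2": 70.0, "NOx": 150.0}
example_limits = {"SO2": 350.0, "NOx": 200.0}
example_aqi = overall_aqi(example_concentrations, example_limits)
```

With these placeholder numbers NOx gives the larger sub-index, so it determines the final AQI for the station.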
To fully assess the air quality of the industrial zones at Nhon Trach, the AQI values for SO2 and NOx were calculated following the handbook for calculation of the air quality index (AQI) issued together with Decision No. 878/QD-TCMT of July 1, 2011 by the Director General of the General Department of Environment [18]. The average AQI values of gases SO2 and NOx were calculated for the monitoring stations in Nhon Trach district, Dong Nai province, over the years 2011 to 2018, as given in Fig. 2. Fig. 2 shows that, according to the calculated AQI values, SO2 has remained at a safe level since 2011. For NOx, the average AQI values in 2011 indicated a level of air pollution above the AQI criterion of Decision No. 878/QD-TCMT, but in the following years the AQI values tended to fall in the industrial zones of Nhon Trach over 2011 to 2018.
3.2 Nonlinear multivariate model
To construct the air quality management model, nonlinear multivariate models for the important gases SO2 and NOx need to be established to predict their concentrations [5, 9]. The coefficients of equation (1) were determined by the least squares technique using an Advanced Differential Evolution algorithm with a population size of 20, a mutation rate of 0.85 and a crossover rate of 0.7. This new evolutionary algorithm is used in this work. The resulting nonlinear multivariate regression models for gases SO2 and NOx are given by equations (3) and (4). For these nonlinear models the meteorological variables H (%), T (°C) and R (km/h) at one location are also used as input variables for all the locations. The models are validated with the RMSE values, calculated as RMSE = (SSQ/n)^(1/2), where n is the number of residuals and SSQ is the sum of squared residuals. The RMSE values for the concentrations fitted at the industrial sites are 0.04999 for SO2 and 0.03813 for NOx, and for the predicted values R2pred = 0.9602 for SO2 and R2pred = 0.9776 for NOx. Fig. 3 illustrates the fitted and predicted results obtained using the nonlinear multivariate models. The correlation coefficients (R2fit) between the monitored and fitted data at the monitoring sites are 0.9334 for SO2 and 0.9704 for NOx.
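The validation measure defined above, the square root of SSQ divided by n, is simple to compute. The sketch below uses a hypothetical function name and toy numbers rather than the monitoring data.

```python
import math


def rmse(observed, fitted):
    """Root mean square error: sqrt(SSQ / n) over the residuals."""
    residuals = [o - f for o, f in zip(observed, fitted)]
    ssq = sum(r * r for r in residuals)  # sum of squared residuals
    return math.sqrt(ssq / len(residuals))


# Toy values chosen so that every residual has magnitude 1.
toy_observed = [0.0, 0.0, 0.0, 0.0]
toy_fitted = [1.0, -1.0, 1.0, -1.0]
toy_rmse = rmse(toy_observed, toy_fitted)
```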
A nonlinear multivariate regression model is constructed for gas SO2 in an industrial zone
[Figure: average AQI values of SO2 and NOx by year, 2011 to 2018, with the AQI criterion line]
$$y_{n+1} = \frac{0.016}{y_n} H_n + 142.014\,(1.346 H_n + 214.673)^{-0.450} - 129.635\,(-9.342 T_n - 20.893 R_n + 27543.100)^{-0.243} \tag{3}$$

R2fit = 0.9334, RMSE = 0.04999, SSE = 0.6217, Fstat = 64775.9226
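As a rough illustration of how the fitted SO2 model can be applied, the sketch below codes the form of equation (1) and plugs in the coefficients of equation (3). Two assumptions are made: the first term is read as (b1 / y_n) multiplied by H_n, and the input values (previous log-concentration, humidity, temperature, wind speed) are hypothetical and not taken from the monitoring data.

```python
def predict_next(b, y_n, h_n, t_n, r_n):
    """Equation (1) with b[0]..b[9] standing for b1..b10.

    Assumption: the first term is read as (b1 / y_n) * H_n.
    """
    return (b[0] / y_n * h_n
            + b[1] * (b[2] * h_n + b[3]) ** b[4]
            + b[5] * (b[6] * t_n + b[7] * r_n + b[8]) ** (-b[9]))


# Coefficients of equation (3) for SO2, mapped onto b1..b10 of equation (1).
B_SO2 = [0.016, 142.014, 1.346, 214.673, -0.450,
         -129.635, -9.342, -20.893, 27543.100, 0.243]

# Hypothetical inputs: y_n = 1.0, H = 70 %, T = 30 degrees C, R = 10 km/h.
y_next = predict_next(B_SO2, 1.0, 70.0, 30.0, 10.0)
```

The second and third terms are large and nearly cancel, so the prediction is sensitive to the rounding of the published coefficients.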
A nonlinear multivariate regression model is constructed for gas NOx in an industrial zone
$$y_{n+1} = \frac{0.015}{y_n} H_n + 56.360\,(0.005 H_n + 2.933)^{-0.419} - 4.543\,(1.015 T_n \ldots \tag{4}$$

R2fit = 0.9704, RMSE = 0.03813, SSE = 0.7918, Fstat = 68385.4330
Figure 3. Fitting plot of the monitored, fitted and predicted concentrations of gases SO2 and NOx at the industrial zone using the nonlinear multivariate models
3.3 Construction of neural network
A neural network is constructed and trained on the monitoring data of the industrial zone collected at the 11 monitoring locations given in Table 1. Owing to the variations in the micrometeorological data, a neural network is preferable for the prediction of air quality; it was possible to train either separate neural networks or a single combined neural network. The input parameters, as for the nonlinear models, include wind speed, R (km/h), temperature, T (°C), and relative humidity, H (%). The output parameters are the concentrations of gases SO2 and NOx. The neural network architecture I(k)-HL(m)-O(n) is constructed carefully by considering the number of neurons in the hidden layer. This makes it possible to validate the predictability of the gas concentrations at the monitoring stations over the years 2011 to 2018. The convergence of the neural network guides the selection of the optimum network structure and of the number of neurons in the hidden layer. The parameters used for the training process are the sigmoid transfer function, a learning rate of 0.1, a momentum of 0.7, a target of 10000 epochs and a target MSE of 0.0001. The neural network architecture I(3)-HL(8)-O(2) is chosen for the management modeling of air quality at the industrial zone of Nhon Trach, as shown in Fig. 4. This neural network provides flexible predictive ability.
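The role of the training parameters listed above can be illustrated with a gradient descent update that uses a momentum term and stops at the target MSE or the epoch limit. This is a deliberately reduced sketch with a single weight and hypothetical toy data, not the JMP Pro training procedure; only the numerical settings (learning rate 0.1, momentum 0.7, 10000 epochs, target MSE 0.0001) come from the text.

```python
def train_with_momentum(xs, ys, learning_rate=0.1, momentum=0.7,
                        target_epochs=10000, target_mse=0.0001):
    """Fit y = w * x by gradient descent with momentum; returns (w, mse, epochs)."""
    n = len(xs)

    def mse_of(weight):
        return sum((weight * x - y) ** 2 for x, y in zip(xs, ys)) / n

    w, velocity = 0.0, 0.0
    for epoch in range(1, target_epochs + 1):
        # Gradient of the mean squared error with respect to w.
        grad = 2.0 * sum((w * x - y) * x for x, y in zip(xs, ys)) / n
        # Momentum keeps a fraction of the previous step.
        velocity = momentum * velocity - learning_rate * grad
        w += velocity
        if mse_of(w) <= target_mse:
            return w, mse_of(w), epoch
    return w, mse_of(w), target_epochs


# Hypothetical toy data following y = 2 * x.
toy_x = [0.2, 0.4, 0.6, 0.8, 1.0]
toy_y = [2.0 * x for x in toy_x]
weight, final_mse, epochs_used = train_with_momentum(toy_x, toy_y)
```

In a full network the same update is applied to every weight, with the gradients obtained by backpropagation.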
[Figure 3 panels: C(SO2) monitoring vs. fitted; C(NOx) monitoring vs. fitted; C(SO2) monitoring vs. predicted; C(NOx) monitoring vs. predicted; x-axis: Year]
Figure 4. The neural network architecture I(3)-HL(8)-O(2) with three nodes in the input layer, H (%), T (°C) and R (km/h), and two nodes in the output layer, the concentrations of SO2 and NOx
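The layer structure in Figure 4 corresponds to the following forward pass: three inputs, eight hidden nodes and two outputs, with an activation function on the hidden and output nodes. The weights below are random placeholders rather than the trained values, the input scaling is hypothetical, and a sigmoid with range -1 to 1 is assumed in the form tanh(x/2).

```python
import math
import random


def activation(x):
    """Sigmoid with range (-1, 1), written as tanh(x / 2) (assumed form)."""
    return math.tanh(x / 2.0)


def layer(inputs, weights, biases):
    """One fully connected layer followed by the activation function."""
    return [activation(sum(w * v for w, v in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(inputs, w_hidden, b_hidden, w_output, b_output):
    """I(3)-HL(8)-O(2): inputs -> 8 hidden nodes -> 2 output nodes."""
    hidden = layer(inputs, w_hidden, b_hidden)
    return layer(hidden, w_output, b_output)


rng = random.Random(0)
# Placeholder weights; a trained network would supply its own values.
w_hidden = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(8)]
b_hidden = [rng.uniform(-1.0, 1.0) for _ in range(8)]
w_output = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(2)]
b_output = [rng.uniform(-1.0, 1.0) for _ in range(2)]

# Hypothetical scaled inputs: H = 70 %, T = 30 degrees C, R = 10 km/h.
scaled_inputs = [70.0 / 100.0, 30.0 / 50.0, 10.0 / 50.0]
outputs = forward(scaled_inputs, w_hidden, b_hidden, w_output, b_output)
```

The two outputs would be rescaled to the concentration ranges of SO2 and NOx after training.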
The training and prediction quality of the neural network I(3)-HL(8)-O(2) was assessed using the monitoring data of the 11 locations over the years 2011 to 2018, as shown in Fig. 5. For training, the data set was partitioned randomly into a training set and a test set. On the training set, the correlation coefficients between the monitored and trained values are R2train = 0.9777 for SO2 and R2train = 0.9986 for NOx. The statistical errors of the neural network I(3)-HL(8)-O(2) for the training and test processes are RMSE = 0.0344 and 0.0231 for SO2, and RMSE = 0.0263 and 0.0435 for NOx, respectively, as shown in Fig. 5 and Table 2.
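A random partition into training and test sets, together with the coefficient of determination used to judge the fit, can be sketched as below. The test fraction and the seed are hypothetical because the split ratio is not stated, and the record list is a placeholder (88 indices, for example one per station and year).

```python
import random


def train_test_split(records, test_fraction=0.25, seed=0):
    """Randomly partition records into (training, test) lists."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Placeholder record indices, not the monitoring data set.
placeholder_records = list(range(88))
training_set, test_set = train_test_split(placeholder_records)
```

Reporting the error separately on the two sets, as in the RMSE values above, shows whether the network generalizes beyond the data it was trained on.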
industrial zone using neural network I(3)-HL(8)-O(2)
The results show that the neural network is able to train the data set and give the predictions accurately The discrepancy between actual and test lines in Fig 5 resulting from this neural network I(3)-HL(8)-O(2) is insignificant This is also shown in statistical values in Table 2
[Figure 5 panels: C(SO2) monitoring vs. training; C(NOx) monitoring vs. training; C(SO2) monitoring vs. test; C(NOx) monitoring vs. test; x-axis: Year]