American Journal of Intelligent Systems
p-ISSN: 2165-8978 e-ISSN: 2165-8994
2014; 4(2): 43-72
doi:10.5923/j.ajis.20140402.03
Multivariable NNARMAX Model Identification of an AS-WWTP Using ARLS: Part 1 – Dynamic Modeling of the Biological Reactors

Vincent A. Akpan1, Reginald A. O. Osakwe2
1Department of Physics Electronics, The Federal University of Technology, P.M.B. 704, Akure, Ondo State, Nigeria
2Department of Physics, The Federal University of Petroleum Resources, P.M.B. 1221 Effurun, Delta State, Nigeria
Correspondence to: Vincent A. Akpan, Department of Physics Electronics, The Federal University of Technology, P.M.B. 704, Akure, Ondo State, Nigeria.
Copyright © 2014 Scientific & Academic Publishing. All Rights Reserved.
This paper presents the formulation and application of an online adaptive recursive least squares (ARLS) algorithm to the nonlinear model identification of the five biological reactor units of an activated sludge wastewater treatment plant (AS-WWTP). The performance of the proposed ARLS algorithm is compared with that of the so-called incremental backpropagation (INCBP) algorithm, which is also an online identification algorithm. The algorithms are validated by one-step and five-step ahead prediction methods. The performance of the two algorithms is further assessed using Akaike's final prediction error (AFPE) estimate of the regularized criterion. The validation results show the superior performance of the proposed ARLS algorithm in terms of much smaller prediction errors when compared to the INCBP algorithm.
Keywords: Activated sludge wastewater treatment plant (AS-WWTP), Adaptive recursive least squares (ARLS), Artificial neural network (ANN), Benchmark simulation model No. 1 (BSM #1), Biological reactors, Effluent tank, Incremental backpropagation (INCBP), Nonlinear model identification, Nonlinear neural network autoregressive moving average with exogenous input (NNARMAX) model, Secondary settler and clarifier
Cite this paper: Vincent A. Akpan, Reginald A. O. Osakwe, Multivariable NNARMAX Model Identification of an AS-WWTP Using ARLS: Part 1 – Dynamic Modeling of the Biological Reactors, American Journal of Intelligent Systems, Vol. 4 No. 2, 2014, pp. 43-72. doi: 10.5923/j.ajis.20140402.03.
Table 1. The AS-WWTP Nomenclatures and Parameter Definitions

Figure 1. The schematic of the AS-WWTP process
A nonlinear multivariable system with input $u(k)$, output $y(k)$ and disturbance $e(k)$ can be represented by the following Nonlinear AutoRegressive Moving Average with eXogenous inputs (NARMAX) model [30], [31]:

$$y(k) = f\left[\, y(k-1), \ldots, y(k-n),\; u(k-d), \ldots, u(k-d-m),\; e(k-1), \ldots, e(k-n) \,\right] + e(k) \qquad (1)$$

where $f(\cdot)$ is a nonlinear function of its arguments; $u(k-d), \ldots, u(k-d-m)$ are the past input vectors; $y(k-1), \ldots, y(k-n)$ are the past output vectors; $y(k)$ is the current output; $m$ and $n$ are the numbers of past inputs and outputs respectively that define the order of the system; and $d$ is the time delay. The predictor form of (1), based on the information up to time $k$, can be expressed in the following compact form [30], [31]:

$$\hat{y}(k|\theta) = f\left[\, \varphi(k), \theta \,\right] \qquad (2)$$

where $\varphi(k)$ is the regression (state) vector; $\theta$ is an unknown parameter vector which must be selected such that $\hat{y}(k|\theta)$ approximates $y(k)$; and $\varepsilon(k)$ is the error between (1) and (2), defined as

$$\varepsilon(k) = y(k) - \hat{y}(k|\theta) \qquad (3)$$

The argument $\theta$ in $\hat{y}(k|\theta)$ of (2) is henceforth omitted for notational convenience. Note that $\hat{y}(k)$ is of the same order and dimension as $y(k)$.

Now, let $\Theta$ be a set of parameter vectors which contains a set of vectors such that:

$$\theta^{*} \in \Theta = \left\{ \theta_{1}, \theta_{2}, \ldots, \theta_{i} \right\} \subset \mathbb{R}^{d_{\theta}} \qquad (4)$$

where $\Theta$ is some subset of $\mathbb{R}^{d_{\theta}}$ where the search for $\theta$ is carried out; $d_{\theta}$ is the dimension of $\theta$; $\theta^{*}$ is the desired vector which minimizes the error in (3) and is contained in the set of vectors $\Theta$; $\theta_{1}, \theta_{2}, \ldots, \theta_{i}$ are distinct values of $\theta$; and $i$ is the number of iterations required to determine $\theta^{*}$ from the vectors in $\Theta$.

Let a set of $N$ input-output data pairs obtained from prior system operation over a period of time $N T_{s}$ be defined as:

$$Z^{N} = \left\{ \left[ u(k), y(k) \right],\; k = 1, 2, \ldots, N \right\} \qquad (5)$$

where $T_{s}$ is the sampling time of the system outputs. Then, the minimization of (3) can be stated as follows:

$$\theta^{*} = \arg \min_{\theta \in \Theta} J\left( \theta, Z^{N} \right) \qquad (6)$$

where $J(\theta, Z^{N})$ is formulated as a total square error (TSE) type cost function which can be stated as:

$$J\left( \theta, Z^{N} \right) = \frac{1}{2N} \sum_{k=1}^{N} \left[ y(k) - \hat{y}(k) \right]^{T} \left[ y(k) - \hat{y}(k) \right] \qquad (7)$$

The inclusion of $Z^{N}$ as an argument in $J(\theta, Z^{N})$ is to account for the dependency of the desired model $\hat{y}(k)$ on $Z^{N}$. Thus, given an initial random value of $\theta$, $m$, $n$ and (5), the system identification problem reduces to the minimization of (6) to obtain $\theta^{*}$. For notational convenience, $J(\theta)$ shall henceforth be used instead of $J(\theta, Z^{N})$.
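As a concrete illustration of (3), (5) and (7), the following minimal Python sketch assembles the errors over a data set $Z^{N}$ and evaluates the TSE cost for a generic predictor; the function name `predictor` and the array layout are illustrative assumptions, not part of the paper.

```python
import numpy as np

def tse_cost(theta, predictor, U, Y):
    """Total square error (TSE) cost J(theta, Z^N) of Eq. (7).

    U : (N, n_u) array of inputs u(k); Y : (N, n_y) array of outputs y(k).
    `predictor` maps (theta, U, Y, k) -> y_hat(k), i.e. the predictor of Eq. (2).
    """
    N = Y.shape[0]
    J = 0.0
    for k in range(N):
        eps = Y[k] - predictor(theta, U, Y, k)  # prediction error eps(k), Eq. (3)
        J += eps @ eps                          # eps^T(k) eps(k)
    return J / (2.0 * N)
```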
Consider a neural network (NN) with $\hat{y}(k)$ as the desired model of the network and having the dynamic feedforward NN (DFNN) architecture shown in Fig. 2. The proposed NN model identification scheme based on the teacher-forcing method is illustrated in Fig. 3. Note that the "Neural Network Model" shown in Fig. 3 is the DFNN shown in Fig. 2. The inputs to the NN of Fig. 3 are $u(k)$, $y(k)$ and $\varepsilon(k)$, which are concatenated into the regression vector $\varphi(k)$ as shown in Fig. 2. The output of the NN model of Fig. 3 in terms of the network parameters of Fig. 2 is given as:

Figure 2. Architecture of the dynamic feedforward NN (DFNN) model
Figure 3. NN model identification based on the teacher-forcing method
$$\hat{y}_{i}(k) = F_{i}\left[ \sum_{j=1}^{n_{h}} W_{ij}\, f_{j}\!\left( \sum_{l=1}^{n_{\varphi}} w_{jl}\, \varphi_{l}(k) + w_{j0} \right) + W_{i0} \right] \qquad (8)$$

where $n_{h}$ and $n_{\varphi}$ are the number of hidden neurons and the number of regressors respectively; $i = 1, 2, \ldots, n_{o}$ indexes the number of outputs; $w_{jl}$ and $W_{ij}$ are the hidden and output weights respectively; $w_{j0}$ and $W_{i0}$ are the hidden and output biases; $F_{i}(\cdot)$ is a linear activation function for the output layer and $f_{j}(\cdot)$ is a hyperbolic tangent activation function for the hidden layer, defined here as:

$$f_{j}(x) = \tanh(x) = \frac{1 - e^{-2x}}{1 + e^{-2x}} \qquad (9)$$

Note that $\theta$ is the collection of all the network weights and biases in (8) in terms of the matrices $w$ and $W$. Equation (8) is here referred to as the NN NARMAX (NNARMAX) model predictor for simplicity.
Note that $e(k)$ in (1) is unknown but is estimated here as a covariance noise matrix, $\hat{\Lambda}$. Using $\hat{\Lambda}$, Equation (7) can be rewritten as:

$$J(\theta) = \frac{1}{2N} \sum_{k=1}^{N} \varepsilon^{T}(k)\, \hat{\Lambda}^{-1} \varepsilon(k) + \frac{1}{2N}\, \theta^{T} D\, \theta \qquad (10)$$

where $\theta^{T} D \theta$ is a penalty norm which also removes ill-conditioning, $D = \mathrm{diag}(\alpha_{1} I, \alpha_{2} I)$, $I$ is an identity matrix, and $\alpha_{1}$ and $\alpha_{2}$ are the weight decay values for the input-to-hidden and hidden-to-output layers respectively. Note that both $\alpha_{1}$ and $\alpha_{2}$ are adjusted simultaneously during network training with $\hat{\Lambda}$ and are used to update $\theta$ iteratively. The algorithm for estimating the covariance noise matrix and updating $\hat{\Lambda}$ is summarized in Table 2. Note that this algorithm is implemented at each sampling instant until $J(\theta)$ has reduced significantly as in step 7).

$$\theta^{(i+1)} = \theta^{(i)} + \mu^{(i)} s^{(i)} \qquad (11)$$

$$s^{(i)} = -\left[ H\!\left( \theta^{(i)} \right) \right]^{-1} G\!\left( \theta^{(i)} \right) \qquad (12)$$

where $\theta^{(i)}$ denotes the value of $\theta$ at the current iterate $i$; $s^{(i)}$ is the search direction; $\mu^{(i)}$ is the step size; and $G(\theta^{(i)})$ and $H(\theta^{(i)})$ are the Jacobian (or gradient matrix) and the Gauss-Newton Hessian matrices evaluated at $\theta^{(i)}$.
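The update (11)-(12) can be sketched in Python as follows; solving the linear system instead of forming the matrix inverse is a standard numerical choice, and the small damping term added to $H$ is an illustrative safeguard, not a step stated in the paper.

```python
import numpy as np

def gauss_newton_step(theta, G, H, mu=1.0, damping=1e-8):
    """One damped Gauss-Newton update, Eqs. (11)-(12).

    G : gradient vector G(theta); H : Gauss-Newton Hessian H(theta).
    Solves H s = -G rather than inverting H explicitly.
    """
    p = theta.size
    s = np.linalg.solve(H + damping * np.eye(p), -G)  # search direction, Eq. (12)
    return theta + mu * s                             # parameter update, Eq. (11)
```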
As mentioned earlier, due to the dependency of the model $\hat{y}(k)$ on the regression vector $\varphi(k)$, the NNARMAX model predictor depends on a posteriori error estimates through the feedback shown in Fig. 2. Suppose that the derivative of the network outputs with respect to $\theta$ evaluated at $\theta = \theta^{(i)}$ is given as

$$\psi(k) = \frac{\partial \hat{y}(k|\theta)}{\partial \theta} \bigg|_{\theta = \theta^{(i)}} \qquad (13)$$

Since $\varphi(k)$ contains the past prediction errors $\varepsilon(k-1), \ldots, \varepsilon(k-n)$, which themselves depend on $\theta$, the derivative in (13) expands as

$$\psi(k) = \frac{\partial f\left[ \varphi(k), \theta \right]}{\partial \theta} + \sum_{j=1}^{n} \frac{\partial f\left[ \varphi(k), \theta \right]}{\partial \varepsilon(k-j)}\, \frac{\partial \varepsilon(k-j)}{\partial \theta} \qquad (14)$$

which can be written as the filtering operation

$$C\left( q^{-1} \right) \psi(k) = \frac{\partial f\left[ \varphi(k), \theta \right]}{\partial \theta} \qquad (15)$$

If the coefficients $c_{j}(k) = -\,\partial \hat{y}(k)/\partial \varepsilon(k-j)$ are introduced, then (15) can be reduced to the following form

$$C\left( q^{-1} \right) = 1 + c_{1}(k)\, q^{-1} + \cdots + c_{n}(k)\, q^{-n} \qquad (16)$$

which depends on the prediction error based on the predicted output. Equation (16) is the only component that actually impedes the implementation of the NN training algorithms, depending on its computation. Due to the feedback signals, the NNARMAX model predictor may be unstable if the system to be identified is not stable, since the roots of (16) may, in general, not lie within the unit circle. The approach proposed here to iteratively ensure that the predictor becomes stable is summarized in the algorithm of Table 3. Thus, this algorithm ensures that the roots of $C(q^{-1})$ lie within the unit circle before the weights are updated by the training algorithm proposed in the next sub-section.
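Table 3 itself is not reproduced in the extracted text; the sketch below illustrates one common way to enforce the stability condition just described, namely checking the roots of $C(q^{-1})$ and shrinking any root outside the unit circle. The contraction-by-reflection step is an illustrative choice and not necessarily the exact operation of Table 3.

```python
import numpy as np

def stabilize_C(c):
    """Force the roots of C(q^{-1}) = 1 + c1 q^{-1} + ... + cn q^{-n}
    inside the unit circle (predictor stability condition of Eq. (16)).

    c : (n,) array of coefficients [c1, ..., cn].
    """
    roots = np.roots(np.concatenate(([1.0], c)))
    unstable = np.abs(roots) >= 1.0
    roots[unstable] = 1.0 / np.conj(roots[unstable])  # reflect into unit circle
    return np.real(np.poly(roots))[1:]                # recovered [c1, ..., cn]
```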
In the proposed ARLS scheme: 1) the input-output data are fed progressively in a first-in first-out fashion into a sliding window; 2) $\theta$ is updated after a complete sweep through $Z^{N}$; and 3) the whole sweep over $Z^{N}$ is repeated $i$ times. Thus, Equation (10) can be expressed as [27], [35]:

$$J(\theta) = \frac{1}{2N} \sum_{k=1}^{N} \lambda^{N-k}\, \varepsilon^{T}(k)\, \hat{\Lambda}^{-1} \varepsilon(k) + \frac{1}{2N}\, \theta^{T} D\, \theta \qquad (17)$$

where $\lambda$ is the exponential forgetting and resetting parameter for discarding old information as new data are acquired online and progressively added to the set $Z^{N}$. Assuming that $\hat{\theta}(k)$ minimizes (17) at time $k$; then using (17), the updating rule for the proposed ARLS algorithm can be expressed from (11) as:

$$\hat{\theta}(k) = \hat{\theta}(k-1) + R^{-1}(k)\, \psi(k)\, \hat{\Lambda}^{-1} \varepsilon(k) \qquad (18)$$

with $\varepsilon(k)$ and $R(k)$ given respectively as:

$$\varepsilon(k) = y(k) - \hat{y}(k), \qquad R(k) = \lambda R(k-1) + \psi(k)\, \hat{\Lambda}^{-1} \psi^{T}(k) \qquad (19)$$

where $\psi(k)$ is computed according to (16). In order to avoid the inversion of $R(k)$, Equation (19) is first computed as a covariance matrix estimate, $P(k) = R^{-1}(k)$, as

$$P(k) = \frac{1}{\lambda} \left[ P(k-1) - \frac{P(k-1)\, \psi(k)\, \psi^{T}(k)\, P(k-1)}{\lambda + \psi^{T}(k)\, P(k-1)\, \psi(k)} \right] \qquad (20)$$
By setting $\gamma = (1 - \lambda)/\lambda$ and introducing the constants $\beta$ and $\delta$, Equation (20) can also be expressed equivalently as

$$P(k) = \frac{1}{\lambda} P(k-1) - \alpha\, \frac{P(k-1)\, \psi(k)\, \psi^{T}(k)\, P(k-1)}{1 + \psi^{T}(k)\, P(k-1)\, \psi(k)} + \beta I - \delta P^{2}(k-1) \qquad (21)$$

where $\gamma$ is the adaptation factor given by $\gamma = (1 - \lambda)/\lambda$ and $I$ is an identity matrix of appropriate dimension; $\alpha$, $\beta$, $\delta$ and $\lambda$ are four design parameters which are selected such that the following conditions are satisfied [27], [35], [36]:

$$0 < \alpha < 1, \qquad 0 < \beta \ll 1, \qquad 0 < \delta \ll 1, \qquad 0 < \lambda < 1 \qquad (22)$$

The parameter $\alpha$ in (22) adjusts the gain of (21); $\delta$ is a small constant that is inversely related to the maximum eigenvalue of $P(k)$; $\lambda$ is the exponential forgetting factor which is selected such that $0 < \lambda < 1$; and $\beta$ is a small constant which is related to the minimum $(\bar{e}_{min})$ and maximum $(\bar{e}_{max})$ eigenvalues of (21), given respectively as [27], [35], [36]:

$$\bar{e}_{min} = \frac{\beta}{\alpha - \gamma}, \qquad \bar{e}_{max} = \frac{\gamma + \sqrt{\gamma^{2} + 4 \beta \delta}}{2 \delta} \qquad (23)$$

$\bar{e}_{min}$ and $\bar{e}_{max}$ in (22) are selected such that $\bar{e}_{min} \ll \bar{e}_{max}$, while the initial value of $P(k)$, that is $P(0)$, is selected such that $\bar{e}_{min} I \leq P(0) \leq \bar{e}_{max} I$ [27]. Thus, the ARLS algorithm update based on the exponential forgetting and resetting method is given from (18) as

$$\hat{\theta}(k) = \hat{\theta}(k-1) + P(k)\, \psi(k)\, \hat{\Lambda}^{-1} \varepsilon(k) \qquad (24)$$

where $P(k)$ is given by (21). Note that after $\hat{\theta}(k)$ has been obtained, the algorithm of Table 2 is implemented until the conditions in Step 7) of the Table 2 algorithm are satisfied.
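A minimal Python sketch of one ARLS iteration, combining the exponential forgetting and resetting covariance update (21) with the parameter update (24), is given below. The vectorized single-output case (scalar $\varepsilon$ and $\hat{\Lambda}$) and the default values, taken from the ranges discussed above, are illustrative assumptions.

```python
import numpy as np

def arls_update(theta, P, psi, eps, lam=0.99, alpha=0.5,
                beta=5e-3, delta=1e-5, lam_noise=1.0):
    """One ARLS step: covariance update of Eq. (21) + parameter update of Eq. (24).

    theta : (p,) parameter estimate;  P   : (p, p) covariance estimate
    psi   : (p,) gradient psi(k);     eps : scalar prediction error eps(k)
    lam_noise : scalar noise covariance estimate Lambda-hat.
    """
    Ppsi = P @ psi
    denom = 1.0 + psi @ Ppsi
    P_new = (P / lam
             - alpha * np.outer(Ppsi, Ppsi) / denom    # gain term of Eq. (21)
             + beta * np.eye(len(theta))               # resetting term
             - delta * P @ P)                          # eigenvalue-bounding term
    theta_new = theta + P_new @ psi * eps / lam_noise  # Eq. (24)
    return theta_new, P_new
```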
The second method evaluates Akaike's final prediction error (AFPE) of the regularized criterion: minimizing the error portion of (3) using the trained network with $\hat{\theta}$, and taking the expectation $E\{\cdot\}$ with respect to $\theta$ and $Z^{N}$, leads to the following AFPE estimate [31], [36]:

$$\widehat{\mathrm{AFPE}} = \frac{N + \gamma_{1}}{N + \gamma_{1} - 2 \gamma_{2}}\, J\!\left( \hat{\theta}, Z^{N} \right) + \zeta \qquad (25)$$

where $\gamma_{1} = \mathrm{tr}\!\left[ R \left( R + D \right)^{-1} R \left( R + D \right)^{-1} \right]$ and $\gamma_{2} = \mathrm{tr}\!\left[ R \left( R + D \right)^{-1} \right]$; $\mathrm{tr}[\cdot]$ is the trace of its arguments and is computed as the sum of the diagonal elements of its arguments; $R$ is the Gauss-Newton Hessian evaluated at $\hat{\theta}$; and $\zeta$ is a positive quantity that improves the accuracy of the estimate and can be computed according to the expression given in [31], [36].
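Under the reconstruction of (25) given above (the exact trace expressions should be checked against [31], [36]), the AFPE estimate can be computed as in the following Python sketch; the small correction term $\zeta$ is omitted.

```python
import numpy as np

def afpe(J_min, R, D, N):
    """Akaike final prediction error for the regularized criterion, Eq. (25).

    J_min : minimum value of the criterion J(theta-hat, Z^N)
    R     : (p, p) Gauss-Newton Hessian at theta-hat
    D     : (p, p) weight-decay matrix;  N : number of data samples.
    """
    S = R @ np.linalg.inv(R + D)   # R (R + D)^{-1}
    gamma1 = np.trace(S @ S)       # effective-parameter trace terms
    gamma2 = np.trace(S)
    return (N + gamma1) / (N + gamma1 - 2.0 * gamma2) * J_min
```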
The third method is the K-step ahead prediction [10], where the outputs of the trained network are compared to the unscaled output training data. The K-step ahead predictor follows directly from (8) and, for $K \geq 1$ and $k > K$, with the unavailable future prediction errors set to zero, takes the following form:

$$\hat{y}(k|k-K) = f\left[\, \hat{y}(k-1|k-K), \ldots, \hat{y}(k-n|k-K),\; u(k-d), \ldots, u(k-d-m),\; 0, \ldots, 0 \,\right] \qquad (26)$$

The mean value of the K-step ahead prediction error (MVPE) between the predicted output and the actual training data set is computed as follows:

$$\mathrm{MVPE} = \frac{1}{N} \sum_{k=1}^{N} \left\| y(k) - \hat{y}(k|k-K) \right\| \qquad (27)$$

where $y(k)$ corresponds to the unscaled output training data and $\hat{y}(k|k-K)$ is the K-step ahead predictor output.

Figure 4. The multivariable neural network-based NNARMAX model identification scheme for the five biological reactors of the AS-WWTP with the proposed manipulated inputs and controlled outputs
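The K-step ahead evaluation of (26)-(27) just described can be sketched as below; `predict` stands for a trained single-output predictor (e.g. the forward pass sketched earlier), and the regressor layout (n past predictions, m+1 past inputs, n zeroed error slots) is an assumption consistent with the NNARMAX regressor defined above.

```python
import numpy as np

def k_step_mvpe(predict, U, Y, K, n, m, d):
    """K-step ahead predictor of Eq. (26) over the data; returns MVPE of Eq. (27).

    predict : maps a regression vector phi(k) -> scalar y_hat(k)
    U, Y    : (N,) input and unscaled output sequences.
    """
    N = len(Y)
    errs = []
    start = max(n, d + m) + K
    for k in range(start, N):
        y_sim = Y[:k - K + 1].tolist()            # measurements up to time k-K
        for j in range(k - K + 1, k + 1):         # simulate K steps forward
            phi = np.concatenate((
                y_sim[-n:][::-1],                 # past (predicted) outputs
                U[j - d - m:j - d + 1][::-1],     # past inputs u(j-d)...u(j-d-m)
                np.zeros(n)))                     # future errors set to zero
            y_sim.append(predict(phi))
        errs.append(abs(Y[k] - y_sim[-1]))        # K-step ahead error at time k
    return float(np.mean(errs))
```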
Each biological reactor of the AS-WWTP is characterized by thirteen state variables: $S_{I}$, $S_{S}$, $X_{I}$, $X_{S}$, $X_{B,H}$, $X_{B,A}$, $X_{P}$, $S_{O}$, $S_{NO}$, $S_{NH}$, $S_{ND}$, $X_{ND}$ and $S_{ALK}$; out of which four states are measurable, namely: $S_{S}$ (readily biodegradable substrate), $X_{B,H}$ (active heterotrophic biomass), $S_{O}$ (oxygen) and $S_{NO}$ (nitrate and nitrite nitrogen). An additional important parameter is used to assess the amount of soluble solids in all the reactors, including the aerobic reactor of Unit 5.

As highlighted above, the main objective here is the efficient neural network model identification of a multivariable NNARMAX model equivalent of the activated sludge wastewater treatment plant (AS-WWTP), with a view to using the obtained model for multivariable adaptive predictive control of the AS-WWTP process in our future work. Thus, from Section 2, the measured inputs that influence the behaviour of the AS-WWTP process shown in Fig. 5 are:

Figure 5. The neural network model identification scheme for AS-WWTP based on NNARMAX model
$$u(k) = \left[ u_{1}(k), u_{2}(k), \ldots, u_{n_{u}}(k) \right]^{T} \qquad (28)$$

and the corresponding measured outputs are:

$$y(k) = \left[ y_{1}(k), y_{2}(k), \ldots, y_{n_{y}}(k) \right]^{T} \qquad (29)$$

These measured input-output signals are used to form the past inputs, past outputs, error estimates, regression vectors and predicted outputs for the NNARMAX model predictors discussed in Section 4, defined here as follows:

$$u(k) = \left[ u(k-d), \ldots, u(k-d-m) \right]^{T} \qquad (30)$$

$$y(k) = \left[ y(k-1), \ldots, y(k-n) \right]^{T} \qquad (31)$$

$$\varepsilon(k) = \left[ \varepsilon(k-1), \ldots, \varepsilon(k-n) \right]^{T} \qquad (32)$$

$$\varphi(k) = \left[ y^{T}(k),\; u^{T}(k),\; \varepsilon^{T}(k) \right]^{T} \qquad (33)$$

$$\hat{y}(k) = \left[ \hat{y}_{1}(k), \hat{y}_{2}(k), \ldots, \hat{y}_{n_{y}}(k) \right]^{T} \qquad (34)$$
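A small Python sketch of how the regression vector (33) is assembled from (30)-(32) is given below; the stacked-vector layout is the assumption stated above.

```python
import numpy as np

def build_regressor(u_hist, y_hist, eps_hist, k, n, m, d):
    """Regression vector phi(k) of Eq. (33) from past outputs (31),
    past inputs (30) and past prediction-error estimates (32).

    u_hist, y_hist, eps_hist : (N, dim) arrays of signal histories.
    """
    y_part = y_hist[k - n:k][::-1].ravel()              # y(k-1) ... y(k-n)
    u_part = u_hist[k - d - m:k - d + 1][::-1].ravel()  # u(k-d) ... u(k-d-m)
    e_part = eps_hist[k - n:k][::-1].ravel()            # eps(k-1) ... eps(k-n)
    return np.concatenate((y_part, u_part, e_part))
```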
Although the disturbances affecting the AS-WWTP are incorporated into the dry-weather data provided by the COST Action Group, additional sinusoidal disturbances with non-smooth nonlinearities are introduced in the last sub-section of this section to further investigate the closed-loop controllers' performances based on an updated neural network model at each sampling time instant.

Figure 6. Open-loop steady-state benchmark simulation model No.1 (BSM1) with constant influent
The incremental back-propagation (INCBP) algorithm is obtained by replacing the covariance matrix estimate $P(k)$ on the left-hand side of (20) in Section 3.3.1, under the formulation of the ARLS algorithm, with a constant step; that is:

$$P(k) = \mu I \qquad (35)$$

where $\mu$ is the step size and $I$ is an identity matrix of appropriate dimension. Next, the basic back-propagation update is given from [27] as:

$$\hat{\theta}(k) = \hat{\theta}(k-1) + \mu\, \psi(k)\, \varepsilon(k) \qquad (36)$$

The INCBP algorithm is then implemented by initializing $\hat{\theta}(0)$ and carrying out the recursive computation of the gradient given by (36).
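Under the reconstruction of (35)-(36) above, one INCBP step is simply a gradient update with a fixed step size, as in this sketch; the value of `mu` is an arbitrary illustrative choice.

```python
def incbp_update(theta, psi, eps, mu=0.05):
    """One incremental back-propagation step, Eq. (36): theta moves along the
    instantaneous gradient psi(k) * eps(k) with constant step size mu."""
    return theta + mu * psi * eps
```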
Before training, the input and output data are scaled to zero mean and unit variance according to:

$$u_{s}(k) = \frac{u(k) - \bar{u}}{\sigma_{u}}, \qquad y_{s}(k) = \frac{y(k) - \bar{y}}{\sigma_{y}} \qquad (37)$$

where $\bar{u}$, $\bar{y}$ and $\sigma_{u}$, $\sigma_{y}$ are the means and standard deviations of the input and output training data pairs; and $u_{s}(k)$ and $y_{s}(k)$ are the scaled inputs and outputs respectively. Also, after the network training, the joint weights are rescaled according to the expression

$$w_{jl} \leftarrow \frac{w_{jl}}{\sigma_{\varphi_{l}}}, \qquad W_{ij} \leftarrow \sigma_{y_{i}} W_{ij} \qquad (38)$$

with the biases adjusted accordingly, so that henceforth the rescaled weights $w$ and $W$ shall be used with unscaled data.
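A Python sketch of the scaling step (37), together with its inverse for recovering unscaled predictions, is given below; applying the inverse transform to the predictions is an equivalent alternative to the weight rescaling of (38).

```python
import numpy as np

def scale(X):
    """Zero-mean, unit-variance scaling of Eq. (37), applied column-wise."""
    mean, std = X.mean(axis=0), X.std(axis=0)
    return (X - mean) / std, mean, std

def unscale(Xs, mean, std):
    """Inverse of Eq. (37): recover unscaled signals or predictions."""
    return Xs * std + mean
```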
The inputs to the NN are the regressors $\varphi(k)$ defined by (33). The input $\varepsilon(k)$, that is the initial error estimates given by (32), is not known in advance and is initialized to a small positive random matrix of dimension $n$ by $N$. The outputs of the NN are the predicted values of $\hat{y}(k)$ given by (34).

For assessing the convergence performance, the network was trained for $i$ = 100 epochs (number of iterations) with selected values of $n_{u}$, $n_{y}$, $m$, $n$, $d$, $n_{\varphi}$ (NNARMAX), $n_{h}$, $n_{o}$, $\alpha_{1}$ and $\alpha_{2}$. The details of these parameters are discussed in Section 3; where $n_{u}$ and $n_{y}$ are the numbers of inputs and outputs of the system, $m$ and $n$ are the orders of the regressors in terms of the past values, $n_{\varphi}$ is the total number of regressors (that is, the total number of inputs to the network), $n_{h}$ and $n_{o}$ are the numbers of hidden and output layer neurons, and $\alpha_{1}$ and $\alpha_{2}$ are the hidden and output layer weight decay terms. The four design parameters for the adaptive recursive least squares (ARLS) algorithm defined in (22) are selected to be: $\alpha$ = 0.5, $\beta$ = 5e-3, $\delta$ = 1e-5 and $\lambda$ = 0.99, resulting in $\gamma$ = 0.0101. The initial values for $\bar{e}_{min}$ and $\bar{e}_{max}$ in (23) are equal to 0.0102 and 1.0106e+3 respectively and were evaluated using (23). Thus, the ratio $\bar{e}_{max}/\bar{e}_{min}$ from (23) is 9.9018e+4, which implies that the parameters are well selected. Also, the step size $\mu$ is selected to initialize the INCBP algorithm given in (36).

The 1345 dry-weather training data samples are first scaled using equation (37), and the network is trained for 100 epochs using the proposed adaptive recursive least squares (ARLS) and the incremental back-propagation (INCBP) algorithms proposed in Sections 3.3 and 4.2.3. After network training, the trained network is rescaled according to (38), so that the resulting network can be used with unscaled AS-WWTP data. Although the convergence curves of the INCBP and the ARLS algorithms for 100 epochs each are not shown, the minimum performance indices for both algorithms are given in the third rows of Tables 4, 5, 6, 7 and 8 for the five reactors. As one can observe from these tables, the ARLS has a smaller performance index when compared to the INCBP, which is an indication of the good convergence property of the ARLS, at the expense of a higher computation time when compared to the small computation time used by the INCBP for 100 epochs, as is evident in the first rows of Tables 4, 5, 6, 7 and 8.

Table 4. Anaerobic reactor (Unit 1)
Table 5(a). Anoxic reactor (Unit 2)

Table 5(b). Anoxic reactor (Unit 2)

Table 6. First aerobic reactor (Unit 3)

Table 7. Second aerobic reactor (Unit 4)

Table 8(a). Third aerobic reactor (Unit 5)

Table 8(b). Third aerobic reactor (Unit 5)

Figure 7(a). One-step ahead prediction of scaled SNO1, QIN (Qinf), QR1 (Qras) and KLa1 training data

Figure 7(b). One-step ahead prediction of scaled SNO2, XP2, QA1 (Qirf), SND2 and KLa2 training data

Figure 7(c). One-step ahead prediction of scaled SO3, KLa3, SNO3 and XP3 training data

Figure 7(d). One-step ahead prediction of scaled SO4, KLa4, SNO4 and SS4 training data

Figure 7(e). One-step ahead prediction of scaled SO5, KLa5, QA2 (Qirn), QF1 (Qffr), SND5 and SNH5 training data

Figure 8(a). One-step ahead prediction of unscaled SNO1, QIN (Qinf), QR1 (Qras) and KLa1 validation data

Figure 8(b). One-step ahead prediction of unscaled SNO2, XP2, QA1 (Qirf), SND2 and KLa2 validation data

Figure 8(c). One-step ahead prediction of unscaled SO3, KLa3, SNO3 and XP3 validation data

Figure 8(d). One-step ahead prediction of unscaled SO4, KLa4, SNO4 and SS4 validation data

Figure 8(e). One-step ahead prediction of unscaled SO5, KLa5, QA2 (Qirn), QF1 (Qffr), SND5 and SNH5 validation data

Figure 9(a). 5-step ahead prediction of unscaled SNO1, QIN (Qinf), QR1 (Qras) and KLa1 training data

Figure 9(b). 5-step ahead prediction of unscaled SNO2, XP2, QA1 (Qirf), SND2 and KLa2 training data

Figure 9(c). 5-step ahead prediction of unscaled SO3, KLa3, SNO3 and XP3 training data

Figure 9(d). 5-step ahead prediction of unscaled SO4, KLa4, SNO4 and SS4 training data

Figure 9(e). 5-step ahead prediction of unscaled SO5, KLa5, QA2 (Qirn), QF1 (Qffr), SND5 and SNH5 training data
The activated sludge model describing the biological behaviour of each reactor incorporates thirteen different components [1], [27]–[29]. These components are classified into soluble components (denoted $S$) and particulate components (denoted $X$). The nomenclatures and parameter definitions used for describing the AS-WWTP in this work are given in Table 1. Moreover, four fundamental processes are considered: the growth and decay of biomass (heterotrophic and autotrophic), ammonification of organic nitrogen, and the hydrolysis of particulate organics. The typical schematic of the AS-WWTP is shown in Fig. 1.
The process rates $\rho_{1}$ to $\rho_{8}$ are defined as follows.

$\rho_{1}$: Aerobic growth of heterotrophs

$$\rho_{1} = \mu_{H} \left( \frac{S_{S}}{K_{S} + S_{S}} \right) \left( \frac{S_{O}}{K_{O,H} + S_{O}} \right) X_{B,H} \qquad (A.1)$$

$\rho_{2}$: Anoxic growth of heterotrophs

$$\rho_{2} = \mu_{H} \left( \frac{S_{S}}{K_{S} + S_{S}} \right) \left( \frac{K_{O,H}}{K_{O,H} + S_{O}} \right) \left( \frac{S_{NO}}{K_{NO} + S_{NO}} \right) \eta_{g}\, X_{B,H} \qquad (A.2)$$

$\rho_{3}$: Aerobic growth of autotrophs

$$\rho_{3} = \mu_{A} \left( \frac{S_{NH}}{K_{NH} + S_{NH}} \right) \left( \frac{S_{O}}{K_{O,A} + S_{O}} \right) X_{B,A} \qquad (A.3)$$

$\rho_{4}$: Decay of heterotrophs

$$\rho_{4} = b_{H}\, X_{B,H} \qquad (A.4)$$

$\rho_{5}$: Decay of autotrophs

$$\rho_{5} = b_{A}\, X_{B,A} \qquad (A.5)$$

$\rho_{6}$: Ammonification of soluble organic nitrogen

$$\rho_{6} = k_{a}\, S_{ND}\, X_{B,H} \qquad (A.6)$$

$\rho_{7}$: Hydrolysis of entrapped organics

$$\rho_{7} = k_{h}\, \frac{X_{S}/X_{B,H}}{K_{X} + X_{S}/X_{B,H}} \left[ \left( \frac{S_{O}}{K_{O,H} + S_{O}} \right) + \eta_{h} \left( \frac{K_{O,H}}{K_{O,H} + S_{O}} \right) \left( \frac{S_{NO}}{K_{NO} + S_{NO}} \right) \right] X_{B,H} \qquad (A.7)$$

$\rho_{8}$: Hydrolysis of entrapped organic nitrogen

$$\rho_{8} = \rho_{7} \left( X_{ND} / X_{S} \right) \qquad (A.8)$$
The observed conversion rates of the thirteen components result from combinations of basic processes (A.1) to (A.8) as follows:

$$r_{S_{I}} = 0 \qquad (A.9)$$

$$r_{S_{S}} = -\frac{1}{Y_{H}} (\rho_{1} + \rho_{2}) + \rho_{7} \qquad (A.10)$$

$$r_{X_{I}} = 0 \qquad (A.11)$$

$$r_{X_{S}} = (1 - f_{P})(\rho_{4} + \rho_{5}) - \rho_{7} \qquad (A.12)$$

$$r_{X_{B,H}} = \rho_{1} + \rho_{2} - \rho_{4} \qquad (A.13)$$

$$r_{X_{B,A}} = \rho_{3} - \rho_{5} \qquad (A.14)$$

$$r_{X_{P}} = f_{P}(\rho_{4} + \rho_{5}) \qquad (A.15)$$

$$r_{S_{O}} = -\frac{1 - Y_{H}}{Y_{H}}\, \rho_{1} - \frac{4.57 - Y_{A}}{Y_{A}}\, \rho_{3} \qquad (A.16)$$

$$r_{S_{NO}} = -\frac{1 - Y_{H}}{2.86\, Y_{H}}\, \rho_{2} + \frac{1}{Y_{A}}\, \rho_{3} \qquad (A.17)$$

$$r_{S_{NH}} = -i_{XB} (\rho_{1} + \rho_{2}) - \left( i_{XB} + \frac{1}{Y_{A}} \right) \rho_{3} + \rho_{6} \qquad (A.18)$$

$$r_{S_{ND}} = -\rho_{6} + \rho_{8} \qquad (A.19)$$

$$r_{X_{ND}} = (i_{XB} - f_{P}\, i_{XP})(\rho_{4} + \rho_{5}) - \rho_{8} \qquad (A.20)$$

$$r_{S_{ALK}} = -\frac{i_{XB}}{14}\, \rho_{1} + \left( \frac{1 - Y_{H}}{14 \times 2.86\, Y_{H}} - \frac{i_{XB}}{14} \right) \rho_{2} - \left( \frac{i_{XB}}{14} + \frac{1}{7 Y_{A}} \right) \rho_{3} + \frac{1}{14}\, \rho_{6} \qquad (A.21)$$
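For illustration, the sketch below evaluates the heterotrophic growth rates (A.1)-(A.2) and the growth part of the substrate conversion (A.10) in Python; the kinetic and stoichiometric values are typical defaults assumed only for the example.

```python
# Minimal evaluation of Eqs. (A.1), (A.2) and the growth part of (A.10).
# Parameter values below are common defaults (assumed for illustration).
mu_H, K_S, K_OH, K_NO, eta_g, Y_H = 4.0, 10.0, 0.2, 0.5, 0.8, 0.67

def heterotroph_rates(S_S, S_O, S_NO, X_BH):
    rho1 = mu_H * (S_S / (K_S + S_S)) * (S_O / (K_OH + S_O)) * X_BH       # (A.1)
    rho2 = (mu_H * (S_S / (K_S + S_S)) * (K_OH / (K_OH + S_O))
            * (S_NO / (K_NO + S_NO)) * eta_g * X_BH)                       # (A.2)
    return rho1, rho2

rho1, rho2 = heterotroph_rates(S_S=2.8, S_O=2.0, S_NO=5.4, X_BH=2500.0)
r_SS_growth = -(rho1 + rho2) / Y_H   # growth terms of Eq. (A.10); rho7 omitted
```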
The biological reactor is composed of five compartments: two non-aerated and three aerated. In Unit 5, the dissolved oxygen (DO) concentration is controlled at a level of 2 g(-COD)·m$^{-3}$ by manipulation of the oxygen transfer coefficient $K_{L}a_{5}$. Each of the five compartments has a flow rate $Q_{k}$, the concentration $Z_{k}$, and the reaction rate $r_{k}$; where $k = 1, 2, \ldots, 5$ indexes the compartments. The volume of the non-aerated compartments is 1000 m$^{3}$ each, while the volume of the aerated compartments is 1333 m$^{3}$ each. The general equations for the reactor mass balances are given as follows.

For $k = 1$ (Unit 1):

$$\frac{dZ_{1}}{dt} = \frac{1}{V_{1}} \left( Q_{a} Z_{a} + Q_{r} Z_{r} + Q_{0} Z_{0} + r_{1} V_{1} - Q_{1} Z_{1} \right) \qquad (B.1)$$

$$Q_{1} = Q_{a} + Q_{r} + Q_{0} \qquad (B.2)$$

For $k = 2$ to $k = 5$ (Unit 2, Unit 3, Unit 4 and Unit 5):

$$\frac{dZ_{k}}{dt} = \frac{1}{V_{k}} \left( Q_{k-1} Z_{k-1} + r_{k} V_{k} - Q_{k} Z_{k} \right) \qquad (B.3)$$

$$Q_{k} = Q_{k-1} \qquad (B.4)$$

For the special case of the dissolved oxygen concentration $S_{O,k}$:

$$\frac{dS_{O,k}}{dt} = \frac{1}{V_{k}} \left( Q_{k-1} S_{O,k-1} + r_{k} V_{k} + (K_{L}a)_{k} V_{k} \left( S_{O}^{sat} - S_{O,k} \right) - Q_{k} S_{O,k} \right) \qquad (B.5)$$

where the oxygen saturation concentration is $S_{O}^{sat} = 8$ g·m$^{-3}$. Also,

$$Z_{a} = Z_{5} \qquad (B.6)$$

$$Q_{f} = Q_{5} - Q_{a} \qquad (B.7)$$

$$Q_{u} = Q_{r} + Q_{w} \qquad (B.8)$$

$$Q_{e} = Q_{f} - Q_{u} = Q_{0} - Q_{w} \qquad (B.9)$$
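A compact Python sketch of the mass balances (B.1)-(B.5) for a single component is given below; the volumes follow the values stated above, and the treatment of oxygen as an elementwise add-on term follows (B.5). The influent and recycle values in a call would be illustrative numbers supplied by the user.

```python
import numpy as np

# Compartment volumes (m^3): two non-aerated, three aerated.
V = np.array([1000.0, 1000.0, 1333.0, 1333.0, 1333.0])

def reactor_rhs(Z, r, Q0, Z0, Qa, Za, Qr, Zr, KLa=None, SO_sat=8.0):
    """dZ_k/dt for one component across the 5 compartments, Eqs. (B.1)-(B.5).

    Z : (5,) concentrations; r : (5,) reaction rates; KLa : (5,) oxygen
    transfer coefficients if Z is dissolved oxygen, else None.
    """
    Q1 = Qa + Qr + Q0                       # Eq. (B.2); Q_k constant downstream, (B.4)
    dZ = np.empty(5)
    dZ[0] = (Qa * Za + Qr * Zr + Q0 * Z0 + r[0] * V[0] - Q1 * Z[0]) / V[0]  # (B.1)
    for k in range(1, 5):
        dZ[k] = (Q1 * Z[k - 1] + r[k] * V[k] - Q1 * Z[k]) / V[k]            # (B.3)
    if KLa is not None:                     # dissolved-oxygen special case, (B.5)
        dZ += KLa * (SO_sat - Z)
    return dZ
```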