American Journal of Intelligent Systems
p-ISSN: 2165-8978 e-ISSN: 2165-8994
2014; 4(3): 77-106
doi:10.5923/j.ajis.20140403.02
Vincent A. Akpan1, Reginald A. O. Osakwe2
1Department of Physics Electronics, The Federal University of Technology, P.M.B. 704, Akure, Ondo State, Nigeria
2Department of Physics, The Federal University of Petroleum Resources, P.M.B. 1221, Effurun, Delta State, Nigeria
Correspondence to: Vincent A. Akpan, Department of Physics Electronics, The Federal University of Technology, P.M.B. 704, Akure, Ondo State, Nigeria.
Copyright © 2014 Scientific & Academic Publishing. All Rights Reserved.
This paper presents the formulation and application of an online adaptive recursive least squares (ARLS) algorithm to the nonlinear model identification of the secondary settler/clarifier and the effluent tank of an activated sludge wastewater treatment plant (AS-WWTP). The performance of the proposed ARLS algorithm is compared with that of the so-called incremental backpropagation (INCBP), which is also an online identification algorithm. The algorithms are validated by one-step and five-step ahead prediction methods. Their performance is also assessed by using Akaike's method to estimate the final prediction error (AFPE) of the regularized criterion. The validation results show the superior performance of the proposed ARLS algorithm in terms of much smaller prediction errors when compared to the INCBP algorithm.
Keywords: Activated sludge wastewater treatment plant (AS-WWTP), Adaptive recursive least squares (ARLS), Artificial neural network (ANN), Benchmark simulation model No. 1 (BSM #1), Biological reactors, Effluent tank, Incremental backpropagation (INCBP), Nonlinear model identification, Nonlinear neural network autoregressive moving average with exogenous input (NNARMAX) model, Secondary settler and clarifier
Cite this paper: Vincent A. Akpan, Reginald A. O. Osakwe, Multivariable NNARMAX Model Identification of an AS-WWTP Using ARLS: Part 2 – Dynamic Modeling of the Secondary Settler and Clarifier, American Journal of Intelligent Systems, Vol. 4 No. 3, 2014, pp. 77-106. doi: 10.5923/j.ajis.20140403.02.
The AS-WWTP model distinguishes between soluble and particulate components. The nomenclatures and parameter definitions used for describing the AS-WWTP in this work are given in Table 1. Moreover, four fundamental processes are considered: the growth and decay of biomass (heterotrophic and autotrophic), the ammonification of organic nitrogen, and the hydrolysis of particulate organics.

Table 1. The AS-WWTP Nomenclatures and Parameter Definitions
Figure 1. The schematic of the AS-WWTP process
A general discrete-time multivariable nonlinear dynamic system with input $u(k)$, output $y(k)$ and disturbance $e(k)$ can be represented by the following Nonlinear AutoRegressive Moving Average with eXogenous inputs (NARMAX) model [24, 25]:

$y(k) = F[y(k-1), \ldots, y(k-n_y), u(k-d), \ldots, u(k-d-n_u), e(k-1), \ldots, e(k-n_e)] + e(k)$    (1)

where $F(\cdot)$ is a nonlinear function of its arguments, $u(\cdot)$ is the past input vector, $y(\cdot)$ is the past output vector, $y(k)$ is the current output, $n_u$ and $n_y$ are the numbers of past inputs and outputs respectively that define the order of the system, and $d$ is the time delay. The predictor form of (1), based on the information up to time $k$, can be expressed in the following compact form [24, 25]:

$\hat{y}(k|\theta) = F[\varphi(k), \theta]$    (2)

where $\varphi(k)$ is the regression (state) vector and $\theta$ is an unknown parameter vector which must be selected such that $\hat{y}(k|\theta) \approx y(k)$. The error between (1) and (2) is defined as

$\varepsilon(k) = y(k) - \hat{y}(k|\theta)$    (3)

The argument $\theta$ in $\hat{y}(k|\theta)$ of (2) is henceforth omitted for notational convenience. Note that $\hat{y}(k)$ has the same order and dimension as $y(k)$.
Now, let $\Theta = \{\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(i)}\}$ be a set of parameter vectors such that:

$\Theta \subset S$    (4)

where $S$ is some subset of $\mathbb{R}^{d_\theta}$ in which the search for $\theta$ is carried out; $d_\theta$ is the dimension of $\theta$; $\theta^{*}$ is the desired vector which minimizes the error in (3) and is contained in the set of vectors $\Theta$; $\theta^{(i)}$ are distinct values of $\theta$; and $i$ is the number of iterations required to determine $\theta^{*}$ from the vectors in $\Theta$. Let a set of $N$ input-output data pairs, obtained from prior system operation over $N_T$ periods of time, be defined as:

$Z^N = \{[u(k), y(k)],\ k = 1, \ldots, N\}$    (5)
The optimal parameter vector is then obtained by minimizing an identification criterion over $\Theta$:

$\hat{\theta} = \arg\min_{\theta \in \Theta} J(\theta, Z^N)$    (6)

where $J(\theta, Z^N)$ is formulated as a total square error (TSE) type cost function which can be stated as:

$J(\theta, Z^N) = \frac{1}{2N} \sum_{k=1}^{N} \varepsilon^{T}(k)\,\varepsilon(k)$    (7)

The inclusion of $Z^N$ as an argument in $J(\theta, Z^N)$ accounts for the dependency of the desired model $\hat{y}(k)$ on the training data. Thus, given an initial random value of $\theta$, the model structure, and (5), the system identification problem reduces to the minimization of (6) to obtain $\hat{\theta}$. For notational convenience, $\theta$ shall henceforth be used instead of $\hat{\theta}$.
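The TSE criterion (7) is straightforward to compute once the prediction errors are stacked sample-wise; a minimal NumPy sketch (the data here are illustrative stand-ins, not plant measurements):

```python
import numpy as np

def tse_criterion(y, y_hat):
    """Total-square-error criterion of Eq. (7):
    J = (1/2N) * sum_k eps(k)^T eps(k), with eps(k) = y(k) - y_hat(k)."""
    e = np.asarray(y, float) - np.asarray(y_hat, float)  # N x n_y error matrix
    N = e.shape[0]
    return 0.5 / N * np.sum(e * e)
```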
Figure 2. Architecture of the dynamic feedforward NN (DFNN) model

Figure 3. NN model identification based on the teacher-forcing method
Consider $\hat{y}(k)$ as the desired model of the network, having the DFNN architecture shown in Fig. 2. The proposed NN model identification scheme based on the teacher-forcing method is illustrated in Fig. 3; note that the "Neural Network Model" shown in Fig. 3 is the DFNN of Fig. 2. The inputs to the NN of Fig. 3 are the past inputs, past outputs and past error estimates, which are concatenated into the regression vector $\varphi(k)$ as shown in Fig. 2. The output of the NN model of Fig. 3, in terms of the network parameters of Fig. 2, is given as:

$\hat{y}_i(k) = F_i\Big[\sum_{j=1}^{n_h} W_{ij}\, f_j\Big(\sum_{l=1}^{n_\varphi} w_{jl}\, \varphi_l(k) + b_j\Big) + B_i\Big], \qquad i = 1, \ldots, n_o$    (8)

where $n_h$ and $n_\varphi$ are the number of hidden neurons and the number of regressors respectively; $n_o$ is the number of outputs; $w$ and $W$ are the hidden and output weights respectively; $b$ and $B$ are the hidden and output biases; $F_i$ is a linear activation function for the output layer; and $f_j$ is a hyperbolic tangent activation function for the hidden layer, defined here as:

$f(x) = \tanh(x) = \dfrac{1 - e^{-2x}}{1 + e^{-2x}}$    (9)
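The two-layer forward pass of (8)-(9), with tanh hidden units and a linear output layer, can be sketched as follows; the weight names mirror the text, while the dimensions in the example are purely illustrative:

```python
import numpy as np

def nnarmax_predict(phi, w, b, W, B):
    """One forward pass of the DFNN predictor, Eq. (8):
    hidden layer tanh(w @ phi + b), Eq. (9); linear output W @ h + B."""
    h = np.tanh(w @ phi + b)   # n_h hidden activations
    return W @ h + B           # n_o linear outputs

# Illustrative dimensions: 3 regressors, 2 hidden neurons, 1 output.
rng = np.random.default_rng(0)
w = rng.standard_normal((2, 3)); b = rng.standard_normal(2)
W = rng.standard_normal((1, 2)); B = rng.standard_normal(1)
y_hat = nnarmax_predict(np.array([0.1, -0.2, 0.3]), w, b, W, B)
```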
The parameter vector $\theta$ is a collection of all the network weights and biases in (8), in terms of the matrices $w$, $W$, $b$ and $B$. Equation (8) is here referred to as the NN NARMAX (NNARMAX) model predictor for simplicity. Note that $e(k)$ in (1) is unknown but is estimated here as a covariance noise matrix $\Lambda$. Using $\Lambda$, Equation (7) can be rewritten as:

$J(\theta, Z^N) = \frac{1}{2N} \sum_{k=1}^{N} \varepsilon^{T}(k)\, \Lambda^{-1} \varepsilon(k) + \frac{1}{2N}\, \theta^{T} D\, \theta$    (10)

where $D = \mathrm{diag}(\alpha_1 I, \alpha_2 I)$ is a penalty norm which also removes ill-conditioning, $I$ is an identity matrix, and $\alpha_1$ and $\alpha_2$ are the weight decay values for the input-to-hidden and hidden-to-output layers respectively. Note that both $\alpha_1$ and $\alpha_2$ are adjusted simultaneously during network training with $\theta$ and are used to update $\Lambda$ iteratively. The algorithm for estimating the covariance noise matrix and updating $\Lambda$ is summarized in Table 2; it is implemented at each sampling instant until $J(\theta, Z^N)$ has reduced significantly, as in step 7).
The network parameters are updated iteratively according to:

$\theta^{(i+1)} = \theta^{(i)} + \mu^{(i)} f^{(i)}$    (11)

$f^{(i)} = -\big[R(\theta^{(i)})\big]^{-1} G(\theta^{(i)})$    (12)

where $\theta^{(i)}$ denotes the value of $\theta$ at the current iterate $i$, $f^{(i)}$ is the search direction, and $G(\theta^{(i)})$ and $R(\theta^{(i)})$ are the Jacobian (or gradient) matrix and the Gauss-Newton Hessian matrix evaluated at $\theta^{(i)}$. As mentioned earlier, due to the dependency of the model $\hat{y}(k)$ on the regression vector $\varphi(k)$, the NNARMAX model predictor depends on a posteriori error estimates through the feedback shown in Fig. 2. The derivative of the network outputs with respect to $\theta$, evaluated at $\theta^{(i)}$, is given as

$\psi(k) = \dfrac{\partial \hat{y}(k)}{\partial \theta}\bigg|_{\theta = \theta^{(i)}}$    (13)

Carrying the differentiation through the feedback paths of Fig. 2 yields the recursions (14) and (15); collecting the moving-average (error feedback) terms, (15) reduces to the filtered form (16), which depends on the prediction error based on the predicted output. Equation (16) is the only component that actually impedes the implementation of the NN training algorithms, depending on its computation. Due to the feedback signals, the NNARMAX model predictor may be unstable if the system to be identified is not stable, since the roots of (16) may, in general, not lie within the unit circle. The approach proposed here to iteratively ensure that the predictor becomes stable is summarized in the algorithm of Table 3. This algorithm ensures that the roots of (16) lie within the unit circle before the weights are updated by the training algorithm proposed in the next sub-section.
The proposed ARLS algorithm handles the training data as follows: 1) the data are fed progressively, in a first-in first-out fashion, into a sliding window; 2) $\theta$ is updated after a complete sweep through the window; and 3) the whole procedure is repeated $i$ times. Thus, Equation (10) can be expressed as [23], [30]:

$J_k(\theta, Z^N) = \frac{1}{2N} \sum_{t=1}^{k} \lambda^{\,k-t}\, \varepsilon^{T}(t)\, \Lambda^{-1} \varepsilon(t) + \frac{1}{2N}\, \theta^{T} D\, \theta$    (17)

where $\lambda$ is the exponential forgetting and resetting parameter for discarding old information as new data are acquired online and progressively added to the set $Z^N$. Assuming that $\hat{\theta}(k-1)$ minimized (17) at time $k-1$, then, using (17), the updating rule for the proposed ARLS algorithm can be expressed from (11) as:

$\hat{\theta}(k) = \hat{\theta}(k-1) + P(k)\, \psi(k)\, \varepsilon(k)$    (18)

with the information matrix given as

$R(k) = \lambda R(k-1) + \psi(k)\, \psi^{T}(k)$    (19)

where $\psi(k)$ is computed according to (16). In order to avoid the inversion of $R(k)$, Equation (19) is first computed as a covariance matrix estimate $P(k) = R^{-1}(k)$, as

$P(k) = \frac{1}{\lambda}\left[P(k-1) - \dfrac{P(k-1)\, \psi(k)\, \psi^{T}(k)\, P(k-1)}{\lambda + \psi^{T}(k)\, P(k-1)\, \psi(k)}\right]$    (20)
By introducing the design constants $\alpha$, $\beta$ and $\delta$, Equation (20) can also be expressed equivalently in the exponential forgetting and resetting form

$P(k) = \frac{1}{\lambda} P(k-1) - \gamma\, \dfrac{P(k-1)\, \psi(k)\, \psi^{T}(k)\, P(k-1)}{1 + \psi^{T}(k)\, P(k-1)\, \psi(k)} + \beta I - \delta P^{2}(k-1)$    (21)

where $\gamma$ is the adaptation factor computed from the design parameters, $I$ is an identity matrix of appropriate dimension, and $\alpha$, $\beta$, $\delta$ and $\lambda$ are four design parameters selected such that the conditions in (22) are satisfied [22], [23], [30]. The parameter $\alpha$ in $\gamma$ adjusts the gain of (21); $\beta$ is a small constant that is inversely related to the maximum eigenvalue of $P(k)$; $\lambda$ is the exponential forgetting factor, selected such that $0 < \lambda \leq 1$; and $\delta$ is a small constant related to the minimum ($\bar{e}_{min}$) and maximum ($\bar{e}_{max}$) eigenvalues of (21), given respectively by (23) [22], [23], [30], [31]. The initial value of $P(k)$, that is $P(0)$, is selected as a large positive-definite diagonal matrix [27]. Thus, the ARLS parameter update based on the exponential forgetting and resetting method is given from (18) as

$\hat{\theta}(k) = \hat{\theta}(k-1) + P(k)\, \psi(k)\, \varepsilon(k)$, with $P(k)$ computed from (21)    (24)

Note that after $\hat{\theta}(k)$ has been obtained, the algorithm of Table 2 is implemented until the conditions in Step 7) of the Table 2 algorithm are satisfied.
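The ARLS covariance recursion can be sketched as an exponential-forgetting-and-resetting (EFRA-style) update. The sketch below treats the scalar-error case, uses the design values quoted later in the text ($\alpha$ = 0.5, $\beta$ = 5e-3, $\delta$ = 1e-5, $\lambda$ = 0.99) as defaults, and follows the classical EFRA form rather than reproducing the paper's exact Table 2/Table 3 bookkeeping:

```python
import numpy as np

def efra_update(theta, P, psi, eps, lam=0.99, alpha=0.5, beta=5e-3, delta=1e-5):
    """One ARLS step: parameter update in the spirit of Eq. (18) plus an
    EFRA-style covariance recursion in the spirit of Eq. (21).
    theta : current parameter vector; P : covariance matrix
    psi   : gradient vector; eps : scalar prediction error."""
    psi = psi.reshape(-1, 1)                       # gradient as column vector
    denom = 1.0 + float(psi.T @ P @ psi)
    K = P @ psi / denom                            # update gain
    theta = theta + (K * eps).ravel()              # parameter update, Eq. (18)
    n = P.shape[0]
    P = (P / lam                                   # exponential forgetting
         - alpha * (P @ psi @ psi.T @ P) / denom   # rank-one information term
         + beta * np.eye(n)                        # resetting: eigenvalue floor
         - delta * P @ P)                          # resetting: eigenvalue cap
    return theta, P
```

Because of the $\beta I$ and $-\delta P^2$ terms, the eigenvalues of $P(k)$ stay bounded away from zero and from infinity, which is exactly what the bounds $\bar{e}_{min}$ and $\bar{e}_{max}$ of (23) express.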
The second validation method is Akaike's final prediction error (AFPE) estimate of the regularized criterion. Taking the expectation of the TSE portion of (3), using the trained network with $\hat{\theta}$, with respect to the training data and the noise leads to the following AFPE estimate [23, 30, 31]:

(25)

where $\mathrm{tr}(\cdot)$ is the trace of its argument, computed as the sum of the diagonal elements of its argument, and the remaining term is a positive quantity that improves the accuracy of the estimate.
The third method is the K-step ahead prediction [10], in which the outputs of the trained network are compared with the unscaled output training data. The K-step ahead predictor follows directly from (8); for prediction horizon $K$, the measured outputs in the regression vector are replaced by previously predicted outputs, giving:

(26)

The mean value of the K-step ahead prediction error (MVPE) between the predicted output and the actual training data set is computed as follows:

(27)

where $y(k)$ corresponds to the unscaled output training data and $\hat{y}(k+K|k)$ to the K-step ahead predictor output.
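One common reading of an MVPE-type measure such as (27), whose exact expression is not reproduced here, is a mean absolute percentage error between the unscaled outputs and the K-step predictions; a sketch under that assumption:

```python
import numpy as np

def mvpe(y, y_pred):
    """Mean value of the K-step ahead prediction error, interpreted here as a
    mean absolute percentage error (an assumption; the source's Eq. (27) is
    not reproduced verbatim). Inputs are unscaled output data and the
    corresponding K-step ahead predictions."""
    y = np.asarray(y, float)
    y_pred = np.asarray(y_pred, float)
    return 100.0 * np.mean(np.abs((y - y_pred) / y))
```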
The inputs and outputs of the effluent tank are denoted with subscripts e and pe respectively, and those of the secondary settler (with suspended solids SS) with subscripts s and ps respectively. Each unit is described by thirteen states. Out of the thirteen states, only four are measurable, namely: $S_S$ (readily biodegradable substrate), $X_{B,H}$ (active heterotrophic biomass), $S_O$ (dissolved oxygen) and $S_{NO}$ (nitrate and nitrite nitrogen). Additional important parameters, the total suspended solids in the effluent tank and in the secondary settler/clarifier, are used to assess the amount of solids in these two units respectively. As highlighted above, the main objective here is efficient neural network model identification, to obtain a multivariable NNARMAX model equivalent of the secondary settler/clarifier and the effluent tank of the activated sludge wastewater treatment plant (AS-WWTP), with a view to using the obtained model together with those from [22] to design a multivariable intelligent adaptive predictive controller for the AS-WWTP process in our future work. Thus, from the discussions so far, the measured inputs that influence the behaviour of the AS-WWTP shown in Fig. 4 and Fig. 5 are:

(28)

(29)
Figure 5. The neural network model identification scheme for AS-WWTP based on NNARMAX model
The regression vectors for the NNARMAX model predictors discussed in Section 4 are defined here in Equations (30), (31), (32) and (33) below. The outputs of the neural network for the secondary settler with the clarifier and for the effluent tank are the predicted values of the 13 states of each unit together with the amount of total suspended solids (TSS), thus resulting in fourteen states per unit; these, together with the outputs of blocks #1 and #2 in Fig. 4, give 41 outputs to be predicted at each sampling instant, as given in (34) below.

(30)

(31)

(32)

(33)

(34)
The disturbances affecting the AS-WWTP are incorporated into the dry-weather data provided by the COST Action Group; additional sinusoidal disturbances with non-smooth nonlinearities are introduced for closed-loop performance evaluation based on an updated neural network model at each sampling instant.

Figure 6. Open-loop steady-state benchmark simulation model No. 1 (BSM1) with constant influent
The incremental back-propagation (INCBP) algorithm used for comparison is obtained by replacing the covariance matrix $P(k)$ on the left-hand side of (20) in Section 4.3.1, under the formulation of the ARLS algorithm, with a constant step-size matrix; that is:

$P(k) = \mu I$    (35)

where $\mu$ is the step size and $I$ is an identity matrix of appropriate dimension. Next, the basic back-propagation update given from [27] as:

$\hat{\theta}(k) = \hat{\theta}(k-1) + \mu\, \psi(k)\, \varepsilon(k)$    (36)

is used to carry out the recursive parameter update, with the gradient $\psi(k)$ computed as in (16). Before training, the input-output data are scaled to zero mean and unit variance:

$\bar{u}(k) = \dfrac{u(k) - m_u}{\sigma_u}, \qquad \bar{y}(k) = \dfrac{y(k) - m_y}{\sigma_y}$    (37)
where $m_u$, $m_y$ and $\sigma_u$, $\sigma_y$ are the means and standard deviations of the input and output training data pairs, and $\bar{u}(k)$ and $\bar{y}(k)$ are the scaled inputs and outputs respectively. After network training, the joint weights are rescaled according to the expression in (38), so that the resulting network can be used directly with unscaled data:

(38)

In the sequel, the scaled quantities $\bar{u}(k)$ and $\bar{y}(k)$ shall be used.
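The z-score scaling of (37) and its inverse can be sketched as follows; note that the paper instead rescales the trained weights via (38), whereas this sketch simply inverts the data transform:

```python
import numpy as np

def scale(data):
    """Z-score scaling as in Eq. (37): subtract the column-wise mean and
    divide by the column-wise standard deviation of the training set."""
    m, s = data.mean(axis=0), data.std(axis=0)
    return (data - m) / s, m, s

def unscale(scaled, m, s):
    """Invert the scaling so predictions can be compared with raw data."""
    return scaled * s + m
```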
The regression vector of the network is defined by (33). The input error sequence, that is the initial error estimate given by (32), is not known in advance and is initialized to a small positive random matrix of appropriate dimension. The outputs of the NN are the predicted values given by (34). For assessing the convergence performance, the network was trained for 100 epochs (number of iterations). The selected structural parameters are discussed in Section 3; they comprise the numbers of inputs and outputs of the system, the orders of the regressors in terms of past values, the total number of regressors (that is, the total number of inputs to the network), the numbers of hidden- and output-layer neurons, and the hidden- and output-layer weight decay terms $\alpha_1$ and $\alpha_2$. The four design parameters for the adaptive recursive least squares (ARLS) algorithm defined in (22) were selected as $\alpha = 0.5$, $\beta = 5\times 10^{-3}$, $\delta = 1\times 10^{-5}$ and $\lambda = 0.99$, resulting in $\gamma = 0.0101$. The initial values of $\bar{e}_{min}$ and $\bar{e}_{max}$ in (23), evaluated using (23), are 0.0102 and $1.0106\times 10^{3}$ respectively; their ratio $\bar{e}_{max}/\bar{e}_{min} = 9.9018\times 10^{4}$ implies that the parameters are well selected. A fixed step size $\mu$ was selected to initialize the INCBP algorithm given in (36). The 1345 dry-weather training data points are first scaled using Equation (37) and the network is trained for
100 epochs using the proposed adaptive recursive least squares (ARLS) and the incremental back-propagation (INCBP) algorithms of Sections 4.3 and 5.2.3. After training, the trained network is rescaled according to (38), so that the resulting network can be used with unscaled AS-WWTP data. Although the convergence curves of the INCBP and ARLS algorithms for 100 epochs each are not shown, the minimum performance indices for both algorithms are given in the third rows of Tables 4 and 5 for the secondary settler/clarifier and the effluent tank. As can be observed from these tables, the ARLS has a smaller performance index than the INCBP, an indication of the good convergence property of the ARLS, obtained at the expense of a higher computation time than that used by the INCBP for 100 epochs, as evident in the first rows of Tables 4 and 5. The total square error (TSE), discussed in subsection 4.1, for the network trained with the INCBP and ARLS algorithms is given in the second rows of Tables 4 and 5. Again, the ARLS algorithm has smaller errors and smaller minimum performance indices than the INCBP algorithm. These small values of the total square error (TSE) and of the minimum performance indices indicate that the ARLS performs better than the INCBP for the same number of iterations (epochs), and suggest that the ARLS model approximates the secondary settler/clarifier and effluent tank of the AS-WWTP process more closely than the INCBP model.

Table 4(a). Secondary settler and clarifier with the flow rates parameters predictions
Table 4(b). Secondary settler and clarifier with the flow rates parameters predictions
Table 4(c). Secondary settler and clarifier with the flow rates parameters predictions
Table 4(d). Secondary settler and clarifier with the flow rates parameters predictions
Table 5(a). Effluent tank and the effluent constrained parameters predictions
Table 5(b). Effluent tank and the effluent constrained parameters predictions
Figure 7(a). One-step ahead prediction of scaled QF2, QW, QR2 and QF3 training data
Figure 7(b). One-step ahead prediction of scaled QE_ps, EQ_ps, Ntotal_ps and TSS_ps training data
Figure 7(c). One-step ahead prediction of scaled COD_ps, BOD_ps, SNO_ps and SNH_ps training data
Figure 7(d). One-step ahead prediction of scaled TSS_pe, BOD_pe, COD_pe and Ntotal_pe training data
Figure 7(e). One-step ahead prediction of scaled EQ_pe, SNO_pe and SNH_pe training data
Figure 8(a). One-step ahead prediction of unscaled QF2, QW, QR2 and QF3 validation data
Figure 8(b). One-step ahead prediction of unscaled QE_ps, EQ_ps, Ntotal_ps and TSS_ps validation data
Figure 8(c). One-step ahead prediction of unscaled COD_ps, BOD_ps, SNO_ps and SNH_ps validation data
Figure 8(d). One-step ahead prediction of unscaled TSS_pe, BOD_pe, COD_pe and Ntotal_pe validation data
Figure 8(e). One-step ahead prediction of unscaled EQ_pe, SNO_pe and SNH_pe validation data
Figure 9(a). Five-step ahead prediction of unscaled QF2, QW, QR2 and QF3 training data
Figure 9(b). Five-step ahead prediction of unscaled QE_ps, EQ_ps, Ntotal_ps and TSS_ps training data
Figure 9(c). Five-step ahead prediction of unscaled COD_ps, BOD_ps, SNO_ps and SNH_ps training data
Figure 9(d). Five-step ahead prediction of unscaled TSS_pe, BOD_pe, COD_pe and Ntotal_pe training data
Figure 9(e). Five-step ahead prediction of unscaled EQ_pe, SNO_pe and SNH_pe training data
The settler is modelled as a series of ten horizontal layers. The height of each layer is 0.4 m, for a total height of 4 m; the settler therefore has a total volume equal to 6000 m³. The solids flux due to gravity is

$J_s = v_s(X)\, X$    (B.1)

where $X$ is the total sludge concentration and $v_s(X)$ is a double-exponential settling velocity function defined as:

$v_s(X) = \max\Big\{0,\ \min\big[v_0',\ v_0\big(e^{-r_h (X - X_{min})} - e^{-r_p (X - X_{min})}\big)\big]\Big\}, \qquad X_{min} = f_{ns} X_f$    (B.2)

where $X_f$ is the total sludge concentration entering the settler. The parameter values for the double-exponential settling velocity function (B.2) are given in Table B.1. Thus, the mass balances for the sludge are expressed as follows: for the feed layer, (B.3); for the layers below the feed layer, (B.4); for the bottom layer, (B.5); for the layers above the feed layer, (B.6); and for the top layer, (B.7).
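The double-exponential settling velocity (B.2) can be sketched directly; the default parameter values below are the commonly used BSM1 settler values ($v_0'$ = 250 m/d, $v_0$ = 474 m/d, $r_h$ = 5.76e-4 and $r_p$ = 2.86e-3 m³/(g SS), $f_{ns}$ = 2.28e-3) and are assumptions standing in for the paper's Table B.1:

```python
import numpy as np

def settling_velocity(X, X_f, v0p=250.0, v0=474.0,
                      r_h=5.76e-4, r_p=2.86e-3, f_ns=2.28e-3):
    """Double-exponential settling velocity, Eq. (B.2), in m/d.
    X   : sludge concentration in the layer (g SS/m^3)
    X_f : sludge concentration entering the settler (g SS/m^3)
    Defaults are the usual BSM1 values (an assumption here)."""
    X_min = f_ns * X_f                     # non-settleable fraction
    Xs = X - X_min
    v = v0 * (np.exp(-r_h * Xs) - np.exp(-r_p * Xs))
    return max(0.0, min(v0p, v))           # clipped to [0, v0']

def gravity_flux(X, X_f):
    """Solids flux due to gravity, Eq. (B.1): J_s = v_s(X) * X."""
    return settling_velocity(X, X_f) * X
```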
For the soluble components (including dissolved oxygen), each layer represents a completely mixed volume and the concentrations of the soluble components are given accordingly: for the feed layer, (B.8); for the layers below the feed layer, (B.9); and for the layers above the feed layer, (B.10). The corresponding boundary condition is given by (B.11). The sludge concentration entering the settler is computed from the particulate concentrations in Unit 5 of Fig. 1 as:

$X_f = 0.75\,\big(X_S + X_I + X_{B,H} + X_{B,A} + X_P\big)$    (B.12)

where the factor 0.75 converts the particulate COD concentrations to suspended solids. The same calculation is applied for the sludge concentration in the settler underflow and at the plant exit. To calculate the distribution of particulate concentrations in the recycled and waste flows, their ratios with respect to the total solids concentration are assumed to remain constant across the settler:

(B.13)

for each of the particulate components $X_S$, $X_I$, $X_{B,H}$, $X_{B,A}$ and $X_P$.
The assumption made here means that the dynamics of the fractions of particulate concentrations at the inlet of the settler are directly propagated to the settler underflow, without taking into account the normal retention time in the settler [1-4].

Appendix B.2: Sludge Age

Appendix B.2.1: Sludge Age Based on Total Amount of Biomass

In the steady-state case, the sludge age calculation is based on the total amount of biomass present in the system, that is, in the reactor and the settler:

(B.14)

where the total amount of biomass present in the reactor is expressed as (B.15); the total amount of biomass present in the settler is expressed as (B.16); the loss rate of biomass in the effluent is expressed as (B.17); and the loss rate of biomass in the waste flow is expressed as (B.18). The sludge age can equivalently be computed from the total amount of solids present in the system:

(B.19)

where the total amount of solids present in the reactor is expressed as (B.20) and (B.21); the total amount of solids present in the settler is expressed as (B.22) and (B.23); the loss rate of solids in the effluent is expressed as (B.24) and (B.25); and the loss rate of solids in the waste flow is expressed as (B.26).
(C.1)

Appendix C.2: Effluent Quality and Constraints

Appendix C.2.1: Effluent Quality (EQ)

The effluent quality (EQ), in kg pollution units/d, is averaged over the period of observation $T$ (i.e. the second week, or last 7 days, of each weather file). It is based on a weighting of the effluent loads of compounds that have a major influence on the quality of the receiving water and that are usually included in regional legislation, and it is defined as [2]-[4]:

$EQ = \dfrac{1}{1000\, T} \int_{t} \big[B_{TSS}\, TSS_e(t) + B_{COD}\, COD_e(t) + B_{NKj}\, S_{NKj,e}(t) + B_{NO}\, S_{NO,e}(t) + B_{BOD}\, BOD_e(t)\big]\, Q_e(t)\, dt$    (C.2)

where $B_{TSS}$, $B_{COD}$, $B_{NKj}$, $B_{NO}$ and $B_{BOD}$ are weighting factors for the different types of pollution, converting them into pollution units; they were chosen to reflect the calculated fractions as $B_{TSS} = 2$, $B_{COD} = 1$, $B_{NKj} = 30$, $B_{NO} = 10$ and $B_{BOD} = 2$.
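Given sampled effluent data, the EQ integral (C.2) can be approximated by trapezoidal integration; a sketch in which the weighting factors are the BSM1-style values quoted above and all variable names are illustrative:

```python
import numpy as np

# Assumed BSM1-style weighting factors, matching the values quoted above.
B = {"TSS": 2.0, "COD": 1.0, "NKj": 30.0, "NO": 10.0, "BOD": 2.0}

def effluent_quality(t, conc, Q_e, T=7.0):
    """Approximate the EQ integral (C.2), in kg pollution units/d.
    t    : sample times (days) over the observation period
    conc : dict of effluent concentration arrays (g/m^3), keyed like B
    Q_e  : effluent flow rate samples (m^3/d)"""
    t = np.asarray(t, float)
    load = sum(B[k] * np.asarray(v, float) for k, v in conc.items())
    load = load * np.asarray(Q_e, float)           # pollution load in g/d
    # trapezoidal rule, written out to stay version-independent
    integral = float(np.sum(0.5 * (load[1:] + load[:-1]) * np.diff(t)))
    return integral / (1000.0 * T)                 # g -> kg, time average
```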
The major operating cost in biological nutrient removal processes, as in nitrogen-removing ASPs, is blower energy consumption; if the DO set-point is reduced by a control strategy, significant energy savings can be achieved. Operational issues are considered through three items: sludge production, pumping energy and aeration energy (integrations are performed over the final 7 days of the weather simulations, i.e. from day 22 to day 28 of each weather file).

Appendix C.2.2: Constraints on the Effluent Quality

The flow-averaged values of the effluent concentrations over the three test periods (dry, rain and storm weather: 7 days each) should be constrained within the following limits for the five effluent components: total nitrogen $N_{tot} < 18$ g N/m³, total COD $< 100$ g COD/m³, ammonia $S_{NH} < 4$ g N/m³, suspended solids $TSS < 30$ g SS/m³ and $BOD_5 < 10$ g BOD/m³.
Appendix C.3: The Sludge Production to be Disposed

The sludge production to be disposed is calculated from the total solids flow from wastage and the solids accumulated in the system over the period of time considered (the last 7 days of each weather file). The amount of solids in the system at time $t$ is given by:

$M_{TSS,system}(t) = M_{TSS,reactor}(t) + M_{TSS,settler}(t)$    (C.3)

where the amount of solids in the reactor is given by (C.4) and the amount of solids in the settler is given by (C.5). The change in system sludge mass is taken from the end of day 7 to the end of day 14, and the amount of waste sludge is given by (C.6); the sludge production to be disposed then follows as (C.7).

Appendix C.4: The Total Sludge Production

The total sludge production takes into account the sludge to be disposed and the sludge lost to the weir, and is calculated as follows:

(C.8)
Appendix C.5: The Pumping Energy (PE)

The pumping energy, in kWh/d, is calculated as follows:

$PE = \dfrac{0.04}{T} \int_{t} \big[Q_a(t) + Q_r(t) + Q_w(t)\big]\, dt$    (C.9)

where $Q_a(t)$ is the internal recycle flow rate at time $t$, $Q_r(t)$ is the return sludge recycle flow rate at time $t$, and $Q_w(t)$ is the waste sludge flow rate at time $t$.
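A discretized sketch of (C.9) using trapezoidal integration; the sample times and flow-rate arrays are illustrative, and the 0.04 kWh per m³ factor is the BSM1 pumping-energy coefficient assumed here:

```python
import numpy as np

def pumping_energy(t, Q_a, Q_r, Q_w):
    """Pumping energy, Eq. (C.9): PE = (0.04 / T) * integral(Q_a + Q_r + Q_w) dt,
    approximated by trapezoidal integration over the observation period.
    t in days; flow rates in m^3/d; result in kWh/d."""
    t = np.asarray(t, float)
    q = np.asarray(Q_a, float) + np.asarray(Q_r, float) + np.asarray(Q_w, float)
    T = t[-1] - t[0]                                       # observation period
    integral = float(np.sum(0.5 * (q[1:] + q[:-1]) * np.diff(t)))
    return 0.04 * integral / T
```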
Appendix C.6: The Aeration Energy (AE)

The aeration energy (AE), in kWh/d, takes into account the plant peculiarities (type of diffuser, bubble size, depth of submersion, etc.) and is calculated from the $K_La$ values in the three aerated tanks according to the following relation, valid for Degrémont DP230 porous disks at an immersion depth of 4 m:

(C.10)

where $K_La_i$ is the oxygen transfer coefficient of compartment $i$. The increase in capacity that could be obtained using the proposed control strategy should also be evaluated. This factor is relative to the investment costs if the plant were simply extended to deal with the increased load. It is expressed by the relative increase in the influent flow rate that can be applied while maintaining the reference effluent quality index EQ for the three weather conditions (7 days each), and it is calculated from the above equation in open loop, with different attainable increases for dry, storm and rain weather. Operation variables such as the $K_La$ values in compartments 3 and 4 remain unchanged.