American Journal of Intelligent Systems
p-ISSN: 2165-8978 e-ISSN: 2165-8994
2017; 7(1): 1-18
doi:10.5923/j.ajis.20170701.01

Vincent A. Akpan1, Michael T. Babalola2, Reginald A. O. Osakwe3
1Department of Physics Electronics, The Federal University of Technology, Akure, Nigeria
2Department of Physics Electronics, Afe Babalola University, Ado-Ekiti, Nigeria
3Department of Physics, The Federal University of Petroleum Resources, Effurun, Nigeria
Correspondence to: Vincent A. Akpan, Department of Physics Electronics, The Federal University of Technology, Akure, Nigeria.
Copyright © 2017 Scientific & Academic Publishing. All Rights Reserved.
This work is licensed under the Creative Commons Attribution International License (CC BY).
http://creativecommons.org/licenses/by/4.0/

The dynamic modeling of an electromechanical motor system (EMS) for different input voltages under different applied weights, and the corresponding output revolutions per minute, using neural networks (NN) is presented in this paper with a view to quantifying the effect of the applied voltages and weights on the system output. The input-output data, i.e. the electrical input voltage and the revolutions per minute (rpm) of a PORCH PPWPM432 permanent magnet direct current (PMDC) motor obtained from the EMS, have been used for the development of a dynamic model of the EMS. This paper presents the formulation and application of an online modified Levenberg-Marquardt algorithm (MLMA) for the nonlinear model identification of the EMS. The performance of the proposed MLMA algorithm is compared with the so-called error back-propagation with momentum (EBPM) algorithm, which is a modified version of the standard back-propagation algorithm for training NNs. The MLMA and the EBPM algorithms are validated by one-step and five-step ahead prediction methods. The performances of the two algorithms are assessed by using Akaike's method to estimate the final prediction error (AFPE) of the regularized criterion. The validation results show the superior performance of the proposed MLMA algorithm in terms of much smaller prediction errors when compared to the EBPM algorithm. Furthermore, the simulation results show that the proposed techniques and algorithms can be adapted and deployed for modeling the dynamics of the EMS and for predicting its future behaviour in real-life scenarios. In addition, the dynamic modeling of the EMS in closed-loop with a discrete-time fixed-parameter proportional-integral-derivative (PID) controller has been conducted using networks trained with both the EBPM and the MLMA algorithms. The simulation results demonstrate the efficiency and reliability of the proposed dynamic modeling using the MLMA and the closed-loop PID control scheme. Despite the slightly degraded performance of the PID controller, the accuracy of the NN model trained with the MLMA when used in a dynamic operating environment has been confirmed.
Keywords: Artificial neural network (ANN), Dynamic modeling, Electromechanical motor systems (EMS), Error back-propagation with momentum (EBPM), Modified Levenberg-Marquardt algorithm (MLMA), Neural network nonlinear autoregressive moving average with exogenous inputs (NNARMAX), Nonlinear model identification, Proportional-integral-derivative (PID) control
Cite this paper: Vincent A. Akpan, Michael T. Babalola, Reginald A. O. Osakwe, Neural Network-Based Adaptive Speed Controller Design for Electromechanical Systems (Part 2: Dynamic Modeling Using MLMA & Closed-Loop Simulations), American Journal of Intelligent Systems, Vol. 7 No. 1, 2017, pp. 1-18. doi: 10.5923/j.ajis.20170701.01.
Figure 1. The designed electromechanical motor system: (a) Schematic drawing of the electromechanical system and (b) 3-D drawing of the electromechanical speed control system

Figure 2. The picture of the completely designed and constructed electromechanical motor system
A nonlinear dynamic system with input $u(t)$ and output $y(t)$ with disturbance $e(t)$ can be represented by the following Nonlinear AutoRegressive Moving Average with eXogenous inputs (NARMAX) model:

$$y(t) = f\big(y(t-1),\ldots,y(t-n_a),\ u(t-d),\ldots,u(t-d-n_b+1),\ e(t-1),\ldots,e(t-n_c)\big) + e(t) \qquad (1)$$

where $f(\cdot)$ is a nonlinear function of its arguments; $y(t-1),\ldots,y(t-n_a)$ are the past output vector; $u(t-d),\ldots,u(t-d-n_b+1)$ are the past input vector; $e(t-1),\ldots,e(t-n_c)$ are the past noise vector; $y(t)$ is the current output; $n_a$, $n_b$ and $n_c$ are the numbers of past values of the system outputs, system inputs and noise inputs respectively that define the order of the system; and $d$ is the time delay. The predictor form of (1) based on the information up to time $t-1$ can be expressed in the following compact form as [15]:

$$\hat{y}(t \mid \theta) = \hat{f}\big(\varphi(t,\theta),\, \theta\big) \qquad (2)$$

where $\varphi(t,\theta)$ is the regression (state) vector, $\theta$ is an unknown parameter vector which must be selected such that $\hat{y}(t \mid \theta)$ closely approximates $y(t)$, and $\varepsilon(t,\theta)$ is the error between (1) and (2) defined as

$$\varepsilon(t,\theta) = y(t) - \hat{y}(t \mid \theta) \qquad (3)$$

The argument $\theta$ in $\varphi(t,\theta)$ of (2) is henceforth omitted for notational convenience. Note that $\hat{y}(t \mid \theta)$ is of the same order and dimension as $y(t)$.

Now, let $\Theta$ be a set of parameter vectors which contains a set of vectors such that:

$$\Theta = \big\{\theta_1, \theta_2, \ldots, \theta_k\big\} \subset D_{\theta} \qquad (4)$$

where $D_{\theta}$ is some subset of $\mathbb{R}^{d_{\theta}}$ where the search for $\theta$ is carried out; $d_{\theta}$ is the dimension of $\theta$; $\theta^{*}$ is the desired vector which minimizes the error in (3) and is contained in the set of vectors $\Theta$; $\theta_1, \ldots, \theta_k$ are distinct values of $\theta$; and $k$ is the number of iterations required to determine $\theta^{*}$ from the vectors in $\Theta$.

Let a set of $N$ input-output data pairs obtained from prior system operation over an $N T_s$ period of time be defined:

$$Z^{N} = \big\{[u(t), y(t)]\,;\ t = 1, \ldots, N\big\} \qquad (5)$$

where $T_s$ is the sampling time of the system outputs. Then, the minimization of (3) can be stated as follows:

$$\theta^{*} = \arg\min_{\theta \in D_{\theta}} V_N(\theta, Z^{N}) \qquad (6)$$

where $V_N(\theta, Z^{N})$ is formulated as a total square error (TSE) type cost function which can be stated as:

$$V_N(\theta, Z^{N}) = \frac{1}{2N}\sum_{t=1}^{N}\varepsilon^{T}(t,\theta)\,\varepsilon(t,\theta) \qquad (7)$$

The inclusion of $Z^{N}$ as an argument in $V_N(\cdot)$ is to account for the desired model dependency on the training data. Thus, given an initial random value of $\theta$, the model structure (2) and the data set (5), the system identification problem reduces to the minimization of (6) to obtain $\theta^{*}$. For notational convenience, $\hat{y}(t)$ shall henceforth be used instead of $\hat{y}(t \mid \theta)$.
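To make the formulation concrete, the following minimal sketch (in Python, with illustrative orders $n_a = n_b = n_c = 2$ and delay $d = 1$, which are assumptions rather than the values used in the paper) shows how a NARMAX-style regression vector can be assembled and how the TSE criterion (7) is evaluated.

```python
import numpy as np

def regression_vector(y, u, eps, t, na=2, nb=2, nc=2, d=1):
    """Assemble a NARMAX regressor phi(t) from past outputs y, past inputs u
    and past prediction errors eps (all 1-D arrays).  The orders na, nb, nc
    and delay d are illustrative assumptions."""
    past_y = [y[t - i] for i in range(1, na + 1)]
    past_u = [u[t - d - i] for i in range(0, nb)]
    past_e = [eps[t - i] for i in range(1, nc + 1)]
    return np.array(past_y + past_u + past_e)

def tse_criterion(y, y_hat):
    """Total square error criterion of (7): V_N = 1/(2N) * sum of eps^2."""
    eps = y - y_hat
    return 0.5 * np.mean(eps ** 2)

# toy usage with random data
rng = np.random.default_rng(0)
y, u, eps = rng.standard_normal(50), rng.standard_normal(50), np.zeros(50)
phi = regression_vector(y, u, eps, t=10)
print(phi.shape)                      # (na + nb + nc,) = (6,)
print(tse_criterion(y, np.zeros(50)))
```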
The nonlinear function $\hat{f}(\cdot)$ is here taken as the desired model of the network, having the DFNN architecture shown in Fig. 3. The proposed NN model identification scheme based on the teacher-forcing method is illustrated in Fig. 4. Note that the "Neural Network Model" shown in Fig. 4 is actually the DFNN shown in Fig. 3 via tapped delay lines (TDL). The inputs to the NN of Fig. 4 are the past inputs and outputs and the past error estimates, which are concatenated into the regression vector $\varphi(t,\theta)$, or simply $\varphi(t)$, as shown in Fig. 3. The output of the NN model of Fig. 4 in terms of the network parameters of Fig. 3 is given as:

$$\hat{y}_i(t \mid \theta) = F_i\left(\sum_{j=1}^{n_h} W_{ij}\, f_j\left(\sum_{l=1}^{n_{\varphi}} w_{jl}\,\varphi_l(t) + w_{j0}\right) + W_{i0}\right), \qquad i = 1, \ldots, n_o \qquad (8)$$

where $n_h$ and $n_{\varphi}$ are the number of hidden neurons and the number of regressors respectively; $n_o$ is the number of outputs; $w_{jl}$ and $W_{ij}$ are the hidden and output weights respectively; $w_{j0}$ and $W_{i0}$ are the hidden and output biases; $F_i(\cdot)$ is a linear activation function for the output layer and $f_j(\cdot)$ is a hyperbolic tangent activation function for the hidden layer defined here as:

$$f_j(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \qquad (9)$$

Here, $\theta$ is a collection of all the network weights and biases in (8) in terms of the matrices $w$ and $W$. Equation (8) is here referred to as the NN NARMAX (NNARMAX) model predictor for simplicity.

Figure 3. Architecture of the dynamic feedforward NN (DFNN) model

Figure 4. NNARMAX model identification based on the teacher-forcing method
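As an illustration of the predictor (8)-(9), the sketch below implements a single forward pass of the DFNN; the weight and bias names and shapes are assumptions made for the example, not the paper's implementation.

```python
import numpy as np

def dfnn_predict(phi, w, b_h, W, b_o):
    """One-step ahead NNARMAX prediction in the spirit of (8):
    hidden layer: tanh(w @ phi + b_h), cf. (9)
    output layer: linear combination W @ h + b_o."""
    h = np.tanh(w @ phi + b_h)        # hidden activations, shape (n_h,)
    return W @ h + b_o                # predicted output(s), shape (n_o,)

# toy usage: 6 regressors, 5 hidden neurons, 1 output
rng = np.random.default_rng(1)
n_phi, n_h, n_o = 6, 5, 1
w, b_h = rng.standard_normal((n_h, n_phi)), np.zeros(n_h)
W, b_o = rng.standard_normal((n_o, n_h)), np.zeros(n_o)
print(dfnn_predict(rng.standard_normal(n_phi), w, b_h, W, b_o))
```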
The noise term $e(t)$ in (1) is unknown but is estimated here as a covariance noise matrix $\Lambda$. Using $\Lambda$, Equation (7) can be rewritten as [3], [15], [27]:

$$J_N(\theta, Z^{N}) = \frac{1}{2N}\sum_{t=1}^{N}\varepsilon^{T}(t,\theta)\,\Lambda^{-1}\,\varepsilon(t,\theta) + \frac{1}{2N}\,\theta^{T} D\,\theta \qquad (10)$$

where the term $\frac{1}{2N}\theta^{T} D\,\theta$ is a penalty norm which also removes ill-conditioning, $D = \mathrm{diag}(\alpha_1 I,\ \alpha_2 I)$ where $I$ is an identity matrix, and $\alpha_1$ and $\alpha_2$ are the weight decay parameters for the input-to-hidden and hidden-to-output layers respectively. Note that both $\Lambda$ and $D$ are adjusted simultaneously with $\theta$ during network training and are used to update $\theta$ iteratively. The algorithm for estimating the covariance noise matrix and updating $\Lambda$ is summarized in Table 3. Note that this algorithm is implemented at each sampling instant until $J_N(\theta, Z^{N})$ has reduced significantly as in step 7).
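The following sketch shows how the regularized criterion (10) can be evaluated for a given parameter vector; the noise covariance Lambda and the weight-decay entries alpha1, alpha2 are placeholders for the quantities estimated by the algorithm of Table 3.

```python
import numpy as np

def regularized_criterion(eps, theta, Lam, D):
    """Weighted TSE of (10): 1/(2N) * sum of eps_t' Lam^{-1} eps_t
    plus the penalty 1/(2N) * theta' D theta (D stored as a diagonal vector)."""
    eps = np.atleast_2d(eps)                       # N x n_o prediction errors
    N = eps.shape[0]
    Lam_inv = np.linalg.inv(np.atleast_2d(Lam))
    fit = sum(float(e @ Lam_inv @ e) for e in eps) / (2 * N)
    penalty = float(theta @ (D * theta)) / (2 * N)
    return fit + penalty

# toy usage: scalar output, separate decay on two assumed parameter groups
eps = np.random.default_rng(2).standard_normal((100, 1))
theta = np.ones(8)
alpha1, alpha2 = 1e-3, 1e-3                        # assumed weight-decay values
D = np.concatenate([alpha1 * np.ones(5), alpha2 * np.ones(3)])
print(regularized_criterion(eps, theta, np.array([[1.0]]), D))
```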
The minimization of (10) can be carried out iteratively with a Gauss-Newton type search of the form:

$$\theta^{(i+1)} = \theta^{(i)} + \mu^{(i)} p^{(i)} \qquad (11)$$

$$p^{(i)} = -\big[R(\theta^{(i)})\big]^{-1} G(\theta^{(i)}) \qquad (12)$$

where $\theta^{(i)}$ denotes the value of $\theta$ at the current iterate $i$, $p^{(i)}$ is the search direction, $\mu^{(i)}$ is the step size, and $G(\theta^{(i)})$ and $R(\theta^{(i)})$ are the Jacobian (or gradient matrix) and the Gauss-Newton Hessian matrices evaluated at $\theta^{(i)}$.

As mentioned earlier, due to the model dependency on the regression vector $\varphi(t,\theta)$, the NNARMAX model predictor depends on a posteriori error estimates fed back as shown in Fig. 4. Suppose that the derivative of the network outputs with respect to $\theta$ evaluated at $\theta^{(i)}$ is given as [15]:

$$\psi(t,\theta) = \frac{\partial \hat{y}(t \mid \theta)}{\partial \theta}\bigg|_{\theta = \theta^{(i)}} \qquad (13)$$

Expanding (13) through the feedback terms of the regression vector leads to the recursive expressions (14) and (15) of [15]. If the past noise terms in the regressor are replaced by the corresponding prediction errors, then (15) can be reduced to the filtering form (16), which depends on the prediction errors based on the predicted outputs. Equation (16) is the only component that actually impedes the implementation of the NN training algorithms, depending on its computation. Due to the feedback signals, the NNARMAX model predictor may be unstable if the system to be identified is not stable, since the roots of (16) may, in general, not lie within the unit circle. The approach proposed here to iteratively ensure that the predictor becomes stable is summarized in the algorithm of Table 4. Thus, this algorithm ensures that the roots of (16) lie within the unit circle before the weights are updated by the training algorithm proposed in the next sub-section.
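The intent of the stabilization step of Table 4 can be illustrated as follows: if the feedback coefficients entering (16) define a polynomial with roots outside the unit circle, the coefficients are shrunk until all roots move inside. This is only a sketch of the idea under that assumption; the actual test and adjustment rule of Table 4 are not reproduced here, and the shrink factor is an arbitrary illustrative choice.

```python
import numpy as np

def stabilize_feedback_poly(c, shrink=0.9, max_iter=50):
    """Shrink the coefficients of C(q^-1) = 1 + c1*q^-1 + ... + cnc*q^-nc
    until all roots of the polynomial lie strictly inside the unit circle.
    'shrink' and 'max_iter' are illustrative tuning choices."""
    c = np.asarray(c, dtype=float).copy()
    for _ in range(max_iter):
        roots = np.roots(np.concatenate(([1.0], c)))
        if np.all(np.abs(roots) < 1.0):
            return c                  # predictor filter is now stable
        c *= shrink                   # pull the roots towards the origin
    return c

print(stabilize_feedback_poly([-2.0, 1.2]))   # an initially unstable example
```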
The Levenberg-Marquardt method avoids the explicit step-size search of (11) by adding a positive parameter $\lambda^{(i)}$ to the diagonal of $R(\theta^{(i)})$ with a new iterative updating rule as follows:

$$\theta^{(i+1)} = \theta^{(i)} + p^{(i)} \qquad (17)$$

$$\big[R(\theta^{(i)}) + \lambda^{(i)} I\big]\, p^{(i)} = -G(\theta^{(i)}) \qquad (18)$$

where the gradient $G(\theta^{(i)})$ and the Gauss-Newton Hessian $R(\theta^{(i)})$ of (10) are:

$$G(\theta^{(i)}) = -\frac{1}{N}\sum_{t=1}^{N}\psi(t,\theta^{(i)})\,\Lambda^{-1}\,\varepsilon(t,\theta^{(i)}) + \frac{1}{N}\, D\,\theta^{(i)} \qquad (19)$$

$$R(\theta^{(i)}) = \frac{1}{N}\sum_{t=1}^{N}\psi(t,\theta^{(i)})\,\Lambda^{-1}\,\psi^{T}(t,\theta^{(i)}) + \frac{1}{N}\, D \qquad (20)$$

with $\psi(t,\theta^{(i)})$ the derivative of the network outputs with respect to $\theta$ evaluated at $\theta^{(i)}$ and computed according to (16). The parameter $\lambda^{(i)}$ characterizes a hybrid of search directions and has several effects [7, 29-32]: 1) for large values of $\lambda^{(i)}$, (18) becomes the steepest descent algorithm (with step size $1/\lambda^{(i)}$), which requires a descent search method; and 2) for small values of $\lambda^{(i)}$, (18) reduces to the Gauss-Newton method and $R(\theta^{(i)})$ may become a non-positive definite matrix.

Despite the fact that (10) is a weighted criterion, the convergence of the Levenberg-Marquardt algorithm (LMA) may be slow since $\theta$ contains many parameters of different magnitudes, especially if these magnitudes are large, as in most cases [8, 10, 28]. This is the major reason for not using the LMA in online training of NNs. This problem can be alleviated by adding a scaling matrix $\delta^{(i)} I$ (where $\delta^{(i)}$ is the scaling parameter and $I$ is an identity matrix) which is adjusted simultaneously with $\lambda^{(i)}$; and instead of checking $R(\theta^{(i)}) + \lambda^{(i)} I$ in (18) for positive definiteness, the check is expressed as

$$R(\theta^{(i)}) + \lambda^{(i)}\,\delta^{(i)}\, I > 0 \qquad (21)$$

once $\delta^{(i)}$ is chosen. Different from other methods [28, 32-34], the method proposed here uses the Cholesky factorization algorithm and then iteratively selects $\lambda^{(i)}$ to guarantee positive definiteness of (21) for online application. First, (21) is computed and the check is performed. If (21) is positive definite, the algorithm is terminated; otherwise $\lambda^{(i)}$ is increased iteratively until this is achieved. The method is summarized in Table 5. The key parameter in the algorithm is $\lambda^{(i)}$. Next, the Cholesky factor $L$ given by (T.2) in Table 5 is reused to compute the search direction from (18) in a two-stage forward and backward substitution given respectively as:

$$L\, z = -G(\theta^{(i)}) \qquad (22)$$

$$L^{T} p^{(i)} = z \qquad (23)$$
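A sketch of the positive-definiteness check (21) and the two-stage substitutions (22)-(23) is given below; increasing lambda by a factor of 10 on each failed Cholesky attempt is an assumed schedule, used here only to mirror the spirit of Table 5, not its exact rule.

```python
import numpy as np

def lm_search_direction(R, G, lam, delta=1.0, max_tries=20):
    """Solve (R + lam*delta*I) p = -G via Cholesky factorization.
    lam is increased until the matrix of (21) is positive definite, then p is
    obtained by forward/backward substitution as in (22)-(23)."""
    n = R.shape[0]
    for _ in range(max_tries):
        try:
            L = np.linalg.cholesky(R + lam * delta * np.eye(n))  # (21) holds
            break
        except np.linalg.LinAlgError:
            lam *= 10.0                       # assumed increase factor
    else:
        raise RuntimeError("could not make (21) positive definite")
    z = np.linalg.solve(L, -G)                # forward substitution, cf. (22)
    p = np.linalg.solve(L.T, z)               # backward substitution, cf. (23)
    return p, lam

# toy usage with an indefinite Gauss-Newton Hessian
R = np.array([[1.0, 2.0], [2.0, 1.0]])
G = np.array([0.5, -0.3])
print(lm_search_direction(R, G, lam=1e-3))
```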
However, the Gauss-Newton search may fail when the current iterate $\theta^{(i)}$ is too far from the optimum value $\theta^{*}$. Thus, the LMA is sometimes combined with the trust region method [35] so that the search for $\theta^{(i+1)}$ is constrained around a trusted region of radius $\delta^{(i)}$. The problem can be defined as [3, 15]:

$$p^{(i)} = \arg\min_{p}\ \bar{J}^{(i)}(p) \qquad (24)$$

$$\text{subject to} \quad \lVert p \rVert \le \delta^{(i)} \qquad (25)$$

where $\bar{J}^{(i)}(p)$ is the second-order Gauss-Newton approximation of (10) which can be expressed as:

$$\bar{J}^{(i)}(p) = J_N(\theta^{(i)}, Z^{N}) + p^{T} G(\theta^{(i)}) + \tfrac{1}{2}\, p^{T} R(\theta^{(i)})\, p \qquad (26)$$

Thus, with this combined method and using the result from (23), Equation (17) can be rewritten as

$$\theta^{(i+1)} = \theta^{(i)} + p^{(i)}, \quad p^{(i)} \ \text{obtained from (22)-(23)} \qquad (27)$$

The difficulty of selecting $\lambda^{(i)}$ and $\delta^{(i)}$ has led to the coding of several algorithms [7, 28-34]. Instead of adjusting $\delta^{(i)}$ directly, this paper develops on the indirect approach proposed in [35] but reuses the Cholesky factor computed in Table 5 to update the weighted criterion (10). Here, $\lambda^{(i)}$ is adjusted according to the ratio $r^{(i)}$ between the actual reduction of (10) and the theoretical predicted decrease of (10) using (26). The ratio can be defined as:

$$r^{(i)} = \frac{J_N(\theta^{(i)}) - J_N(\theta^{(i)} + p^{(i)})}{J_N(\theta^{(i)}) - \bar{J}^{(i)}(p^{(i)})} \qquad (28)$$

where $Z^{N}$ is omitted from the arguments of $J_N(\cdot)$ in (10) for convenience and $\bar{J}^{(i)}(p^{(i)})$ is the Gauss-Newton estimate of (10) using (26). The complete modified Levenberg-Marquardt algorithm (MLMA) for updating $\theta$ is summarized in Table 6. Note that after $\theta^{(i+1)}$ is obtained using the algorithm of Table 6, the algorithm of Table 3 is implemented until the conditions set out in Step 7) of that algorithm are satisfied.
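The ratio test (28) can be sketched as an accept/reject step on the candidate parameter update; the acceptance thresholds of 0.25 and 0.75 are common textbook choices and are assumptions here, not values taken from Table 6.

```python
import numpy as np

def mlma_step(theta, p, J, G, R, lam):
    """One accept/reject decision in the spirit of the MLMA outer loop.
    J(theta): regularized criterion (10); G, R: gradient and Gauss-Newton
    Hessian at theta; p: search direction from (22)-(23)."""
    actual = J(theta) - J(theta + p)                      # numerator of (28)
    predicted = -(p @ G + 0.5 * p @ R @ p)                # decrease predicted by (26)
    r = actual / predicted if predicted > 0 else -np.inf  # ratio (28)
    if r > 0.75:            # model (26) is trustworthy: relax the damping
        lam *= 0.5
    elif r < 0.25:          # poor agreement: increase the damping
        lam *= 2.0
    theta_new = theta + p if actual > 0 else theta        # accept only if (10) decreases
    return theta_new, lam, r

# toy usage on a quadratic criterion J(x) = 0.5 * ||x||^2
J = lambda x: 0.5 * float(x @ x)
theta = np.array([1.0, -2.0]); G = theta.copy(); R = np.eye(2)
p = -np.linalg.solve(R + 0.1 * np.eye(2), G)
print(mlma_step(theta, p, J, G, R, lam=0.1))
```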
The trained networks are validated in three ways. The first is the one-step ahead prediction of the scaled training data and of the unscaled validation (test) data. The second is Akaike's final prediction error (AFPE) estimate of the regularized criterion: applying (10) to the prediction errors (3) of the trained network with $\hat{\theta}$ and taking the expectation with respect to the training data and the noise leads to the AFPE estimate given by (29) in [3], [15], [27], in which $\mathrm{tr}(\cdot)$ is the trace of its argument, computed as the sum of the diagonal elements of its argument, and $\gamma$ is a positive quantity that improves the accuracy of the estimate and can be computed according to the expression given in [3], [15].

The third method is the K-step ahead predictions [10], where the outputs of the trained network are compared to the unscaled output training data. The K-step ahead predictor follows directly from (8): for a prediction horizon $K$ and time instants $t \ge K$, the measured past outputs in the regressor are replaced by the corresponding predicted outputs whenever they fall within the horizon, so that it takes the following form:

$$\hat{y}(t \mid t-K) = \hat{f}\big(\hat{y}(t-1 \mid t-K), \ldots, \hat{y}(t-n_a \mid t-K),\ u(t-d), \ldots, u(t-d-n_b+1);\ \hat{\theta}\big) \qquad (30)$$

with measured outputs used for any argument dated at or before $t-K$. The mean value of the K-step ahead prediction error (MVPE) between the predicted output and the actual training data set is computed as follows:

$$\mathrm{MVPE} = \frac{1}{N}\sum_{t}\big[\,y(t) - \hat{y}(t \mid t-K)\,\big] \qquad (31)$$

where $y(t)$ corresponds to the unscaled output training data and $\hat{y}(t \mid t-K)$ is the K-step ahead predictor output.
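A sketch of a K-step ahead predictor built on a one-step model, together with the mean prediction error of (31), is shown below. It assumes a NARX-style regressor (past outputs and inputs only, orders n_a = n_b = 2, delay 1) purely for illustration; the paper's NNARMAX regressor also carries the past error terms.

```python
import numpy as np

def k_step_predict(model, y, u, K, na=2, nb=2, d=1):
    """K-step ahead predictions y_hat(t | t-K), cf. (30): measured outputs are
    used up to time t-K; beyond that the model's own predictions are fed back
    into the regressor."""
    N = len(y)
    y_hat = np.full(N, np.nan)
    start = max(na, d + nb - 1) + K - 1
    for t in range(start, N):
        buf = y.astype(float).copy()          # working copy of the outputs
        for j in range(t - K + 1, t + 1):     # roll the model forward K steps
            phi = np.concatenate((buf[j - na:j][::-1],
                                  u[j - d - nb + 1:j - d + 1][::-1]))
            buf[j] = model(phi)
        y_hat[t] = buf[t]
    return y_hat

def mvpe(y, y_hat):
    """Mean value of the K-step ahead prediction error, cf. (31)."""
    mask = ~np.isnan(y_hat)
    return float(np.mean(y[mask] - y_hat[mask]))

# toy usage with an exact linear "model" standing in for the trained network
rng = np.random.default_rng(3)
u = rng.standard_normal(200)
y = np.zeros(200)
for t in range(2, 200):
    y[t] = 0.6 * y[t - 1] - 0.1 * y[t - 2] + 0.5 * u[t - 1]
model = lambda phi: 0.6 * phi[0] - 0.1 * phi[1] + 0.5 * phi[2]
y5 = k_step_predict(model, y, u, K=5)
print(mvpe(y, y5))   # ~0 because the model matches the data-generating system
```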
The EBPM algorithm is the standard error back-propagation (generalized delta rule) augmented with a momentum term. Equations (32)-(38) set out the quadratic error measure and the gradient (delta-rule) relations of the standard back-propagation method for the NNARMAX model predictor discussed in Section 3. The weight change is made proportional to the error signal $\delta_k$ on the unit $k$ receiving the input and to the output $y_j$ of the unit $j$ sending the signal along the connection as follows:

$$\Delta w_{jk} = \gamma\, \delta_k\, y_j \qquad (39)$$

with the error signal $\delta_k$ defined by (40). With the hyperbolic tangent activation function $f_j(\cdot)$ defined in (9), the output $y_k$ of a unit and its derivative take the forms given in (41)-(44), so that the error signal of a hidden unit can be expressed as:

$$\delta_h = f'(s_h) \sum_{o} \delta_o\, w_{ho} \qquad (45)$$

The learning rate $\gamma$ in (39) and (45) is chosen as large as possible without leading to oscillation. To avoid oscillation at large $\gamma$, the change in weight is made dependent on the past weight change by adding a momentum term as follows:

$$\Delta w_{jk}(t+1) = \gamma\, \delta_k\, y_j + \alpha\, \Delta w_{jk}(t) \qquad (46)$$

where $t$ indexes the presentation number and $\alpha$ is a constant which determines the effect of the previous weight change. When no momentum term is used, it can take a long time before the minimum is reached with a low learning rate, whereas for high learning rates the minimum is never reached because of the oscillations. When a momentum term is added, the minimum is reached faster [38-40]. Before training, the input-output data pairs are scaled to zero mean and unit standard deviation according to:

$$u_s(t) = \frac{u(t) - \bar{u}}{\sigma_u}, \qquad y_s(t) = \frac{y(t) - \bar{y}}{\sigma_y} \qquad (47)$$
where $\bar{u}$, $\bar{y}$ and $\sigma_u$, $\sigma_y$ are the means and standard deviations of the input and output training data pair, and $u_s(t)$ and $y_s(t)$ are the scaled inputs and outputs respectively. Also, after the network training, the joint weights are rescaled according to the expression in (48) so that the trained network can be applied directly to unscaled data, and the rescaled weights shall be used in the discussion of results.
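The scaling of (47), and the corresponding mapping of scaled predictions back to engineering units, can be sketched as follows; this illustrates only the data side of the procedure and does not reproduce the weight-rescaling expression (48).

```python
import numpy as np

def scale(x):
    """Scale a data series to zero mean and unit standard deviation, cf. (47)."""
    mu, sigma = x.mean(), x.std()
    return (x - mu) / sigma, mu, sigma

def unscale(x_s, mu, sigma):
    """Map scaled values (e.g. network predictions) back to original units."""
    return x_s * sigma + mu

u = np.array([2.0, 4.0, 6.0, 8.0])          # e.g. input voltages in volts
u_s, mu_u, sig_u = scale(u)
print(u_s, unscale(u_s, mu_u, sig_u))
```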
The network inputs are arranged as defined by (37). The input components that are not known in advance, that is the initial error estimates $\varepsilon(t,\theta)$ given by (3), are initialized to a small positive random matrix of appropriate dimension. The outputs of the NN are the predicted values of $y(t)$ given by (8).

For assessing the convergence performance, the network was trained for $\tau$ = 20 epochs (number of iterations) with selected values of the following parameters, whose details are discussed in Section 3: the number of inputs and outputs of the system; the orders $n_a$, $n_b$ and $n_c$ of the regressors in terms of their past values (NNARMAX); the total number of regressors $n_{\varphi}$ (that is, the total number of inputs to the neural network); the numbers of hidden and output layer neurons $n_h$ and $n_o$; and the hidden and output layer weight decay terms $\alpha_1$ and $\alpha_2$. The two design parameters $\lambda^{(0)}$ and $\delta^{(0)}$ were selected to initialize the MLMA algorithm. The maximum number of times the algorithm of Table 3 is implemented is 6 in all the simulations. For the EBPM, the two design parameters are the learning rate $\gamma$ and the momentum constant $\alpha$. The 381 training data points are first scaled using equation (47), and the network is trained for $\tau$ = 20 epochs using the proposed MLMA and EBPM algorithms presented in Sections 3.3 and 4.2.3 respectively.
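The overall training set-up described above can be summarized by the driver sketch below. The structural orders, layer sizes and the plain gradient-descent update are placeholders standing in for the paper's parameter choices and for the MLMA (Table 6) and EBPM (Section 4.2.3) algorithms; only the general flow (build regressors, scale the 381-sample data, train for 20 epochs, monitor the TSE) reflects the text.

```python
import numpy as np

# Assumed structural and training parameters (placeholders, not the paper's values)
na, nb, nc, d = 2, 2, 2, 1        # regressor orders and time delay
n_h, epochs, lr = 5, 20, 0.05     # hidden neurons, epochs (tau = 20), step size

def build_regressors(y, u, eps):
    """Stack the NNARMAX regression vectors phi(t) and the targets y(t)."""
    start = max(na, d + nb - 1, nc)
    Phi, T = [], []
    for t in range(start, len(y)):
        Phi.append(np.concatenate((y[t - na:t][::-1],
                                   u[t - d - nb + 1:t - d + 1][::-1],
                                   eps[t - nc:t][::-1])))
        T.append(y[t])
    return np.array(Phi), np.array(T)

# toy data standing in for the 381 scaled EMS training samples
rng = np.random.default_rng(4)
u = rng.standard_normal(381)
y = np.zeros(381)
for t in range(1, 381):
    y[t] = 0.7 * y[t - 1] + 0.3 * np.tanh(u[t - 1])
eps = 1e-3 * rng.standard_normal(381)     # initial error estimates, cf. (3)

Phi, T = build_regressors(y, u, eps)
w = 0.1 * rng.standard_normal((n_h, Phi.shape[1])); b_h = np.zeros(n_h)
W = 0.1 * rng.standard_normal(n_h); b_o = 0.0

# plain batch gradient descent, used here only as a stand-in for the
# MLMA (Table 6) and EBPM (Section 4.2.3) training algorithms
for epoch in range(epochs):
    H = np.tanh(Phi @ w.T + b_h)          # hidden activations, cf. (8)-(9)
    y_hat = H @ W + b_o
    e = y_hat - T
    gW = H.T @ e / len(T); gb_o = e.mean()
    gH = np.outer(e, W) * (1.0 - H ** 2)
    gw = gH.T @ Phi / len(T); gb_h = gH.mean(axis=0)
    W -= lr * gW; b_o -= lr * gb_o; w -= lr * gw; b_h -= lr * gb_h
    print(f"epoch {epoch + 1:2d}  TSE = {0.5 * np.mean(e ** 2):.5f}")
```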
Figure 5. One-step ahead output prediction of scaled training data

Figure 6. One-step ahead output prediction of unscaled validation (test) data

Figure 7. Five-step ahead output prediction of unscaled training data
Disturbances are variables that fluctuate and cause the process outputs to move away from the desired operating values (set-points or desired trajectories). The prescribed desired speed trajectory specified for the EMS is 60 rpm, which must be maintained irrespective of the applied weight(s). A disturbance could be a change in flow, in the temperature of the surroundings, in pressure, etc. Disturbance variables can normally be further classified as measured or unmeasured signals. The different weights (in kg) applied in this research serve as disturbances introduced randomly to the EMS, and they range from 0.5 kg to 35 kg.

Figure 8. The discrete-time PID control scheme
The discrete-time PID control law applied to the EMS is:

$$u(k) = K_p\, e(k) + K_i\, T_s \sum_{j=1}^{k} e(j) + \frac{K_d}{T_s}\big[e(k) - e(k-1)\big] \qquad (49)$$

where $K_p$, $K_i$ and $K_d$ are the proportional, integral and derivative gains respectively, $T_s$ is the sampling time, $e(k)$ is the error between the desired reference $r(k)$ and the predicted output $\hat{y}(k)$, and $N$ is the number of samples. The minimum and maximum constraints imposed on the PID controller to penalize changes on the EMS control inputs $u(k)$ and outputs $y(k)$ are given as:

$$u_{\min} \le u(k) \le u_{\max}, \qquad y_{\min} \le y(k) \le y_{\max} \qquad (50)$$

with specified minimum and maximum values for the control input and for the output Si_out (in rpm). The constraints imposed on the EMS defined in (50) are summarized in Table 8 together with the initial control inputs and outputs.
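A sketch of a constrained discrete-time PID controller in the spirit of (49)-(50) is given below; the positional form with a clamped control signal and a frozen integral term under saturation is one common realization. The 60 rpm set-point is taken from the text, while the gains, limits and first-order plant model in the usage example are illustrative assumptions rather than the values of Table 8.

```python
class DiscretePID:
    """Discrete-time PID controller, cf. (49), with control limits as in (50)."""

    def __init__(self, kp, ki, kd, ts, u_min, u_max):
        self.kp, self.ki, self.kd, self.ts = kp, ki, kd, ts
        self.u_min, self.u_max = u_min, u_max
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, reference, measurement):
        err = reference - measurement
        self.integral += err * self.ts
        deriv = (err - self.prev_err) / self.ts
        self.prev_err = err
        u = self.kp * err + self.ki * self.integral + self.kd * deriv
        # clamp the control input to the constraints of (50); undo this step's
        # integral accumulation when saturated to limit wind-up
        if u > self.u_max:
            u, self.integral = self.u_max, self.integral - err * self.ts
        elif u < self.u_min:
            u, self.integral = self.u_min, self.integral - err * self.ts
        return u

# toy usage: drive an assumed first-order speed model towards the 60 rpm set-point
pid = DiscretePID(kp=0.5, ki=0.8, kd=0.01, ts=0.1, u_min=0.0, u_max=12.0)
speed = 0.0
for k in range(100):
    u = pid.step(60.0, speed)             # 60 rpm set-point (from the paper)
    speed += 0.1 * (6.0 * u - speed)      # assumed plant: gain of 6 rpm per volt
print(round(speed, 2))                    # settles near the 60 rpm set-point
```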
Figure 9. Closed-loop PID control performance of the EMS using NN model trained with EBPM and MLMA algorithms: (a) output speed predictions and (b) control signals