American Journal of Computational and Applied Mathematics

p-ISSN: 2165-8935    e-ISSN: 2165-8943

2019;  9(2): 26-42

doi:10.5923/j.ajcam.20190902.02

 

Estimation of Noisy Logistic Dynamical Map by Using Neural Network with FFT Transfer Function

Salah H. Abid, Saad S. Mahmood, Yaseen A. Oraibi

Department of Mathematics, College of Education, AL-Mustansiriyah University, Baghdad, Iraq

Correspondence to: Salah H. Abid, Department of Mathematics, College of Education, AL-Mustansiriyah University, Baghdad, Iraq.

Email:

Copyright © 2019 The Author(s). Published by Scientific & Academic Publishing.

This work is licensed under the Creative Commons Attribution International License (CC BY).
http://creativecommons.org/licenses/by/4.0/

Abstract

The aim of this paper is to design a feed-forward artificial neural network (ANN) to estimate the one-dimensional noisy Logistic dynamical map by selecting an appropriate network, transfer function and node weights. The proposed network is used together with the Fast Fourier Transform (FFT) as transfer function. For different cases of the system (noisy deterministic, noisy chaotic and noisy highly chaotic), the experimental results of the proposed algorithm are compared empirically, by means of the mean square error (MSE), with the results of the same network using the traditional transfer functions Logsig and Tansig. The proposed algorithm outperforms the others in all cases in terms of both speed and accuracy.

Keywords: FFT, Logsig, Tansig, Feed-forward neural network, Transfer function, Noisy Logistic map, Uniform noise, Normal noise, Logistic noise

Cite this paper: Salah H. Abid, Saad S. Mahmood, Yaseen A. Oraibi, Estimation of Noisy Logistic Dynamical Map by Using Neural Network with FFT Transfer Function, American Journal of Computational and Applied Mathematics, Vol. 9 No. 2, 2019, pp. 26-42. doi: 10.5923/j.ajcam.20190902.02.

1. Introduction

An ANN is a simplified mathematical model of the human brain. It can be implemented by both electronic elements and computer software. It is a parallel distributed processor with a large number of connections; it is an information-processing system that has certain performance characteristics in common with biological neural networks. ANNs have been developed as generalizations of mathematical models of human cognition or neural biology, based on the following assumptions:
1- Information processing occurs at many simple elements called neurons that are fundamental to the operation of ANNs.
2- Signals are passed between neurons over connection links.
3- Each connection link has an associated weight which, in a typical neural net, multiplies the signal transmitted.
4- Each neuron applies an activation function (usually nonlinear) to its net input (the sum of weighted input signals) to determine its output signal [15].
The units in a network are organized into a given topology by a set of connections, or weights.
An ANN is characterized by [30]:
1- Architecture: its pattern of connections between the neurons.
2- Training Algorithm: its method of determining the weights on the connections.
3- Activation function.
ANNs are often classified as single-layer or multilayer. In determining the number of layers, the input units are not counted as a layer, because they perform no computation. Equivalently, the number of layers in the net can be defined as the number of layers of weighted interconnection links between the slabs of neurons [46].

1.1. Multilayer Feed Forward Architecture [22]

In a layered neural network the neurons are organized in the form of layers. We have at least two layers: an input and an output layer. The layers between the input and the output layer (if any) are called hidden layers, whose computation nodes are correspondingly called hidden neurons or hidden units. Extra hidden neurons raise the network’s ability to extract higher-order statistics from (input) data.
The ANN is said to be fully connected in the sense that every node in each layer of the network is connected to every node in the adjacent forward layer; otherwise the network is called partially connected. Each layer consists of a certain number of neurons; each neuron is connected to the neurons of the previous layer through adaptable synaptic weights w and biases b.

1.2. Literature Review

Pan and Duraisamy in 2018 [33] studied the use of feedforward neural networks (FNN) to develop models of non-linear dynamical systems from data. Emphasis is placed on predictions at long times, with limited data availability. Inspired by global stability analysis, and the observation of strong correlation between the local error and the maximal singular value of the Jacobian of the ANN, they introduced Jacobian regularization in the loss function. This regularization suppresses the sensitivity of the prediction to the local error and is shown to improve accuracy and robustness. Comparison between the proposed approach and sparse polynomial regression is presented in numerical examples ranging from simple ODE systems to nonlinear PDE systems including vortex shedding behind a cylinder, and instability-driven buoyant mixing flow. Furthermore, limitations of feedforward neural networks are highlighted, especially when the training data does not include a low dimensional attractor.
The need to model dynamical behavior from data is pervasive across science and engineering. Applications are found in diverse fields such as in control systems [43], time series modeling [41], and describing the evolution of coherent structures [12]. While data-driven modeling of dynamical systems can be broadly classified as a special case of system identification [23], it is important to note certain distinguishing qualities: the learning process may be performed off-line, physical systems may involve very high dimensions, and the goal may involve the prediction of long-time behavior from limited training data. Artificial neural networks (ANN) have attracted considerable attention in recent years in domains such as image recognition in computer vision [17, 37] and in control applications [12]. The success of ANNs arises from their ability to effectively learn low-dimensional representations from complex data and in building relationships between features and outputs. Neural networks with a single hidden layer and nonlinear activation function are guaranteed to be able to predict any Borel measurable function to any degree of accuracy on a compact domain [16]. The idea of leveraging neural networks to model dynamical systems has been explored since the 1990s. ANNs are prevalent in the system identification and time series modeling community [19, 28, 29, 35], where the mapping between inputs and outputs is of prime interest. Billings et al. [5] explored connections between neural networks and the nonlinear autoregressive moving average model (NARMAX) with exogenous inputs. It was shown that neural networks with one hidden layer and sigmoid activation function represent an infinite series consisting of polynomials of the input and state units. Elanayar and Shin [13] proposed the approximation of nonlinear stochastic dynamical systems using radial basis feedforward neural networks. Early work using neural networks to forecast multivariate time series of commodity prices [9] demonstrated their ability to model stochastic systems without knowledge of the underlying governing equations. Tsung and Cottrell [44] proposed learning the dynamics in phase space using a feedforward neural network with time-delayed coordinates. Paez and Urbina [31, 32, 45] modeled a nonlinear hardening oscillator using a neural network-based model combined with dimension reduction using canonical variate analysis (CVA).
Smaoui [40, 41, 42] pioneered the use of neural networks to predict fluid dynamic systems such as the unstable manifold model for bursting behavior in the 2-D Navier-Stokes and the Kuramoto-Sivashinsky equations. The dimensionality of the original PDE system is reduced by considering a small number of proper orthogonal decomposition (POD) coefficients [4]. Interestingly, similar ideas of using principal component analysis for dimension reduction can be traced back to work in cognitive science by Elman [14]. Elman also showed that knowledge of the intrinsic dimensions of the system can be very helpful in determining the structure of the neural network. However, in the majority of the results [40, 41, 42], the neural network model is only evaluated a few time steps from the training set, which might not be a stringent performance test if longer time predictions are of interest. ANNs have also been applied to chaotic nonlinear systems that are challenging from a data-driven modeling perspective, especially if long time predictions are desired. Instead of minimizing the pointwise prediction error, Bakker et al. [3] satisfied the Diks' criterion in learning the chaotic attractor. Later, Lin et al. [20] demonstrated that even the simplest feedforward neural network for nonlinear chaotic hydrodynamics can show consistency in the time-averaged characteristics, power spectra, and Lyapunov exponent between the measurements and the model.
A major difficulty in modeling dynamical systems is the issue of memory. It is known that even for a Markovian system, the corresponding reduced-dimensional system could be non-Markovian [10, 34]. In general, there are two main ways of introducing memory effects in neural networks. First, a simple workaround for feedforward neural networks (FNN) is to introduce time delayed states in the inputs [11]. However, the drawback is that this could potentially lead to an unnecessarily large number of parameters [18]. To mitigate this, Bakker [3] considered following Broomhead and King [6] in reducing the dimension of the delay vector using weighted principal component analysis (PCA). The second approach uses output or hidden units as additional feedback. As an example, Elman’s network [18] is a recurrent neural network (RNN) that incorporates memory in a dynamic fashion. Miyoshi et al. [25] demonstrated that recurrent RBF networks have the ability to reconstruct simple chaotic dynamics. Sato and Nagaya [38] showed that evolutionary algorithms can be used to train recurrent neural networks to capture the Lorenz system. Bailer-Jones et al. [2] used a standard RNN to predict the time derivative in discrete or continuous form for simple dynamical systems; this can be considered an RNN extension to Tsung’s phase space learning [44]. Wang et al. [47] proposed a framework combining POD for dimension reduction and long short-term memory (LSTM) recurrent neural networks and applied it to a fluid dynamic system.

1.3. Fast Fourier Transform

The techniques that we now call the fast Fourier transform (FFT) were first used by Gauss in 1805 for calculating the coefficients in a trigonometric expansion of an asteroid's orbit [8]. The fast Fourier transform is an algorithm that computes the discrete Fourier transform more quickly; its speed is due to the fact that it avoids redundant calculations. The algorithm was rediscovered by James W. Cooley and John W. Tukey, who published it in 1965 [11].
As known today, a periodic function f(x) can be expressed as the trigonometric (Fourier) series

f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left( a_n \cos nx + b_n \sin nx \right)    (1)

The coefficients can be determined as follows:

a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,dx    (2)

a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx \,dx    (3)

b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx \,dx    (4)
The discrete Fourier transform (DFT) is one of the most powerful tools in digital signal processing. The DFT enables us to conveniently analyze and design systems in the frequency domain [1], and is given by

X_k = \sum_{n=0}^{N-1} x_n \, e^{-i 2\pi k n / N}, \qquad k = 0, 1, \ldots, N-1    (5)
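As a minimal illustration of equation (5) (not part of the authors' implementation), the following Python sketch compares a direct O(N^2) evaluation of the DFT with NumPy's FFT; the signal and its length are arbitrary choices.

import numpy as np

# Compare a direct O(N^2) evaluation of the DFT in equation (5)
# with NumPy's FFT, which computes the same quantity in O(N log N).
N = 8
x = np.random.default_rng(0).random(N)     # arbitrary real signal

n = np.arange(N)
k = n.reshape(-1, 1)
X_direct = (x * np.exp(-2j * np.pi * k * n / N)).sum(axis=1)   # direct DFT
X_fft = np.fft.fft(x)                                          # fast Fourier transform

print(np.allclose(X_direct, X_fft))        # True: both give the same DFT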

2. Logistic Map

In 1845, Pierre Verhulst proposed the logistic map, which is a simple non-linear dynamical map. The logistic map is one of the most popular and simplest chaotic maps. It became very popular after it was exploited in 1979 by the biologist Robert M. May [24, 21]. The logistic map is a polynomial mapping whose complex, chaotic behaviour can arise from a very simple non-linear dynamical equation. The logistic map equation is written as [26]:

x_{n+1} = r \, x_n (1 - x_n)    (6)

where x_n is a number between zero and one, x_0 represents the initial population, and r is a positive number between zero and four.
The logistic map is one of the chaotic maps; it is highly sensitive to changes in its parameter value, and different values of the parameter r produce different behaviour. Its transformation function is g(x) = r x (1 - x), as defined in equation (6) above. The behaviour of the logistic map, from regular motion up to the onset of chaos (a seemingly random jumble of dots), depends mainly on the values of the two quantities (r, x_0); by changing one or both values we can observe different behaviours of the logistic map. The population of a logistic map will die out if the value of r is between 0 and 1, and the population will quickly stabilize at the value (r - 1)/r if the value of r is between 1 and 3. The population will then oscillate between two values if the parameter r is between 3 and 3.45. After that, for values of r between 3.45 and 4 the periodic fluctuation becomes significantly more complicated. Finally, most values of r above 3.57 show chaotic behaviour.
In the logistic map the function's result depends on the value of the parameter r, and different values of r give quite different pictures. In the period-2 regime we have g(x_1) = x_2 and g(x_2) = x_1, which means g(g(x_1)) = x_1 and g(g(x_2)) = x_2. According to Alligood et al. (1996), the periodic oscillation between x_1 and x_2 is stable and attracts orbits (trajectories). Therefore, there is a minimum number of iterations of the orbit before a point repeats. There are obvious differences between the behaviour of the exponential model and that of the logistic model [24, 21].
Figure 1. Bifurcation diagram of the Logistic map [26]
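To make the regimes described above concrete, the following short Python sketch iterates equation (6) for a few illustrative values of r; the specific r values and the starting point are our own choices, used only for illustration.

import numpy as np

def logistic_orbit(r, x0, n_iter=100):
    """Iterate the logistic map x_{n+1} = r * x_n * (1 - x_n)."""
    x = np.empty(n_iter + 1)
    x[0] = x0
    for n in range(n_iter):
        x[n + 1] = r * x[n] * (1.0 - x[n])
    return x

# Illustrative parameter values: extinction, fixed point (r-1)/r, period-2, chaos
for r in (0.8, 2.5, 3.2, 3.9):
    orbit = logistic_orbit(r, x0=0.2)
    print(f"r = {r}: last values {orbit[-3:]}")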

2.1. Noisy Logistic map (NG) Solution

In this section we explain how this approach can be used to find the approximate solution of the noisy version of the logistic map stated in equation (6).
Let NG(x) denote the solution to be computed, and let y_t(x, p) denote a trial solution with adjustable parameters p.
In the proposed approach, the trial solution y_t employs an FFNN whose parameters p correspond to the weights and biases of the neural architecture. We choose the trial function to be y_t(x, p) = N(x, p), where N(x, p) is a single-output FFNN with parameters (weights) p and n input units fed with the input vector x.

2.2. Computation of the Gradient

The error corresponding to each input vector x_i is the value E(x_i), which has to be forced toward zero. Computing this error value involves not only the FFNN output but also the derivatives of the output with respect to its inputs. Therefore, to compute the gradient of the error with respect to the network weights, consider a multilayer FFNN with n input units (where n is the dimension of the domain), two hidden layers with H and q sigmoid units respectively, and a linear output unit.
For a given input vector x = (x_1, x_2, \ldots, x_n), the output of the FFNN is

N(x, p) = \sum_{k=1}^{q} v_k \, \sigma(z_k), \qquad z_k = \sum_{i=1}^{H} s_{ki} \, \sigma(n_i) + b_k,

and

n_i = \sum_{j=1}^{n} w_{ij} x_j + b_i,

where
w_{ij} denotes the weight connecting input unit j to hidden unit i,
s_{ki} denotes the weight connecting hidden unit i to hidden unit k,
v_k denotes the weight connecting hidden unit k to the output unit,
b_i denotes the bias of hidden unit i,
b_k denotes the bias of hidden unit k in the second hidden layer, and
σ is the transfer function.
The gradient of the suggested FFNN output with respect to its coefficients can be computed as:

\frac{\partial N}{\partial v_k} = \sigma(z_k)    (7)

\frac{\partial N}{\partial b_k} = v_k \, \sigma'(z_k)    (8)

\frac{\partial N}{\partial s_{ki}} = v_k \, \sigma'(z_k) \, \sigma(n_i)    (9)

\frac{\partial N}{\partial b_i} = \sigma'(n_i) \sum_{k=1}^{q} v_k \, \sigma'(z_k) \, s_{ki}    (10)

\frac{\partial N}{\partial w_{ij}} = x_j \, \sigma'(n_i) \sum_{k=1}^{q} v_k \, \sigma'(z_k) \, s_{ki}    (11)
With these expressions, the derivative of the error performance with respect to the FFNN coefficients can be defined, and the minimization is then straightforward; a compact implementation sketch follows.
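The following Python sketch implements the forward pass and the gradients (7)-(11) for the two-hidden-layer network defined above. The logistic sigmoid is used as σ, and the array shapes and names are illustrative assumptions, not the authors' implementation.

import numpy as np

sigma  = lambda z: 1.0 / (1.0 + np.exp(-z))     # transfer function
dsigma = lambda z: sigma(z) * (1.0 - sigma(z))  # its derivative

def forward_and_grad(x, w, b, s, bk, v):
    """Forward pass of the two-hidden-layer FFNN and its gradient with
    respect to the weights, following equations (7)-(11).
    Assumed shapes: w (H, n), b (H,), s (q, H), bk (q,), v (q,)."""
    n_i = w @ x + b              # net input of the first hidden layer
    a_i = sigma(n_i)
    z_k = s @ a_i + bk           # net input of the second hidden layer
    a_k = sigma(z_k)
    N = v @ a_k                  # linear output unit

    dN_dv  = a_k                                     # (7)
    dN_dbk = v * dsigma(z_k)                         # (8)
    dN_ds  = np.outer(dN_dbk, a_i)                   # (9)
    dN_db  = (dN_dbk @ s) * dsigma(n_i)              # (10)
    dN_dw  = np.outer(dN_db, x)                      # (11)
    return N, (dN_dv, dN_dbk, dN_ds, dN_db, dN_dw)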

3. Suggested Networks

It is well known that a multilayer FFNN with one hidden layer can approximate any function to any accuracy [27], but dynamical maps have more complicated behaviour than other functions; thus, we suggest an FFNN containing two hidden layers, one input layer and one output layer to estimate a solution for dynamical maps.
The suggested network divides the inputs into two parts: 60% for training and 40% for testing. The error quantity to be minimized is given by

E(p) = \frac{1}{m}\sum_{i=1}^{m} \left[ N(x_i, p) - NG(x_i) \right]^2    (12)

where x_i ∈ [0, 1] and m is the number of training points. It is easy to evaluate the gradient of the performance with respect to the coefficients using (7)-(11). The training algorithm of the FFNN with supervised training and the BFGS algorithm is given as follows (a compact sketch of this setup is given after the step list below). Assume that there is one node in the first layer (input layer), the second layer (hidden layer) consists of ten nodes, the third layer (hidden layer) consists of five nodes, and the fourth layer (output layer) consists of one node.
The steps of the technique, stated as an algorithm, are as follows.
Step 0: input and target.
Insert the input (x: x_1, x_2, x_3, …, x_n) and the target.
Step 1: allocating inputs.
Each input goes to each neuron in the first hidden layer.
Step 2: initialize weights.
Initialize the weights and biases from the uniform distribution for all connections in the neural network.
Step 3: select the following.
- the number of epochs
- the goal (target error)
- the performance function (MSE)
Step 4: calculations in each node in the first hidden layer.
In each node of the first hidden layer, compute the sum of the products of the weights and inputs and add the bias.
Step 5: compute the output of each node for the first hidden layer.
Apply the activation function to the sum from Step 4; the output is sent to the second hidden layer as input.
Step 6: calculations in each node in the second layer.
In each node of the second hidden layer, compute the sum of the products of the weights and inputs and add the bias.
Step 7: compute the output of each node for the second hidden layer.
Apply the activation function to the sum from Step 6; the output is sent to the output layer as input.
Step 8: calculations in output layer.
There is only one neuron (node) in the output layer. Its net input is the sum of the products of the weights and inputs.
Step 9: compute the output of the node in the output layer.
The value of the activation function at this node is the output of the overall network.
Step 10: compute the mean square error (MSE).
The mean square error is computed as in equation (12); the MSE is the measure of performance.
Step 11: checking.
If the MSE is a small value close to zero (i.e., the goal is reached), stop the training and return the weights and biases. Otherwise the training process goes to the next step.
Step 12: having selected the training rule, calculate the update law for the weights and biases between the hidden layer and the output layer.
Step 13: update the weights and biases in the output layer.
At the end of each iteration, the weights and biases are updated as follows:

v^{\text{new}} = v^{\text{old}} - \alpha H^{-1} g_v, \qquad b^{\text{new}} = b^{\text{old}} - \alpha H^{-1} g_b

where (new) denotes the current iteration and (old) the previous iteration, g_v and g_b represent the gradients with respect to the weights and the bias, α is the parameter selected to minimize the performance function along the search direction, H^{-1} represents the inverse Hessian approximation maintained by BFGS, v is the weight in the output layer and b is the bias.
Figure 2. Flowchart for training algorithm with BFGS
Step 14: update the weights and biases in the first hidden layer.
Each hidden node in the first hidden layer updates its weights and bias as follows:

w^{\text{new}} = w^{\text{old}} - \alpha H^{-1} g_w, \qquad b^{\text{new}} = b^{\text{old}} - \alpha H^{-1} g_b

where w is the weight of the first hidden layer and b is its bias.
Step 15: update the weights and biases in the second hidden layer as follows:

s^{\text{new}} = s^{\text{old}} - \alpha H^{-1} g_s, \qquad b^{\text{new}} = b^{\text{old}} - \alpha H^{-1} g_b

where s is the weight of the second hidden layer and b is its bias.
Step 16: return to Step 4 for the next iteration.
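The Python sketch below mirrors the 1-10-5-1 architecture, the 60/40 training/testing split, and a BFGS-based minimization of the MSE in (12). It is an illustrative reconstruction rather than the authors' MATLAB code: the data, the value of r, and the use of the logsig transfer function are our assumptions, since the paper does not specify the FFT transfer function in enough detail to reproduce it here.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Illustrative data: (x_n, x_{n+1}) pairs from the logistic map with an
# assumed r = 3.9; noise of the desired variance can be added to t as in Section 4.
r = 3.9
x = rng.uniform(0.0, 1.0, 100)
t = r * x * (1.0 - x)

# 60% training / 40% testing split, as in the suggested network
idx = rng.permutation(x.size)
ntr = int(0.6 * x.size)
xtr, ttr = x[idx[:ntr]], t[idx[:ntr]]

# 1-10-5-1 architecture; weights packed into one parameter vector for BFGS
H, Q = 10, 5
sizes = [(H, 1), (H,), (Q, H), (Q,), (Q,)]          # w, b, s, bk, v

def unpack(p):
    out, i = [], 0
    for shape in sizes:
        m = int(np.prod(shape))
        out.append(p[i:i + m].reshape(shape))
        i += m
    return out

def net(p, xs):
    w, b, s, bk, v = unpack(p)
    a1 = 1.0 / (1.0 + np.exp(-(w @ xs[None, :] + b[:, None])))   # first hidden layer
    a2 = 1.0 / (1.0 + np.exp(-(s @ a1 + bk[:, None])))           # second hidden layer
    return v @ a2                                                # linear output

def mse(p):
    return np.mean((net(p, xtr) - ttr) ** 2)                     # performance, eq. (12)

# Uniform initialization of weights and biases (Step 2), then BFGS training
p0 = rng.uniform(-0.5, 0.5, sum(int(np.prod(sh)) for sh in sizes))
res = minimize(mse, p0, method="BFGS", options={"maxiter": 500})
print("training MSE:", mse(res.x))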

4. Data Description

We know that for each starting point of a dynamical map there is a minimum number of iterations of the orbit before a point repeats. So we choose a number of initial points and iterate each one until a point repeats. Some initial points require thousands of iterations before a point repeats, but the capacity of the personal computer restricts us from taking all of these points. In addition, a higher value of the bifurcation parameter makes the calculations more complicated. For these reasons, we take the maximum capacity of the computer, which is one hundred initial points, each exceeding 100 iterations, distributed evenly over the whole domain of the map; the numbers of iterations of the points were found using MATLAB R2010a.
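A possible realization of this data-generation step is sketched below in Python, assuming evenly spaced starting points in [0, 1] and an illustrative value of r; these choices are ours, not stated by the paper.

import numpy as np

def generate_orbits(r=3.9, n_points=100, n_iter=100):
    """100 evenly spaced initial points in [0, 1], each iterated n_iter
    times under the logistic map (r chosen here only for illustration)."""
    x0 = np.linspace(0.0, 1.0, n_points)
    orbits = np.empty((n_points, n_iter + 1))
    orbits[:, 0] = x0
    for n in range(n_iter):
        orbits[:, n + 1] = r * orbits[:, n] * (1.0 - orbits[:, n])
    return orbits

orbits = generate_orbits()
print(orbits.shape)        # (100, 101)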

4.1. Training for Noisy Logistic Map

To design an FFNN that is not sensitive to noise for this case, we use the proposed network with FFT as transfer function to train the data with noise; it is suitable to choose the maximum number of epochs. The noisy data have noise drawn from the uniform, normal and logistic distributions, and for all noise types we take variance values of 0.05, 0.5 and 15. We train this case with each of three types of transfer functions: tansig, logsig and FFT. It is worth mentioning that the run size is k = 1000.

4.2. Uniform Distribution [36]

A uniform distribution is one that has a constant probability density. The probability density function and cumulative distribution function for the continuous uniform distribution on the interval [a, b] are

f(x) = \frac{1}{b - a}, \qquad a \le x \le b    (13)

F(x) = \frac{x - a}{b - a}, \qquad a \le x \le b    (14)
One can generate values from a uniform random variable with zero mean and variance v as follows:

\varepsilon_i = \sqrt{3v}\,(2 u_i - 1), \qquad i = 1, \ldots, q    (15)

where q represents the number of points, v represents the variance, and the u_i are standard uniform random numbers on (0, 1).

4.3. Normal Distribution [36]

Let x be a normally distributed random variable with mean \mu and variance \sigma^2, whose probability density function and cumulative distribution function are, respectively,

f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)}, \qquad -\infty < x < \infty    (16)

F(x) = \frac{1}{2}\left[ 1 + \operatorname{erf}\!\left( \frac{x - \mu}{\sigma\sqrt{2}} \right) \right]    (17)
One can generate values from a normal random variable with zero mean and variance v as follows:

\varepsilon_i = \sqrt{v}\; \Phi^{-1}(u_i), \qquad i = 1, \ldots, q    (18)

where q represents the number of points, v represents the variance, \Phi^{-1} is the standard normal quantile function, and the u_i are standard uniform random numbers on (0, 1).

4.4. Logistic Distribution [36]

Let x be a logistic distributed random variable with location parameter \mu and scale parameter s, whose probability density function and cumulative distribution function are, respectively,

f(x) = \frac{e^{-(x-\mu)/s}}{s \left( 1 + e^{-(x-\mu)/s} \right)^2}    (19)

F(x) = \frac{1}{1 + e^{-(x-\mu)/s}}    (20)
One can generate values from a logistic random variable with zero mean and variance v (so that s = \sqrt{3v}/\pi) as follows:

\varepsilon_i = \frac{\sqrt{3v}}{\pi} \ln\!\frac{u_i}{1 - u_i}, \qquad i = 1, \ldots, q    (21)

where q represents the number of points, v represents the variance, and the u_i are standard uniform random numbers on (0, 1).
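The sketch below generates zero-mean noise of a prescribed variance v from the three distributions, in the spirit of the inverse-transform forms (15), (18) and (21) reconstructed above. The exact generators used by the authors are not stated, so this construction is an assumption.

import numpy as np
from scipy.special import ndtri          # inverse standard-normal CDF

rng = np.random.default_rng(0)

def noise(dist, v, q):
    """Zero-mean noise of variance v, generated by inverse-transform
    sampling (construction assumed, matching (15), (18), (21))."""
    u = rng.uniform(0.0, 1.0, q)                 # standard uniform numbers
    if dist == "uniform":                        # U(-a, a) with a = sqrt(3v)
        return np.sqrt(3.0 * v) * (2.0 * u - 1.0)
    if dist == "normal":                         # N(0, v)
        return np.sqrt(v) * ndtri(u)
    if dist == "logistic":                       # scale s = sqrt(3v)/pi
        return (np.sqrt(3.0 * v) / np.pi) * np.log(u / (1.0 - u))
    raise ValueError(dist)

for dist in ("uniform", "normal", "logistic"):
    for v in (0.05, 0.5, 15):
        eps = noise(dist, v, q=100)
        print(dist, v, eps.var())        # sample variance should be close to v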

4.5. Training the Case with Uniform Noise

We train the case with uniform noise and variances 0.05, 0.5 and 15. Tables (1) to (9) contain the results.
Table (1). Time and MSE of approximate solution after many training trials with uniform noise by using tansig transfer function when the variance equal to 0.05
Table (2). Time and MSE of approximate solution after many training trials with uniform noise by using logsig transfer function when the variance equal to 0.05
Table (3). Time and MSE of approximate solution after many training trials with uniform noise by using FFT transfer function when the variance equal to 0.05
Table (4). Time and MSE of approximate solution after many training trials with uniform noise by using tansig transfer function when the variance equal to 0.5
Table (5). Time and MSE of approximate solution after many training trials with uniform noise by using logsig transfer function when the variance equal to 0.5
Table (6). Time and MSE of approximate solution after many training trials with uniform noise by using FFT transfer function when the variance equal to 0.5
Table (7). Time and MSE of approximate solution after many training trials with uniform noise by using logsig transfer function when the variance equal to 15
Table (8). Time and MSE of approximate solution after many training trials with uniform noise by using tansig transfer function when the variance equal to 15
Table (9). Time and MSE of approximate solution after many training trials with uniform noise by using FFT transfer function when the variance equal to 15

4.6. Training for Case with Normal Noise

We train the case with normal noise and variances 0.05, 0.5 and 15. Tables (10) to (18) contain the results.
Table (10). Time and MSE of approximate solution after many training trials with normal noise by using logsig transfer function when the variance equal to 0.05
Table (11). Time and MSE of approximate solution after many training trials with normal noise by using tansig transfer function when the variance equal to 0.05
Table (12). Time and MSE of approximate solution after many training trials with normal noise by using FFT transfer function when the variance equal to 0.05
Table (13). Time and MSE of approximate solution after many training trials with normal noise by using logsig transfer function when the variance equal to 0.5
Table (14). Time and MSE of approximate solution after many training trials with normal noise by using tansig transfer function when the variance equal to 0.5
Table (15). Time and MSE of approximate solution after many training trials with normal noise by using FFT transfer function when the variance equal to 0.5
Table (16). Time and MSE of approximate solution after many training trials with normal noise by using logsig transfer function when the variance equal to 15
Table (17). Time and MSE of approximate solution after many training trials with normal noise by using tansig transfer function when the variance equal to 15
Table (18). Time and MSE of approximate solution after many training trials with normal noise by using FFT transfer function when the variance equal to 15

4.7. Training for Case with Logistic Noise

We train the case with logistic noise and variances 0.05, 0.5 and 15. Tables (19) to (27) contain the results.
Table (19). Time and MSE of approximate solution after many training trials with logistic noise by using logsig transfer function when the variance equal to 0.05
Table (20). Time and MSE of approximate solution after many training trials with logistic noise by using tansig transfer function when the variance equal to 0.05
Table (21). Time and MSE of approximate solution after many training trials with logistic noise by using FFT transfer function when the variance equal to 0.05
Table (22). Time and MSE of approximate solution after many training trials with logistic noise by using logsig transfer function when the variance equal to 0.5
Table (23). Time and MSE of approximate solution after many training trials with logistic noise by using tansig transfer function when the variance equal to 0.5
Table (24). Time and MSE of approximate solution after many training trials with logistic noise by using FFT transfer function when the variance equal to 0.5
Table (25). Time and MSE of approximate solution after many training trials with logistic noise by using logsig transfer function when the variance equal to 15
Table (26). Time and MSE of approximate solution after many training trials with logistic noise by using tansig transfer function when the variance equal to 15
Table (27). Time and MSE of approximate solution after many training trials with logistic noise by using FFT transfer function when the variance equal to 15

4.8. Results Discussion

From the above results it is clear that, when the noise is uniform, normal or logistic with variances 0.05 and 0.5, the proposed FFNN with the FFT transfer function has the best performance whether the logistic map is deterministic, chaotic or highly chaotic. The tansig and logsig transfer functions perform well when the logistic map is deterministic or chaotic, but perform badly in the highly chaotic case.
When the noise has variance 15, we trained the data for the logistic map in the deterministic and chaotic cases. The FFT transfer function again has the best performance, whereas the tansig and logsig transfer functions perform badly.

5. Summary

From the tables of results, it is clear that, for uniform, normal and logistic noise with all the considered variance values, the FFT transfer function is the best among all the transfer functions in the deterministic, chaotic and highly chaotic cases.
The tansig and logsig transfer functions perform well in the deterministic and chaotic cases, but badly in the highly chaotic case, and are therefore less suitable for estimating the noisy logistic dynamical map.
The performance is measured by the MSE criterion, as mentioned earlier.
The estimation of noisy dynamical maps by trained ANNs leads to the following observations:
1- The complexity of the computations increases with the number of sampling points in the noisy Logistic dynamical map.
2- The FFNN with the FFT transfer function provides a solution under noise (uniform, normal and logistic distributions) with very good performance compared with the other traditional transfer functions.
3- The proposed FFNN with the FFT transfer function can be applied to one-dimensional dynamical maps with noise.
4- The proposed FFT transfer function gave the best results, especially for the logistic dynamical map with noise (uniform, normal and logistic distributions).
5- The proposed FFT transfer function is faster than the other traditional transfer functions in most cases of one-dimensional dynamical maps with noise (uniform, normal and logistic distributions).
6- In general, the experimental results show that the proposed FFNN with the FFT transfer function can handle the noisy logistic dynamical map effectively and provide an accurate approximate solution throughout the whole domain, for three reasons: first, the neural network computations are parallel; second, the FFT analysis of the data and its computations are parallel; and third, the FFT transfer function relates differences in the data back to their original sources and to the homogeneity of the data.
7- The new FFT transfer function has the ability to separate chaos from noise in the Logistic dynamical map.
Some future works can be recommended, as follows:
1- Using networks with three or more hidden layers.
2- Increasing the number of neurons in each hidden layer.
3- Using other random variables as noise.
4- Estimating multidimensional dynamical maps.
5- Using the FFT as a transfer function in practical applications.

References

[1]  Arar, S. (2017) “An Introduction to the Fast Fourier Transform”, https://www.allaboutcircuits.com/technical-articles/anintroduction-to-the-fast-fourier-transform/2017.
[2]  Bailer-Jones, C., MacKay, D. and Withers, P. J. (1998) “A recurrent neural network for modelling dynamical systems,”Network: Computation in Neural Systems, vol. 9, no. 4, pp. 531–547.
[3]  Bakker, R., Schouten, J., Giles, C. Takens, F. and van den Bleek, C. (2000) “Learning chaotic attractors by neural networks”, Neural Computation, vol. 12, no. 10, pp. 2355–2383.
[4]  Berkooz, G., Holmes, P. and Lumley, J. (1993) “The proper orthogonal decomposition in the analysis of turbulent flows,” Annual Review of Fluid Mechanics, vol. 25, no. 1, pp. 539–575.
[5]  Billings, S., Jamaluddin, H. and Chen, S. (1992) “Properties of neural networks with applications to modelling nonlinear dynamical systems,” International Journal of Control, vol. 55, no. 1, pp. 193–224.
[6]  Broomhead, D. and King, G. (1986) “Extracting qualitative dynamics from experimental data,” Physica D: Nonlinear Phenomena, vol. 20, no. 2-3, pp. 217–236.
[7]  Brunton, S. Proctor, J. and Kutz, J. (2016) “Discovering governing equations from data by sparse identification of nonlinear dynamical systems”, Proceedings of the National Academy of Sciences of the United States of America, vol. 113, no. 15, pp. 3932–3937.
[8]  Burrus, C. and Johnson, S. (2012) “Fast Fourier Transforms”, http://cnx.org/content/col10550/1.22.
[9]  Chakraborty, K., Mehrotra, K., Mohan, C. and Ranka, S. (1992) “Forecasting the behavior of multivariate time series using neural networks,” Neural Networks, vol. 5, no. 6, pp. 961–970.
[10]  Chorin, A. and Hald, O. (2009) “Stochastic Tools in Mathematics and Science”, vol. 3 of Surveys and Tutorials in the Applied Mathematical Sciences, Springer.
[11]  Cooley, J. W. and Tukey, J. W. (1965) “An Algorithm for the Machine Calculation of Complex Fourier Series”, Mathematics of Computation, 19 (90), 297-301.
[12]  Duriez, T., Brunton, S. and Noack, B. (2017) “Machine Learning Control – Taming Nonlinear Dynamics and Turbulence”, Springer.
[13]  Elanayar, V. and Shin, Y. (1994) “Radial basis function neural network for approximation and estimation of nonlinear stochastic dynamic systems,” IEEE Transactions on Neural Networks, vol. 5, no. 4, pp. 594–603.
[14]  Elman, J. (1990) “Finding structure in time,” Cognitive Science, vol. 14, no. 2, pp. 179–211.
[15]  Galushkin, I. (2007) "Neural Networks Theory", Berlin Heidelberg.
[16]  Hornik, K. (1991) “Approximation capabilities of multilayer feedforward networks,” Neural Networks, vol. 4, no. 2, pp. 251–257.
[17]  Karpathy, A., Toderici, G., Shetty, S., Leung, T., Sukthankar, R. and Fei-Fei, L. (2014) “Largescale video classification with convolutional neural networks,” in Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pp. 1725–1732, Columbus, OH, USA.
[18]  Koskela, T. Lehtokangas, M., Saarinen, M. and Kaski, K. (1996) “Time series prediction with multilayer perceptron, FIR and Elman neural networks,” in Proceedings of the World Congress on Neural Networks, pp. 491–496, Citeseer.
[19]  Kuschewski, J., Hui, S. and Zak, S. H. (1993) “Application of feedforward neural networks to dynamical system identification and control,” IEEE Transactions on Control Systems Technology, vol. 1, no. 1, pp. 37–49.
[20]  Lin, H., Chen, W. and Tsutsumi, A. (2003)“Long-term prediction of nonlinear hydrodynamics in bubble columns by using artificial neural networks,” Chemical Engineering and Processing: Process Intensification, vol. 42, no. 8-9, pp. 611–620.
[21]  Lawande Q., B. Ivan, and S. Dhodapkar, (2005) "Chaos Based Cryptography: A new Approach to Secure Communications", BARC newsletter, vol. 258.
[22]  Mahdi, O. and Tawfiq, L. (2015) "Design Suitable Neural Networks to Solve EigenValue Problems and It′s Application", M.Sc. Thesis, College of Education Ibn AL-Haitham, University of Baghdad, Iraq.
[23]  Mangan, N. Brunton, S., Proctor, J. and Kutz, J. (2016) “Inferring biological networks by sparse identification of nonlinear dynamics”, IEEE Transactions on Molecular, Biological and Multi-Scale Communications, vol. 2, no. 1, pp. 52–63.
[24]  Maqableh, M. (2012) "Analysis and Design Security Primitives Based on Chaotic Systems for Ecommerce", Durham University.
[25]  Miyoshi, T., Ichihashi, H., Okamoto, S. and Hayakawa, T. (1995) “Learning chaotic dynamics in recurrent RBF network,” in Proceedings of ICNN'95 - International Conference on Neural Networks, vol. 1, pp. 588–593, Perth, WA, Australia.
[26]  Mohsen M. M. A. (2017), "Multi-level Security by Using Chaotic Dynamical System”, M.Sc. Thesis, Technical College of Management / Baghdad, Middle Technical University, Iraq.
[27]  MacKay, Neural Computation, Vol. 4, No. 3, pp 415-447, 1992.
[28]  Narendra, K., and Parthasarathy, K. (1990) “Identification and control of dynamical systems using neural networks,” IEEE Transactions on Neural Networks, vol. 1, no. 1, pp. 4–27.
[29]  Narendra, K. and Parthasarathy, K. (1992) “Neural networks and dynamical systems,” International Journal of Approximate Reasoning, vol. 6, no. 2, pp. 109–131.
[30]  Oraibi, Y. and Tawfiq, L. (2013) "Fast Training Algorithms for Feed Forward Neural Networks", Ibn Al-Haitham Jour. for Pure & Appl. Sci., Vol. 26, No. 1, pp.276.
[31]  Paez, T. and Hunter, N. (1997) “Dynamical system modeling via signal reduction and neural network simulation,” Sandia National Labs, Albuquerque, NM (United States).
[32]  Paez, T. and Hunter, N. (2000) “Nonlinear system modeling based on experimental data,” Technical report, Sandia National Labs., Albuquerque, NM(US); Sandia National Labs., Livermore, CA (US).
[33]  Pan, S. and Duraisamy, K. (2018) “Long-Time Predictive Modeling of Nonlinear Dynamical Systems Using Neural Networks”, Hindawi, Complexity, Volume 2018, pp. 1–26.
[34]  Parish, E. and Duraisamy, K. (2016) “Reduced order modeling of turbulent flows using statistical coarse-graining,” in 46th AIAA Fluid Dynamics Conference, Washington, D.C., USA.
[35]  Polycarpou, M. and Ioannou, P. (1991) “Identification and control of nonlinear systems using neural network models: design and stability analysis,” University of Southern California.
[36]  Rubinstein, Y. and Kroese, D. (2007), “Simulation And The Monte Carlo Method”,Wiley, second Edition.
[37]  Russakovsky, O., Deng, J., Su, H. (2015) “Imagenet large scale visual recognition challenge,” International Journal of Computer Vision, vol. 115, no. 3, pp. 211–252.
[38]  Sato, Y. and Nagaya, S. (1996) “Evolutionary algorithms that generate recurrent neural networks for learning chaos dynamics,” in Proceedings of IEEE International Conference on Evolutionary Computation, pp. 144–149, Nagoya, Japan.
[39]  Shumway, R. and Stoffer, D. (2000) “Time series analysis and its applications,” Studies in Informatics and Control, vol. 9, no. 4, pp. 375-376.
[40]  Smaoui, N. (1997) “Artificial neural network-based low-dimensional model for spatio-temporally varying cellular flames,” Applied Mathematical Modelling, vol. 21, no. 12, pp. 739–748.
[41]  Smaoui, N. (2001) “A model for the unstable manifold of the bursting behavior in the 2D navier–stokes flow,” SIAM Journal on Scientific Computing, vol. 23, no. 3, pp. 824–839.
[42]  Smaoui, N. and Al-Enezi, S. (2004) “Modelling the dynamics of nonlinear partial differential equations using neural networks,” Journal of Computational and Applied Mathematics, vol. 170, no. 1, pp. 27–58.
[43]  Tanaskovic, M., Fagiano, L., Novara, C. and Morari, M. (2017) “Data driven control of nonlinear systems: an on-line direct approach,” Automatica, vol. 75, pp. 1–10.
[44]  Tsung, F. and Cottrell, G. (1995) “Phase-space learning”, in Advances in Neural Information Processing Systems, pp. 481–488, MIT Press.
[45]  Urbina, A., Hunter, N. and Paez, T. (1998) “Characterization of nonlinear dynamic systems using artificial neural networks,” Technical report, Sandia National Labs, Albuquerque, NM (United States).
[46]  Villmann, T., Seiffert, U. and Wismϋller, A. (2004) "Theory and Applications of Neural maps", ESANN2004 PROCEEDINGS - European Symposium on Ann, pp.25 - 38.
[47]  Wang, Z., Xiao, D., Fang, F., Govindan, R., Pain, C. and Guo, Y. (2018) “Model identification of reduced order fluid dynamics systems using deep learning,” International Journal for Numerical Methods in Fluids, vol. 86, no. 4, pp. 255–268.