American Journal of Intelligent Systems
p-ISSN: 2165-8978 e-ISSN: 2165-8994
2013; 3(1): 13-19
doi:10.5923/j.ajis.20130301.02
Ozge Cagcag1, Ufuk Yolcu2, Erol Egrioglu1, Cagdas Hakan Aladag3
1Department of Statistics, University of Ondokuz Mayis, Samsun, 55139, Turkey
2Department of Statistics, Giresun University, Giresun, 28000, Turkey
3Department of Statistics, Hacettepe University, Ankara, 06800, Turkey
Correspondence to: Erol Egrioglu, Department of Statistics, University of Ondokuz Mayis, Samsun, 55139, Turkey.
Copyright © 2012 Scientific & Academic Publishing. All Rights Reserved.
Fuzzy time series forecasting methods have been widely studied in recent years because they are compatible with flexible calculation techniques and do not require the constraints that exist in conventional time series approaches. Most real-life time series exhibit periodic variations arising from seasonality, known as seasonal changes. Although conventional approaches for the analysis of time series with seasonal effects are abundant in the literature, the number of seasonal fuzzy time series approaches is limited. In almost all of these studies, membership values are ignored in the analysis process. This loss of information degrades the forecasting performance of the approach and is incompatible with the basic features of fuzzy set theory. In this study, for the first time in the literature, a new seasonal fuzzy time series approach is proposed that considers membership values in both the identification of fuzzy relations and the defuzzification steps. The proposed method uses fuzzy C-means clustering in the fuzzification step and artificial neural networks (ANN), which take membership values into account, in the identification of fuzzy relations and the defuzzification steps. The proposed method was applied to various seasonal fuzzy time series, and the results were compared with those of some conventional and fuzzy time series approaches. This evaluation showed that the forecasting performance of the proposed method is satisfactory.
Keywords: Seasonal Fuzzy Time Series, Fuzzy C-means, Artificial Neural Network, Membership Degree, Air Pollution
Cite this paper: Ozge Cagcag, Ufuk Yolcu, Erol Egrioglu, Cagdas Hakan Aladag, A Novel Seasonal Fuzzy Time Series Method to the Forecasting of Air Pollution Data in Ankara, American Journal of Intelligent Systems, Vol. 3 No. 1, 2013, pp. 13-19. doi: 10.5923/j.ajis.20130301.02.
mean, then the model is expressed in equation (1)![]() | (1) |
![]() | (2) |
![]() | (3) |
![]() | (4) |
![]() | (5) |
can be obtained from Box and Jenkins[15].
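observation and each fuzzy cluster has a set center $v_j$. The memberships of the observations are described by a fuzzy matrix $U$ with $n$ rows and $c$ columns, in which $n$ is the number of data objects and $c$ is the number of clusters. $u_{ij}$, the element in the $i$th row and $j$th column of $U$, indicates the degree of association or membership function value of the $i$th object with the $j$th cluster. The characteristics of $U$ are as follows:

$$u_{ij} \in [0,1], \quad i = 1,\dots,n;\ j = 1,\dots,c \qquad (6)$$

$$\sum_{j=1}^{c} u_{ij} = 1, \quad i = 1,\dots,n \qquad (7)$$

$$0 < \sum_{i=1}^{n} u_{ij} < n, \quad j = 1,\dots,c \qquad (8)$$

The objective of the FCM algorithm is to minimize

$$J_m = \sum_{j=1}^{c}\sum_{i=1}^{n} u_{ij}^{m}\, d_{ij}^{2} \qquad (9)$$

where

$$d_{ij} = \lVert x_i - v_j \rVert \qquad (10)$$

in which $m$ ($m > 1$) is a scalar termed the weighting exponent that controls the fuzziness of the resulting clusters, and $d_{ij}$ is the Euclidean distance from object $x_i$ to the cluster center $v_j$. In this method, the minimization is carried out by an iterative algorithm. In each iteration the values of $v_j$ and $u_{ij}$ are updated by the formulas given in equation (11) and equation (12).

$$v_j = \frac{\sum_{i=1}^{n} u_{ij}^{m}\, x_i}{\sum_{i=1}^{n} u_{ij}^{m}} \qquad (11)$$

$$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( d_{ij} / d_{ik} \right)^{2/(m-1)}} \qquad (12)$$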
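The iterative FCM update described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' Matlab implementation: the function name `fuzzy_c_means`, the random initialisation, the stopping tolerance, and the symbols `x`, `u`, `centers`, `m` are all choices made here, following the standard FCM formulation (centers as membership-weighted means, memberships from inverse distance ratios).

```python
import numpy as np

def fuzzy_c_means(x, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard FCM sketch. Returns (centers, u) where u[i, j] is the
    membership of observation i in cluster j. All names are illustrative."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]                      # treat a 1-D series as n x 1
    n = x.shape[0]
    rng = np.random.default_rng(seed)
    u = rng.random((n, c))
    u /= u.sum(axis=1, keepdims=True)       # each row of memberships sums to 1
    for _ in range(max_iter):
        w = u ** m
        # center update: membership-weighted mean of the observations
        centers = (w.T @ x) / w.sum(axis=0)[:, None]
        # Euclidean distance from every observation to every center
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)               # guard against division by zero
        # membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))
        inv = d ** (-2.0 / (m - 1.0))
        u_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(u_new - u).max() < tol
        u = u_new
        if converged:
            break
    return centers, u

# Hypothetical toy data with two well separated groups
x = np.array([0.0, 0.1, 0.2, 9.8, 10.0, 10.2])
centers, u = fuzzy_c_means(x, c=2)
```

For a univariate series, the clusters can afterwards be ordered by `np.argsort(centers.ravel())` so that the fuzzy sets are sorted by their centers.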
Figure 1. Architecture of multilayer feed forward neural network
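Definition 1 Fuzzy time series.

Let $Y(t)$ $(t = \dots, 0, 1, 2, \dots)$, a subset of real numbers, be the universe of discourse on which fuzzy sets $f_j(t)$ are defined. If $F(t)$ is a collection of $f_1(t), f_2(t), \dots$ then $F(t)$ is called a fuzzy time series defined on $Y(t)$.

Definition 2 First order seasonal fuzzy time series forecasting model.

Let $F(t)$ be a fuzzy time series. Assume there exists seasonality in $\{F(t)\}$. The first order seasonal fuzzy time series forecasting model is

$$F(t-m) \rightarrow F(t) \qquad (13)$$

where $m$ denotes the period.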
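Definition 3 Let $F(t)$ be a fuzzy time series. If $F(t)$ is caused by $F(t-1), F(t-2), \dots$ and $F(t-n)$, then this fuzzy logical relationship is represented by

$$F(t-n), \dots, F(t-2), F(t-1) \rightarrow F(t) \qquad (14)$$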
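Definition 4 Let $F$ and $G$ be two fuzzy time series. Suppose that $F(t-1) = A_i$, $G(t-1) = B_k$ and $F(t) = A_j$. A bivariate fuzzy logical relationship is defined as $A_i$, $B_k \rightarrow A_j$, where $A_i$, $B_k$ are referred to as the left hand side and $A_j$ as the right hand side of the bivariate fuzzy logical relationship. Therefore, the first order bivariate fuzzy time series forecasting model is as follows:

$$F(t-1), G(t-1) \rightarrow F(t) \qquad (15)$$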
is caused by
, where 
and
are integers
then this FLR is represented by:![]() | (16) |
. As an illustration, suppose the model has been identified as SARIMA(1,1,0)(0,1,1)$_{12}$ via the Box-Jenkins method. This implies that
will be a linear combination of the corresponding lagged variables. That is,![]() | (17) |
representing the order of the model and the parameters
are determined based on the inputs of the SARIMA model. Accordingly
and
are defined as 5 and 1 respectively. Then the model will be
-order partial bivariate fuzzy time series forecasting model and the fuzzy relationship can be given as follows:![]() | (18) |

denotes the fuzzified time series
and
denotes the fuzzified residual series
.

Step 2 Data set of lagged variables is created.

Depending on the model order defined in the previous step, the time series that should be included in the model
, and the residual series
are lagged according to the orders of the lagged variables, and the data set is created. In other words, when the model given in equation (18) is considered, the lagged variables data set will include
.

Step 3 Data set of lagged variables is clustered via FCM.

The number of fuzzy sets is determined with
where
and
is the number of observations. The data set that covers the lags of the time series is clustered via the FCM clustering method. Thus, the fuzzy set centers for each lagged variable constituting the data set, and the membership values showing the degree to which each observation belongs to the fuzzy sets, are obtained. In this step, the fuzzy sets are sorted according to the set centers represented with
and
fuzzy sets are obtained.

Step 4 Fuzzy relations are determined via feed forward artificial neural networks (ANN).

The number of neurons in the input and output layers of the feed forward artificial neural network used in determining the fuzzy relations equals the number of fuzzy sets
. The number of neurons in the hidden layer is determined by trial and error. The number of hidden layer units should be selected so that the feed forward artificial neural network does not lose its generalization ability. The architecture of a feed forward artificial neural network having two hidden layers for a model including seven sets is presented in Figure 2. In Figure 2,
represents the membership value of lagged data set belonging to
fuzzy set at
time. Moreover, while membership value of observation of lagged data set belonging to
number fuzzy set at
time constitutes the inputs of ANN; membership value of observation of lagged data set belonging to
number fuzzy set at
time constitutes the outputs of the ANN.

In all layers of the feed forward artificial neural network used in determining the fuzzy relations, whose architecture is exemplified above, the logistic activation function given in equation (19) is used.

$$f(x) = \frac{1}{1 + \exp(-x)} \qquad (19)$$
Figure 2. Architecture of feed forward artificial neural network for three sets
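As a rough illustration of Step 4, the sketch below maps the membership vector of one time point to the membership vector of the next with a single hidden layer and logistic activations in every layer. It is written in Python rather than the authors' Matlab, and several details are assumptions made here: the helper names (`init_network`, `forward`, `train`), the random membership data, and the use of plain batch gradient descent on squared error, since the training algorithm is not stated in the surviving text.

```python
import numpy as np

def logistic(z):
    # logistic activation, used in every layer
    return 1.0 / (1.0 + np.exp(-z))

def init_network(c, hidden, seed=0):
    """c inputs and c outputs (one per fuzzy set), `hidden` hidden units."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.5, (c, hidden)), "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.5, (hidden, c)), "b2": np.zeros(c),
    }

def forward(params, u_prev):
    h = logistic(u_prev @ params["W1"] + params["b1"])
    out = logistic(h @ params["W2"] + params["b2"])
    return h, out

def train(params, u_prev, u_next, lr=0.5, epochs=2000):
    """Batch gradient descent on mean squared error (an assumed choice)."""
    n = u_prev.shape[0]
    for _ in range(epochs):
        h, out = forward(params, u_prev)
        # error signal at the output, through the logistic derivative
        delta_out = (out - u_next) * out * (1.0 - out) / n
        # back-propagate to the hidden layer before touching W2
        delta_h = (delta_out @ params["W2"].T) * h * (1.0 - h)
        params["W2"] -= lr * h.T @ delta_out
        params["b2"] -= lr * delta_out.sum(axis=0)
        params["W1"] -= lr * u_prev.T @ delta_h
        params["b1"] -= lr * delta_h.sum(axis=0)
    return params

# Hypothetical membership matrix: 30 observations, 3 fuzzy sets
rng = np.random.default_rng(1)
u = rng.random((30, 3))
u /= u.sum(axis=1, keepdims=True)
u_prev, u_next = u[:-1], u[1:]          # inputs at one time point, targets at the next

params = init_network(c=3, hidden=4)
_, out_before = forward(params, u_prev)
mse_before = np.mean((out_before - u_next) ** 2)
params = train(params, u_prev, u_next)
_, out_after = forward(params, u_prev)
mse_after = np.mean((out_after - u_next) ** 2)
```

In practice the number of hidden units would be varied and the configuration with the best out-of-sample error kept, which matches the trial-and-error selection described in the text.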
time, membership values of observations belonging to fuzzy sets at
time depending on
fuzzy set center obtained from the FCM method were determined. These membership values were then entered into the feed forward artificial neural network as inputs, and the outputs of the network were produced. These outputs represent the forecast of the observation at
time. An architecture of the feed forward artificial neural network for three sets is given in Figure 3.

Figure 3. An architecture of feed forward artificial neural network in defuzzification
time series are obtained. In this step, the optimal model for the ANSO time series was SARIMA(1,1,0)(0,1,1)$_{12}$. As a linear function of
, this model can be expressed as:![]() | (20) |
order partial high order fuzzy time series forecasting model where
and
. This model can be expressed as:![]() | (21) |
Figure 4. The time series data of the amount of SO2 in Ankara
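As a worked check of which lags a SARIMA(1,1,0)(0,1,1)$_{12}$ model involves, the model can be expanded in backshift notation. The symbols used here ($B$ for the backshift operator, $\phi$ for the non-seasonal autoregressive coefficient, $\Theta$ for the seasonal moving average coefficient, $X_t$ for the series and $e_t$ for the residual) are assumed notation and need not match the paper's own.

```latex
\begin{align*}
(1-\phi B)(1-B)(1-B^{12})\,X_t &= (1-\Theta B^{12})\,e_t \\
(1-B)(1-B^{12}) &= 1 - B - B^{12} + B^{13} \\
(1-\phi B)(1 - B - B^{12} + B^{13})
  &= 1 - (1+\phi)B + \phi B^{2} - B^{12} + (1+\phi)B^{13} - \phi B^{14} \\
X_t &= (1+\phi)X_{t-1} - \phi X_{t-2} + X_{t-12}
      - (1+\phi)X_{t-13} + \phi X_{t-14} + e_t - \Theta e_{t-12}
\end{align*}
```

The expansion involves five lags of the series (1, 2, 12, 13 and 14) and one lag of the residual (12), which is consistent with the orders 5 and 1 stated for the partial bivariate model.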
order partial model is created using
lagged variables. Here, it must be noted that the lagged variables data set consists of the one step lead variables in the partial high order fuzzy time series forecasting model given in (20). The created data set is clustered via FCM. Clustering is applied to all lagged variable data sets together. In this step, the data set is clustered by varying the number of sets from 5 to 15. Membership values of the observations belonging to each fuzzy set are also determined via the FCM method. The relationship between these membership values is determined by a feed forward artificial neural network, and the number of neurons in its hidden layer was varied between 1 and 15. In the light of this information,
different analyses were done, and Root Mean Squared Error (RMSE) and Mean Absolute Percentage Error (MAPE) were used as performance evaluation criteria.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2} \qquad (22)$$

$$\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right| \qquad (23)$$

where $y_t$, $\hat{y}_t$, and $n$ represent the crisp time series, the defuzzified forecasts, and the number of forecasts, respectively. The algorithm of the proposed method was coded in Matlab version 7.9.

As a result of all analyses, the best forecasting performance was obtained when the number of sets is 14, the number of hidden layer units is 6 in the fuzzy relation determination stage, and the number of hidden layer units is 2 in the defuzzification stage. Results obtained from the proposed method and from some other methods are summarized in Table 1.

Table 1 shows the superior performance of the proposed method in comparison with conventional time series approaches as well as seasonal fuzzy time series approaches with respect to three criteria. Additionally, the forecasts obtained from the proposed model are plotted together with the real values in Figure 5. When Table 1 and Figure 5 are examined together, the superior forecasting performance of the proposed method can be seen.
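The two error criteria named above, RMSE and MAPE, can be computed as in the following minimal Python sketch. The function names and the sample values are hypothetical and chosen only for illustration. MAPE is returned here as a fraction, which is an assumption about the convention used. Multiply by 100 if a percentage is wanted.

```python
import numpy as np

def rmse(actual, forecast):
    """Root mean squared error between observed values and forecasts."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

def mape(actual, forecast):
    """Mean absolute percentage error as a fraction (requires nonzero actuals)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs((actual - forecast) / actual)))

# Hypothetical observed values and forecasts, not taken from the paper
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]
rmse_value = rmse(y_true, y_pred)   # sqrt((100 + 400 + 0) / 3)
mape_value = mape(y_true, y_pred)   # (0.1 + 0.1 + 0.0) / 3
```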
Figure 5. The graph of the results obtained from the proposed method and real time series