American Journal of Mathematics and Statistics
p-ISSN: 2162-948X e-ISSN: 2162-8475
2013; 3(5): 268-280
doi:10.5923/j.ajms.20130305.04
A. Rajarathinam, M. Thirunavukkarasu
Department of Statistics, Manonmaniam Sundaranar University, Tirunelveli, Tamil Nadu, India
Correspondence to: A. Rajarathinam, Department of Statistics, Manonmaniam Sundaranar University, Tirunelveli, Tamil Nadu, India.
Copyright © 2012 Scientific & Academic Publishing. All Rights Reserved.
This paper investigates the predictive performance of the fuzzy time series method for modeling paddy production trends. Parametric approaches such as linear models, non-linear models and classical time series analysis have conventionally been used to model univariate and multivariate time series data. However, these approaches have several limitations, such as assumptions about the model form, stationarity, normality and randomness. Difficulties arise in time series practice particularly for non-linear data sets. Fuzzy time series analysis was first suggested by Song and Chissom[22, 23] as a time-invariant modeling method. Unlike classical methods, it has no prerequisites such as stationarity and normality, and it does not require any treatment of missing data. Fuzzy time series methods are applied to a paddy production data set and the results are reported.
Keywords: Fuzzy Time Series, Fuzzy Sets, Fuzzy Relational Equations, Linguistic Values, Linguistic Variables, Non-linear Models, Randomness, Normality
Cite this paper: A. Rajarathinam, M. Thirunavukkarasu, Fuzzy Time Series Modeling for Paddy (Oryza sativa L.) Crop Production, American Journal of Mathematics and Statistics, Vol. 3 No. 5, 2013, pp. 268-280. doi: 10.5923/j.ajms.20130305.04.
Let $U = \{u_1, u_2, \ldots, u_n\}$ be the universe of discourse. A fuzzy set $A$ of $U$ is defined by

$$A = f_A(u_1)/u_1 + f_A(u_2)/u_2 + \cdots + f_A(u_n)/u_n \qquad (1)$$

where $f_A$ is the membership function of fuzzy set $A$, $f_A : U \to [0, 1]$, $f_A(u_i)$ denotes the membership value of $u_i$ in $A$, $f_A(u_i) \in [0, 1]$ and $1 \le i \le n$.

Song and Chissom[22-24] presented the following definitions of fuzzy time series.

Definition 1: Fuzzy Time Series
Let Y(t) (t = 0, 1, 2, …), a subset of the real numbers R, be the universe of discourse on which the fuzzy sets $f_i(t)$ (i = 1, 2, …) are defined. If F(t) consists of $f_i(t)$ (i = 1, 2, …), then F(t) is called a fuzzy time series on Y(t). In Definition 1, F(t) can be viewed as a linguistic variable. This is the main difference between fuzzy time series and classical time series, whose values must be real numbers.

Definition 2: Time-Invariant Fuzzy Time Series
Suppose F(t) is caused only by F(t−1), which is denoted by F(t−1) → F(t). If there exists a fuzzy relationship between F(t) and F(t−1), it can be expressed as the fuzzy relational equation

F(t) = F(t−1) ◦ R(t, t−1)

Here “◦” is the max–min composition operator. The relation R is called the first-order model of F(t). Further, if the fuzzy relation R(t, t−1) of F(t) is independent of time t, that is, if for different times t1 and t2, R(t1, t1−1) = R(t2, t2−1), then F(t) is called a time-invariant fuzzy time series. Otherwise, it is called a time-variant fuzzy time series.

Chen[4] revised the time-invariant models of Song and Chissom[22,23] to simplify the calculations. In addition, Chen’s method can generate more precise forecasting results than those of Song and Chissom[22,23]. Chen’s method is described below.

Step 1: Collect the historical data.

Step 2: Define the universe of discourse U. Find the maximum $D_{\max}$ and the minimum $D_{\min}$ among all the historical data. For easy partitioning of U, two small numbers $D_1$ and $D_2$ are assigned. The universe of discourse U is then defined by

$$U = [D_{\min} - D_1,\ D_{\max} + D_2] \qquad (2)$$
Step 3: Determine the appropriate length of interval $l$. The length of the intervals significantly affects forecasting results in fuzzy time series, so an effective length of intervals can significantly improve them. The distribution-based approach is one method for fuzzy time series models. It can be used to adjust the lengths of intervals determined during the early stages of forecasting, when the fuzzy relationships are formulated. The distribution-based length of interval $l$ is computed by the following steps:
1. Calculate all the absolute differences between each value and its predecessor as the first differences, and then compute the average of the first differences.
2. Take one-half of the average as the length.
3. Find the range in which the length is located and determine the base.
4. According to the assigned base, round the length to obtain the appropriate length of interval $l$.
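The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `distribution_based_length` is hypothetical, the base is assumed to be the power of ten matching ranges of the form (100, 1000] → 100, and "round" is assumed to mean rounding to the nearest multiple of the base.

```python
import math


def distribution_based_length(values):
    """Sketch of the distribution-based interval length (hypothetical helper)."""
    # Step 1: average of the absolute first differences.
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    average = sum(diffs) / len(diffs)
    # Step 2: one-half of the average is the raw length.
    half = average / 2
    if half <= 0:
        raise ValueError("series must contain at least one non-zero change")
    # Step 3: assumed base mapping, e.g. (1, 10] -> 1, (10, 100] -> 10,
    # (100, 1000] -> 100.
    base = 10 ** (math.ceil(math.log10(half)) - 1)
    # Step 4: round the raw length to a multiple of the base
    # (nearest multiple is an assumption about the rounding rule).
    return round(half / base) * base


# A toy series whose average first difference is 667.
print(distribution_based_length([1000, 1667, 1000, 1667]))
```

With an average first difference of 667, the raw length is 333.5, the base is 100 and the rounded length is 300.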
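Step 4: Calculate the number of intervals (fuzzy numbers) $m$ by

$$m = \frac{D_{\max} + D_2 - D_{\min} + D_1}{l} \qquad (3)$$

where $D_{\max}$ and $D_{\min}$ are the maximum and minimum of the historical data, $D_1$ and $D_2$ are the small numbers assigned in Step 2, and $l$ is the length of interval. Assume that the m intervals are $u_1, u_2, u_3, \ldots, u_{m-2}, u_{m-1}$ and $u_m$. The fuzzy numbers $A_1, A_2, \ldots, A_m$ are defined on these intervals.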
Step 5: Fuzzify the historical data. If the value of an observation is located in the range of $u_i$, then it belongs to fuzzy number $A_i$. All observations must be classified into the corresponding fuzzy numbers.

Step 6: Generate the fuzzy logical relationships. For all fuzzified data, derive the fuzzy logical relationships based on Definition 3. A fuzzy logical relationship has the form $A_j \to A_k$, which denotes that “if the value at time $t-1$ is $A_j$, then that at time $t$ is $A_k$”.

Step 7: Establish the fuzzy logical relationship groups. The derived fuzzy logical relationships can be arranged into fuzzy logical relationship groups based on the same fuzzy numbers on the left-hand sides of the fuzzy logical relationships. The fuzzy logical relationship groups have the following form:

$$A_j \to A_{k_1},\quad A_j \to A_{k_2},\quad \ldots,\quad A_j \to A_{k_p}$$
Step 8: Calculate the forecasted outputs. The forecasted value at time $t$ is determined by the following three heuristic rules. Assume the fuzzy number of the observation at time $t-1$ is $A_j$.

Rule 1: If the fuzzy logical relationship group of $A_j$ is empty, $A_j \to \varnothing$, then the forecasted value at time $t$ is $A_j$.

Rule 2: If the fuzzy logical relationship group of $A_j$ is one-to-one, $A_j \to A_k$, then the forecasted value at time $t$ is $A_k$.

Rule 3: If the fuzzy logical relationship group of $A_j$ is one-to-many, $A_j \to A_{k_1}, A_j \to A_{k_2}, \ldots, A_j \to A_{k_p}$, then the forecasted value at time $t$ is calculated as the average $(A_{k_1} + A_{k_2} + \cdots + A_{k_p})/p$.
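Steps 5 to 8 can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' code: the function names and the toy data are hypothetical, intervals are taken to have equal length starting at a given lower bound, and each fuzzy number is defuzzified by the midpoint of its interval, as in Chen's original method (the expression used in the source did not survive extraction).

```python
from collections import defaultdict


def fuzzify(value, lower, length, m):
    """Step 5: 1-based index i of the interval u_i that contains value."""
    i = int((value - lower) // length) + 1
    return min(max(i, 1), m)  # clamp values lying on the outer bounds


def midpoint(i, lower, length):
    """Assumed defuzzification: midpoint of interval u_i."""
    return lower + (i - 0.5) * length


def chen_forecast(series, lower, length, m):
    labels = [fuzzify(v, lower, length, m) for v in series]
    # Steps 6-7: group relationships A_j -> A_k by left-hand side,
    # counting repeated relationships only once.
    groups = defaultdict(list)
    for j, k in zip(labels, labels[1:]):
        if k not in groups[j]:
            groups[j].append(k)
    # Step 8: Rule 1 (empty group) falls back to A_j itself; Rules 2 and 3
    # both reduce to averaging the midpoints of the right-hand sides.
    forecasts = []
    for j in labels:
        rhs = groups.get(j) or [j]
        forecasts.append(sum(midpoint(k, lower, length) for k in rhs) / len(rhs))
    return labels, dict(groups), forecasts


# Toy data, not the paddy series: five intervals of length 10 starting at 0.
labels, groups, forecasts = chen_forecast([5, 15, 5, 25, 45], 0, 10, 5)
print(labels, groups, forecasts)
```

Here `forecasts[i]` is the forecast for the period following observation `i`, so the last entry is a one-step-ahead forecast obtained by Rule 1 when the final fuzzy number has no relationship group.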
The fitted models are compared using the Mean Absolute Error (MAE), the Mean Square Error (MSE) and the Average Forecasting Error Rate (AFER). Here n and p denote the number of observations and the number of parameters in the model, respectively. The lower the values of these statistics, the better the fitted model.

As pointed out by Kvalseth[12], before taking any final decision about the appropriateness of the fitted model, it is of paramount importance to investigate the basic assumptions regarding the error term, viz., randomness and normality.

The randomness assumption of the residuals needs to be tested before any final decision is taken about the adequacy of the model developed. For this purpose, the “run test” procedure has been developed in the literature.
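The three error measures named in this section can be computed as in the sketch below. The source's own formulas were lost in extraction, so the code uses the common textbook definitions with divisor n, which is an assumption (the source mentions the number of parameters p, so its MSE may use a different divisor). The function names are hypothetical.

```python
def mae(actual, forecast):
    """Mean Absolute Error: average of |actual - forecast|."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mse(actual, forecast):
    """Mean Square Error with divisor n (assumed)."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def afer(actual, forecast):
    """Average Forecasting Error Rate, in percent of the actual values."""
    ratios = [abs(a - f) / a for a, f in zip(actual, forecast)]
    return 100 * sum(ratios) / len(actual)


# Toy values for illustration only.
actual = [100, 200]
forecast = [110, 180]
print(mae(actual, forecast), mse(actual, forecast), afer(actual, forecast))
```

Lower values of all three measures indicate a closer fit, which is how the measures are used when comparing models.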
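The normality of the residuals is examined using the Shapiro–Wilk test statistic W. The parameter k takes the values k = 1, 2, …, n/2 when n is even and k = 1, 2, …, (n − 1)/2 when n is odd. The values of the coefficients a(k) for different values of n and k are given in table 5 (Shapiro–Wilk[20]). H0 is accepted if the value of W is very close to one.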
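The run test for residual randomness mentioned earlier in this section can be sketched as below. The source's formula is not available, so this uses the standard Wald–Wolfowitz runs test on the signs of the residuals with its large-sample normal approximation. Counting runs about zero (rather than about the median) and the function name `runs_test` are assumptions.

```python
import math


def runs_test(residuals):
    """Return (number of runs, z statistic) for the signs of the residuals."""
    signs = [r > 0 for r in residuals if r != 0]  # drop exact zeros
    n1 = sum(signs)            # positive residuals
    n2 = len(signs) - n1       # negative residuals
    if n1 == 0 or n2 == 0:
        raise ValueError("need both positive and negative residuals")
    # A new run starts whenever the sign changes.
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    n = n1 + n2
    mean = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - mean) / math.sqrt(variance)
    return runs, z


# Perfectly alternating signs give more runs than expected under randomness.
print(runs_test([1, -1, 1, -1, 1, -1, 1, -1]))
```

A value of |z| larger than the chosen normal critical value (for example 1.96 at the 5% level) suggests that the residuals are not random.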
Figure 1. Trends in paddy crop production based on the sinusoidal non-linear model
and 2459
, respectively. For easy computation, let
and
. The universe of discourse U is defined as follows:
Step 3: The appropriate length of interval $l$ can be computed as follows:
1. Based on table 2, the average of the first differences is 667.
2. Take one-half of 667 as the length, which is 333.5.
3. Since the length 333.5 lies in the range [101, 1000] in table 2, the base is assigned to be 100.
4. According to the base 100, the length 333.5 is rounded off to 300, which is the appropriate length of interval $l$.

Step 4: Using Eq. 3 with $l = 300$, the number of intervals (fuzzy numbers) is $m = 22$. Thus, there are 22 intervals $u_1, u_2, \ldots, u_{22}$, and the corresponding fuzzy numbers are $A_1, A_2, \ldots, A_{22}$.
Step 5: Fuzzify the productions. For example, the paddy production in year 1951 is 2459, which is located in the range of $u_1$. Thus, the corresponding fuzzy number of year 1951 is assigned as $A_1$. Table 5 lists the corresponding fuzzy number for the paddy production of each year.

Step 6: According to table 5, we can derive the fuzzy logical relationships as shown in table 6. Note that repeated relationships are counted only once.

Step 7: Based on the same fuzzy numbers on the left-hand side of the fuzzy logical relationships in table 6, 17 fuzzy logical relationship groups are generated, as shown in table 7.

Step 8: According to tables 5 and 7, we can calculate the forecasted paddy productions. For instance, the forecasts for the years 1952 to 1954 are illustrated below.

Forecasting 1952: The fuzzified paddy production of year 1951 in table 5 is $A_1$, and from table 7 we find that there is one fuzzy logical relationship in group 1.
. According to Rule 2, the forecasted paddy production of year 1952 is
. Thus,
Forecasting 1953: According to table 7, there is one fuzzy logical relationship in group 1.
. The forecasted paddy production of year 1953 is
. Thus,
Forecasting 1954: The fuzzified paddy production of year 1953 in table 5 is
, and from table 7 we find that there are five fuzzy logical relationships in group 3.
. According to Rule 3, the forecasted paddy production of year 1954 is computed as follows:
.
Figure 2. Trends in paddy production based on fuzzy time series modeling