Algorithms of Credible Intervals from Generalized Extreme Value Distribution Based on Record Data

Mohamed A. El-Sayed; M. M. Mohie El-Din; Samia Danial; Fathy H. Riad

International Journal of Statistics and Applications

p-ISSN: 2168-5193 e-ISSN: 2168-5215

2017; 7(4): 215-221

doi:10.5923/j.statistics.20170704.03

Mohamed A. El-Sayed^{1, 2}, M. M. Mohie El-Din³, Samia Danial⁴, Fathy H. Riad⁵

¹Department of Mathematics, Faculty of Science, Fayoum University, Egypt

²Department of Computer Science, College of Computers and IT, Taif University, KSA

³Department of Mathematics, Faculty of Science, Al-Azhar University, Cairo, Egypt

⁴Department of Mathematics, Faculty of Science, South Valley University, Qena, Egypt

⁵Department of Mathematics, Faculty of Science, Minia University, Egypt

Correspondence to: Mohamed A. El-Sayed, Department of Mathematics, Faculty of Science, Fayoum University, Egypt.

This work is licensed under the Creative Commons Attribution International License (CC BY).
http://creativecommons.org/licenses/by/4.0/

Abstract

This paper presents algorithms for the maximum likelihood and Bayes estimation of the parameters of the generalized extreme value (GEV) distribution based on record values. Asymptotic confidence intervals as well as bootstrap confidence intervals are proposed. The Bayes estimators cannot be obtained in explicit form, so Markov chain Monte Carlo (MCMC) methods, namely the Gibbs sampling algorithm and the Metropolis algorithm, are used to calculate the Bayes estimates as well as the credible intervals. An algorithm based on the bootstrap method is also used to estimate the confidence intervals. A numerical example is provided to illustrate the proposed estimation methods. Comparing the models, the MSEs and average confidence interval lengths of the MLEs and Bayes estimators of the parameters are less significant for censored models.

Keywords: MCMC, GEV Distribution, Record values, MLE, Bayesian estimation

Cite this paper: Mohamed A. El-Sayed, M. M. Mohie El-Din, Samia Danial, Fathy H. Riad, Algorithms of Credible Intervals from Generalized Extreme Value Distribution Based on Record Data, International Journal of Statistics and Applications, Vol. 7 No. 4, 2017, pp. 215-221. doi: 10.5923/j.statistics.20170704.03.

Article Outline

1. Introduction

2. MCMC Algorithms

2.1. Gibbs Sampler

2.2. The Metropolis-Hastings Algorithm

3. Maximum Likelihood Estimation

4. Approximate Interval Estimation

5. Bootstrap Confidence Intervals

6. Bayesian Estimation

7. Data Analysis

8. Conclusions

ACKNOWLEDGEMENTS

1. Introduction

The states of many systems are governed by probability models. For example, in statistical physics the microscopic states of a system follow a Gibbs model given the macroscopic constraints. The fair samples generated by MCMC show which states are typical of the underlying system. In computer vision this is often called "synthesis": the visual appearance of the simulated images, textures, and shapes is a way to verify the sufficiency of the underlying model. On the other hand, record values arise naturally in many real-life applications involving data on sport, weather and life-testing studies. Many authors have studied record values and associated statistics, for example Ahsanullah ([1], [2], [3]), Arnold and Balakrishnan [4], Arnold et al. ([5], [6]), Balakrishnan and Chan ([7], [8]) and David [10]. These studies have attracted a lot of attention; see also Chandler [9] and Galambos [11].

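In general, the joint probability density function (pdf) of the first $m$ lower record values $x_{L(1)}, x_{L(2)}, \ldots, x_{L(m)}$ is given by

$$f_{1,2,\ldots,m}\left(x_{L(1)}, x_{L(2)}, \ldots, x_{L(m)}\right) = f\left(x_{L(m)}\right) \prod_{i=1}^{m-1} \frac{f\left(x_{L(i)}\right)}{F\left(x_{L(i)}\right)}, \qquad x_{L(1)} > x_{L(2)} > \cdots > x_{L(m)}, \tag{1}$$

where $f$ and $F$ denote the pdf and cdf of the underlying distribution.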
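As a small numerical illustration, the following Python sketch extracts the lower records from a sequence and evaluates the joint log-density of lower records, which has the form $\log f(x_{L(m)}) + \sum_{i<m} [\log f(x_{L(i)}) - \log F(x_{L(i)})]$. The function names and the standard exponential example are illustrative assumptions and are not taken from the paper.

```python
import math

def lower_records(seq):
    """Return the lower records of seq: values smaller than every earlier value."""
    records = []
    for x in seq:
        if not records or x < records[-1]:
            records.append(x)
    return records

def record_log_density(records, logpdf, logcdf):
    """Joint log-density of lower records:
    log f(x_m) + sum over i < m of [log f(x_i) - log F(x_i)]."""
    *earlier, last = records
    return logpdf(last) + sum(logpdf(x) - logcdf(x) for x in earlier)

# Illustrative underlying model (an assumption): standard exponential.
def exp_logpdf(x):
    return -x

def exp_logcdf(x):
    return math.log1p(-math.exp(-x))

recs = lower_records([2.0, 3.1, 1.2, 1.5, 0.4])
value = record_log_density(recs, exp_logpdf, exp_logcdf)
```

Here `recs` contains only the observations that set a new minimum, so the sequence is strictly decreasing, as required by the ordering constraint on lower records.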

The GEV distribution is a family of continuous probability distributions developed within extreme value theory, which provides the statistical framework for making inferences about the probability of very rare or extreme events. The GEV distribution unites the Gumbel, Fréchet and Weibull distributions, also known as the type I, II and III extreme value distributions, into a single family that allows a continuous range of possible shapes. It is parameterized by a shape parameter, a location parameter and a scale parameter, and it reduces to type I, II and III when the shape parameter is equal to 0, greater than 0 and less than 0, respectively. By the extreme value theorem, the GEV distribution is the limit distribution of properly normalized maxima of a sequence of independent and identically distributed random variables. Thus, the GEV distribution is used as an approximation to model the maxima of long (finite) sequences of random variables. Fréchet [12] and Fisher and Tippett [13] published the results of independent inquiries into the same problem. The extreme lower bound distribution is a kind of general extreme value distribution (the Gumbel type I, extreme lower bound [Fréchet] type II and Weibull type III extreme value distributions). The extreme lower bound (Fréchet) type II distribution turns out to be the most important model for extreme events, and the domain of attraction condition for the Fréchet distribution takes on a particularly easy form. In some fields of application the generalized extreme value distribution is known as the Fisher-Tippett distribution, named after R. A. Fisher and L. H. C. Tippett, who recognized the three functional forms. However, usage of this name is sometimes restricted to the special case of the Gumbel distribution. The pdf and cdf of $x$ are given, respectively, by:

(2)

and

(3)

where

is the shape parameter,

is the scale parameter and

is the location parameter.
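For orientation, the widely used textbook parameterization of the GEV distribution is sketched below. The symbols $\mu$ (location), $s$ (scale) and $\xi$ (shape) are assumptions made for this sketch; the paper's own equations (2) and (3) may use different symbols or a different arrangement of the parameters.

```latex
% Standard GEV cdf and pdf for shape \xi \neq 0, defined where 1 + \xi (x - \mu)/s > 0
G(x) = \exp\left\{ -\left[ 1 + \xi \, \frac{x - \mu}{s} \right]^{-1/\xi} \right\},
\qquad
g(x) = \frac{1}{s} \left[ 1 + \xi \, \frac{x - \mu}{s} \right]^{-1/\xi - 1}
       \exp\left\{ -\left[ 1 + \xi \, \frac{x - \mu}{s} \right]^{-1/\xi} \right\}.

% Limiting Gumbel (type I) case, \xi \to 0
G(x) = \exp\left\{ -\exp\left( -\frac{x - \mu}{s} \right) \right\}.
```

In this convention $\xi > 0$ gives the Fréchet (type II) case and $\xi < 0$ the Weibull (type III) case, in agreement with the classification described above.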

This paper is organized as follows. Section 2 presents the Markov chain Monte Carlo algorithms. The maximum likelihood estimates of the parameters of the GEV distribution, the point and interval estimates of the parameters, and the approximate joint confidence region are studied in Sections 3 and 4. The parametric bootstrap confidence intervals of the parameters are discussed in Section 5. Bayes estimation of the model parameters and the Gibbs sampling algorithm are provided in Section 6. Data analysis and Monte Carlo simulation results are presented in Section 7. Section 8 concludes the paper.

2. MCMC Algorithms

Markov chain Monte Carlo (MCMC) methods (which include random walk Monte Carlo methods) are a class of algorithms for sampling from probability distributions based on constructing a Markov chain that has the desired distribution as its equilibrium distribution. The state of the chain after a large number of steps is then used as a sample from the desired distribution, and the quality of the sample improves as a function of the number of steps. As computers became more widely available, the Metropolis algorithm [17] was widely used by chemists and physicists, but it did not become widely known among statisticians until after 1990. Hastings [18] generalized the Metropolis algorithm, and simulations following his scheme are said to use the Metropolis-Hastings algorithm. A special case of the Metropolis-Hastings algorithm was introduced by Geman and Geman [16], apparently without knowledge of earlier work; simulations following their scheme are said to use the Gibbs sampler. MCMC methodology provides a useful tool for realistic statistical modelling (Gilks et al. [14]; Gamerman [15]) and has become very popular for Bayesian computation in complex statistical models. Bayesian analysis requires integration over possibly high-dimensional probability distributions to make inferences about model parameters or to make predictions. MCMC is essentially Monte Carlo integration using Markov chains: it draws samples from the required distribution and then forms sample averages to approximate expectations (see Geman and Geman [16]; Metropolis et al. [17]; Hastings [18]).

2.1. Gibbs Sampler

The Gibbs sampling algorithm is one of the simplest Markov chain Monte Carlo algorithms. The paper by Gelfand and Smith [19] helped to demonstrate the value of the Gibbs algorithm for a range of problems in Bayesian analysis. Gibbs sampling is an MCMC scheme in which the transition kernel is formed by the full conditional distributions.

The Gibbs sampler is a conditional sampling technique in which the acceptance-rejection step is not needed. The Markov transition rules of the algorithm are built upon conditional distributions derived from the target distribution. The conditional posterior is usually, but not necessarily, one-dimensional.
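To make the idea of sampling from full conditionals concrete, here is a minimal Python sketch of a Gibbs sampler for a toy target, a standard bivariate normal with correlation `rho`, whose full conditionals are the normal distributions $N(\rho y, 1-\rho^2)$ and $N(\rho x, 1-\rho^2)$. The toy target, the function name and the tuning values are assumptions chosen for illustration; they are not the model studied in this paper.

```python
import random

def gibbs_bivariate_normal(rho, n_iter, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Each sweep draws x from p(x | y) and then y from p(y | x); every draw is
    accepted, so no acceptance-rejection step is involved.
    """
    rng = random.Random(seed)
    cond_sd = (1.0 - rho * rho) ** 0.5
    x, y = 0.0, 0.0
    draws = []
    for _ in range(n_iter):
        x = rng.gauss(rho * y, cond_sd)  # full conditional of x given y
        y = rng.gauss(rho * x, cond_sd)  # full conditional of y given x
        draws.append((x, y))
    return draws

draws = gibbs_bivariate_normal(rho=0.8, n_iter=20000)
kept = draws[1000:]  # discard the first sweeps as burn-in
mean_x = sum(x for x, _ in kept) / len(kept)
mean_xy = sum(x * y for x, y in kept) / len(kept)  # estimates rho for this target
```

Because both marginals have zero mean and unit variance, `mean_xy` serves as a simple check that the chain reproduces the intended correlation.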

2.2. The Metropolis-Hastings Algorithm

The Metropolis algorithm was originally introduced by Metropolis et al. [17]. Suppose that our goal is to draw samples from some distribution $\pi(x) = f(x)/K$, where $K$ is the normalizing constant, which may be unknown or very difficult to compute. The Metropolis-Hastings (MH) algorithm provides a way of sampling from $\pi(x)$ without requiring us to know $K$. Let $q(y \mid x)$ be an arbitrary transition kernel, that is, the probability of moving, or jumping, from the current state $x$ to $y$. This is sometimes called the proposal distribution. The algorithm generates a sequence of values $x^{(1)}, x^{(2)}, \ldots$ which form a Markov chain with stationary distribution given by $\pi(x)$.

If the proposal distribution is symmetric, so that $q(y \mid x) = q(x \mid y)$ for all possible $x$ and $y$, then the proposal terms cancel and the acceptance probability is given by:

$$\alpha(x, y) = \min\left\{1, \frac{\pi(y)}{\pi(x)}\right\} = \min\left\{1, \frac{f(y)}{f(x)}\right\} \tag{5}$$
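The symmetric-proposal case can be sketched in a few lines of Python. The example below is a random-walk Metropolis sampler with a normal proposal; it only needs the target up to its normalizing constant. The toy target (an unnormalized gamma density with shape 3 and rate 1), the function names and the step size are assumptions made for illustration.

```python
import math
import random

def metropolis(log_target, x0, step, n_iter, seed=0):
    """Random-walk Metropolis with a symmetric normal proposal.

    log_target is the log of the target density up to an additive constant,
    so the normalizing constant is never needed.
    """
    rng = random.Random(seed)
    x, log_px = x0, log_target(x0)
    chain = []
    for _ in range(n_iter):
        y = rng.gauss(x, step)            # symmetric proposal
        log_py = log_target(y)
        # Accept with probability min(1, target(y) / target(x)).
        if rng.random() < math.exp(min(0.0, log_py - log_px)):
            x, log_px = y, log_py
        chain.append(x)
    return chain

def log_gamma_unnormalized(x):
    """Log of x**2 * exp(-x) for x > 0 (gamma with shape 3, rate 1, up to a constant)."""
    return 2.0 * math.log(x) - x if x > 0.0 else -math.inf

chain = metropolis(log_gamma_unnormalized, x0=1.0, step=1.5, n_iter=50000)
kept = chain[5000:]
chain_mean = sum(kept) / len(kept)  # should be near the gamma mean of 3
```

Proposals that fall outside the support receive log-density `-inf` and are therefore always rejected, which keeps the chain inside the region where the target is positive.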

3. Maximum Likelihood Estimation

Let $x_{L(1)}, x_{L(2)}, \ldots, x_{L(m)}$ be $m$ lower record values, each of which follows the generalized extreme value distribution whose pdf and cdf are given by (2) and (3), respectively. For simplicity of notation, we will use $x_i$ instead of $x_{L(i)}$. The logarithm of the likelihood function based on these lower record values may then be written as [20-23]:

(6)

where

with known $\theta$. Calculating the first partial derivatives of Eq. (6) with respect to $\sigma$ and $\lambda$ and equating each to zero, we get the likelihood equations as:

(7)

and

(8)

By solving the two nonlinear equations (7) and (8) numerically, we obtain the estimates of the parameters $\sigma$ and $\lambda$, say $\hat{\sigma}$ and $\hat{\lambda}$.

Records are rare in practice and sample sizes are often very small; therefore, intervals based on the asymptotic normality of the MLEs do not perform well. So two confidence intervals based on the parametric bootstrap and MCMC methods are proposed.
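A numerical solution of this kind can be sketched by maximizing the record log-likelihood directly rather than solving the score equations. The Python sketch below makes several assumptions that are not taken from the paper: it uses the standard textbook GEV parameterization with location `mu`, scale `sigma` and a known shape `XI = 0.5` (a hypothetical value), and it uses a simple derivative-free compass search as a stand-in for whatever solver the authors used. The data are the lower records quoted in Section 7.

```python
import math

# Lower record data quoted in the data analysis section.
RECORDS = [29.7646, 4.9186, 3.8447, 2.5929, 2.3330, 2.2460, 2.2348]
XI = 0.5  # hypothetical known shape value, for illustration only

def gev_logs(x, mu, sigma, xi):
    """Return (log pdf, log cdf) of the standard GEV (xi != 0), or None off-support."""
    if sigma <= 0.0:
        return None
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0.0:
        return None
    try:
        log_cdf = -(t ** (-1.0 / xi))
    except OverflowError:
        return None
    log_pdf = -math.log(sigma) - (1.0 / xi + 1.0) * math.log(t) + log_cdf
    return log_pdf, log_cdf

def record_loglik(params, records=RECORDS, xi=XI):
    """Lower-record log-likelihood: log f(x_m) + sum_{i<m} [log f(x_i) - log F(x_i)]."""
    mu, sigma = params
    last = len(records) - 1
    total = 0.0
    for i, x in enumerate(records):
        logs = gev_logs(x, mu, sigma, xi)
        if logs is None:
            return -math.inf
        log_pdf, log_cdf = logs
        total += log_pdf if i == last else log_pdf - log_cdf
    return total

def compass_search(objective, start, step=0.5, tol=1e-6, max_iter=10000):
    """Maximize objective by trying +/- step moves along each coordinate."""
    best, best_val = list(start), objective(start)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for j in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[j] += delta
                val = objective(trial)
                if val > best_val:
                    best, best_val, improved = trial, val, True
        if not improved:
            step /= 2.0  # shrink the step when no neighbour improves
    return best, best_val

start = [2.0, 1.0]
(mu_hat, sigma_hat), loglik_hat = compass_search(record_loglik, start)
```

Returning `-inf` outside the support lets the search reject infeasible moves without special handling; a gradient-based or Newton-type solver would be a natural replacement when the analytical derivatives are available.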

4. Approximate Interval Estimation

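If the sample size is not small, approximate intervals can be based on the asymptotic normality of the MLEs. Let $\ell$ denote the log-likelihood function (6). The Fisher information matrix $I(\sigma, \lambda)$ is obtained by taking the expectation of minus the second derivatives of the log-likelihood function. Under some mild regularity conditions, $(\hat{\sigma}, \hat{\lambda})$ is approximately bivariate normal with mean $(\sigma, \lambda)$ and covariance matrix $I^{-1}(\sigma, \lambda)$. In practice, we usually estimate $I^{-1}(\sigma, \lambda)$ by $I^{-1}(\hat{\sigma}, \hat{\lambda})$. A simpler and equally valid procedure is to use the approximation

$$(\hat{\sigma}, \hat{\lambda}) \sim N\left((\sigma, \lambda),\; I_0^{-1}(\hat{\sigma}, \hat{\lambda})\right), \tag{9}$$

where $I_0(\hat{\sigma}, \hat{\lambda})$ is the observed information matrix given by

$$I_0(\hat{\sigma}, \hat{\lambda}) = \begin{pmatrix} -\dfrac{\partial^2 \ell}{\partial \sigma^2} & -\dfrac{\partial^2 \ell}{\partial \sigma \, \partial \lambda} \\[2ex] -\dfrac{\partial^2 \ell}{\partial \lambda \, \partial \sigma} & -\dfrac{\partial^2 \ell}{\partial \lambda^2} \end{pmatrix}_{(\sigma, \lambda) = (\hat{\sigma}, \hat{\lambda})} \tag{10}$$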

where the elements of the Fisher information matrix are given by

(11)

(12)

and

(13)

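Approximate confidence intervals for $\sigma$ and $\lambda$ can be found by taking $(\hat{\sigma}, \hat{\lambda})$ to be bivariate normally distributed with mean $(\sigma, \lambda)$ and covariance matrix $I_0^{-1}(\hat{\sigma}, \hat{\lambda})$, the inverse of the observed information matrix. Thus, the $100(1-\gamma)\%$ approximate confidence intervals for $\sigma$ and $\lambda$ are

$$\hat{\sigma} \pm z_{\gamma/2}\sqrt{v_{11}} \quad \text{and} \quad \hat{\lambda} \pm z_{\gamma/2}\sqrt{v_{22}}, \tag{14}$$

respectively, where $v_{11}$ and $v_{22}$ are the elements on the main diagonal of the covariance matrix $I_0^{-1}(\hat{\sigma}, \hat{\lambda})$ and $z_{\gamma/2}$ is the percentile of the standard normal distribution with right-tail probability $\gamma/2$.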
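The interval computation itself is mechanical once the observed information matrix is available. The following Python sketch inverts a 2 by 2 information matrix and forms the normal-approximation intervals from its diagonal. The numerical values of the information matrix and of the estimates are hypothetical placeholders, not results from this paper.

```python
from statistics import NormalDist

def invert_2x2(matrix):
    """Invert a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def wald_interval(estimate, variance, level=0.95):
    """Normal-approximation interval: estimate +/- z * sqrt(variance)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    half_width = z * variance ** 0.5
    return estimate - half_width, estimate + half_width

# Hypothetical observed information matrix and estimates (placeholders only).
observed_info = [[4.0, 1.0], [1.0, 2.0]]
sigma_hat, lambda_hat = 1.5, 0.8

cov = invert_2x2(observed_info)                # approximate covariance matrix
ci_sigma = wald_interval(sigma_hat, cov[0][0])
ci_lambda = wald_interval(lambda_hat, cov[1][1])
```

For parameters restricted to be positive, such an interval can extend below zero when the estimated variance is large, which is one practical reason the bootstrap and MCMC intervals are of interest for small record samples.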

5. Bootstrap Confidence Intervals

In this section, we propose to use the percentile bootstrap method based on the original idea of Efron [24]. The algorithm for estimating the confidence intervals of $\sigma$ and $\lambda$ using this method is illustrated below.
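A parametric percentile bootstrap of this kind follows a fit, simulate, refit, and take percentiles pattern, which can be sketched in Python as follows. To keep the sketch short and self-contained it uses a one-parameter stand-in model, $F(x) = \exp(-\lambda x^{-\theta})$ with known $\theta$, for which the record-based MLE has the closed form $\hat{\lambda} = m\,x_m^{\theta}$; this stand-in, the function names and the numerical settings are assumptions and do not reproduce the two-parameter model of this paper. Lower records are simulated using the fact that $-\log F(X_{L(i)})$ is a sum of $i$ independent standard exponential variables.

```python
import random

def simulate_lower_records(m, lam, theta, rng):
    """Simulate m lower records from F(x) = exp(-lam * x**(-theta))."""
    records, s = [], 0.0
    for _ in range(m):
        s += rng.expovariate(1.0)          # s equals -log F of the current record
        records.append((lam / s) ** (1.0 / theta))
    return records

def mle_lambda(records, theta):
    """Closed-form record MLE for the stand-in model: m * x_m**theta."""
    return len(records) * records[-1] ** theta

def percentile_bootstrap(records, theta, n_boot=2000, level=0.95, seed=0):
    """Parametric percentile bootstrap interval for lambda."""
    rng = random.Random(seed)
    m = len(records)
    lam_hat = mle_lambda(records, theta)   # step 1: fit to the observed records
    boot = sorted(                         # steps 2-3: simulate and refit
        mle_lambda(simulate_lower_records(m, lam_hat, theta, rng), theta)
        for _ in range(n_boot)
    )
    alpha = 1.0 - level                    # step 4: empirical percentiles
    lower = boot[int(n_boot * alpha / 2)]
    upper = boot[int(n_boot * (1 - alpha / 2)) - 1]
    return lam_hat, (lower, upper)

theta_known = 3.5
observed = simulate_lower_records(7, lam=2.0, theta=theta_known, rng=random.Random(1))
lam_hat, (boot_lower, boot_upper) = percentile_bootstrap(observed, theta_known)
```

For the two-parameter case the same loop applies, with `mle_lambda` replaced by a numerical solver for both parameters and the percentiles taken separately for each parameter.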

6. Bayesian Estimation

In this section, we consider the Bayesian estimation of the parameters $\sigma$ and $\lambda$ for record data, under the assumption that the parameter $\theta$ is known. We take the joint prior density to be a product of independent gamma densities for $\sigma$ and $\lambda$, given by

(17)

and

(18)

Using the joint prior distribution of $\sigma$ and $\lambda$ and the likelihood function, the joint posterior density function of $\sigma$ and $\lambda$ given the data can be written as

(19)

As expected in this case, the Bayes estimators cannot be obtained in closed form. We propose to use the Gibbs sampling procedure to generate MCMC samples, from which we obtain the Bayes estimates and the corresponding credible intervals of the unknown parameters. A wide variety of MCMC schemes are available, and it can be difficult to choose among them. An important subclass of MCMC methods consists of Gibbs sampling and the more general Metropolis-within-Gibbs samplers.

It is clear that the posterior density function of $\sigma$ given $\lambda$ is

(20)

and the posterior density function of $\lambda$ given $\sigma$ can be written as

(21)

Plots of these densities show that they resemble normal distributions. So, to generate random numbers from these distributions, we use the Metropolis-Hastings method with a normal proposal distribution. The Gibbs sampling procedure is therefore carried out by the following algorithm [23]:
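A Metropolis-within-Gibbs scheme of this type, in which each parameter is updated in turn by a Metropolis step with a normal proposal, can be sketched in Python as follows. The sketch is generic: it takes any log-posterior known up to a constant. The toy log-posterior used to exercise it (two independent gamma densities with hypothetical hyperparameters), the function names and the proposal scales are assumptions for illustration and do not reproduce the posterior (19).

```python
import math
import random

def metropolis_within_gibbs(log_post, start, proposal_sds, n_iter, seed=0):
    """Update each coordinate in turn with a normal random-walk Metropolis step."""
    rng = random.Random(seed)
    current = list(start)
    log_p = log_post(current)
    chain = []
    for _ in range(n_iter):
        for j in range(len(current)):
            proposal = list(current)
            proposal[j] = rng.gauss(current[j], proposal_sds[j])
            log_p_new = log_post(proposal)
            # Symmetric proposal: accept with probability min(1, posterior ratio).
            if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
                current, log_p = proposal, log_p_new
        chain.append(tuple(current))
    return chain

def toy_log_post(params, a=3.0, b=2.0, c=2.0, d=1.0):
    """Toy target: independent gamma(a, rate b) and gamma(c, rate d) densities."""
    sigma, lam = params
    if sigma <= 0.0 or lam <= 0.0:
        return -math.inf
    return (a - 1) * math.log(sigma) - b * sigma + (c - 1) * math.log(lam) - d * lam

chain = metropolis_within_gibbs(toy_log_post, start=[1.0, 1.0],
                                proposal_sds=[1.0, 1.5], n_iter=30000)
kept = chain[3000:]  # discard burn-in
sigma_mean = sum(s for s, _ in kept) / len(kept)  # toy target mean is a / b = 1.5
lam_mean = sum(l for _, l in kept) / len(kept)    # toy target mean is c / d = 2.0
```

To apply the same loop to record data, `toy_log_post` would be replaced by the sum of the log-prior and the record log-likelihood evaluated at the proposed parameter values.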

7. Data Analysis

Now, we describe how the true values of the parameters $\sigma$ and $\lambda$ are chosen with a known prior. For given values of the prior parameters, we generate a random sample of size 100 from the gamma distribution; the mean of this random sample is then computed and taken as the actual population value of $\sigma$. That is, the prior parameters are selected so that the true value of $\sigma$ is approximately the mean of the gamma distribution. Similarly, for given values of the prior parameters of $\lambda$, we generate the true value of $\lambda$ from the gamma distribution, with the prior parameters selected so that the true value of $\lambda$ is approximately the mean of the gamma distribution. Using these true values, we generate lower record data from the generalized extreme lower bound distribution. The simulated data set, with $m = 7$ records, is given by: 29.7646, 4.9186, 3.8447, 2.5929, 2.3330, 2.2460, 2.2348.

From these data we compute the approximate MLEs, the bootstrap estimates and the Bayes estimates of $\sigma$ and $\lambda$; the Bayes estimates are obtained by the MCMC method, using MCMC samples of size 10000 with the first 1000 discarded as 'burn-in'. The results of point estimation are displayed in Table 1 and the results of interval estimation are given in Table 2.

Table 1. The point estimates of parameters, σ and λ with θ = 3.5

Table 2. Two-sided 95% confidence intervals and their lengths for the parameters σ and λ

Figure 1. Simulation number of σ generated by MCMC method

Figure 2. Simulation number of λ generated by MCMC method

In general, one step of the Gibbs sampler (GS) requires more work than one step of the Metropolis-Hastings (M-H) algorithm, since the former is likely to require more point evaluations of the posterior density. However, successive points produced by GS are usually less mutually correlated than those produced by M-H, i.e. a sample ensemble of a given size is typically better distributed according to the posterior with GS than with M-H. Sampling from a conditional density in the Gibbs sampler typically requires identifying the essential part of the density, which can make implementation difficult.
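Given a chain of MCMC draws such as the 10000 samples with a burn-in of 1000 used above, the Bayes estimate under squared error loss and an equal-tailed credible interval can be computed as in the following Python sketch. The function name is hypothetical, and the normally distributed stand-in draws are an assumption used only to make the example runnable; they are not output from the paper's sampler.

```python
import random

def summarize_chain(draws, burn_in, level=0.95):
    """Posterior mean and equal-tailed credible interval after discarding burn-in."""
    kept = sorted(draws[burn_in:])
    n = len(kept)
    posterior_mean = sum(kept) / n
    alpha = 1.0 - level
    lower = kept[int(n * alpha / 2)]
    upper = kept[min(n - 1, int(n * (1 - alpha / 2)))]
    return posterior_mean, (lower, upper)

# Stand-in draws (assumption): in practice these would come from the sampler.
rng = random.Random(0)
draws = [rng.gauss(2.0, 0.5) for _ in range(10000)]
post_mean, (cred_lower, cred_upper) = summarize_chain(draws, burn_in=1000)
interval_length = cred_upper - cred_lower
```

The interval length computed in the last line corresponds to the quantity reported alongside each interval in Table 2.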

8. Conclusions

In this paper, several algorithms for estimating the GEV distribution under the progressive Type II censored sampling plan are investigated. The asymptotic confidence intervals as well as the bootstrap confidence intervals are studied. The approximate confidence intervals, percentile bootstrap confidence intervals and approximate joint confidence region for the parameters are developed. Numerical examples with an actual data set and simulated data are used to compare the proposed joint confidence intervals. In terms of MSEs and credible interval lengths, the Bayes estimators based on non-informative priors perform better than the MLEs and the bootstrap.

ACKNOWLEDGEMENTS

The authors are grateful to the anonymous referees for their valuable suggestions, which helped to improve the presentation and quality of the paper.

References

[1] Ahsanullah, M. Linear prediction of record values for two parameter exponential distribution. Ann. Inst. Statist. Math., 32: 363-368, 1980.
[2] Ahsanullah, M. Estimation of the parameters of the Gumbel distribution based on the m record values. Comput. Statist. Quart., 3: 231-239, 1990.
[3] Ahsanullah, M. Record Values. In The Exponential Distribution: Theory, Methods and Applications. Eds. N. Balakrishnan and A.P. Basu, Gordon and Breach Publishers, New York, New Jersey, 1995.
[4] Arnold, B.C., Balakrishnan, N. Relations, Bounds and Approximations for Order Statistics. Lecture Notes in Statistics 53, Springer-Verlag, New York, 1989.
[5] Arnold, B.C., Balakrishnan, N., Nagaraja, H.N. A First Course in Order Statistics. John Wiley & Sons, New York, 1992.
[6] Arnold, B.C., Balakrishnan, N., Nagaraja, H.N. Records. John Wiley & Sons, New York, 1998.
[7] Balakrishnan, N., Chan, A.C. Order Statistics and Inference: Estimation Methods. Academic Press, San Diego, 1993.
[8] Balakrishnan, N., Chan, P.S. Record values from Rayleigh and Weibull distributions and associated inference. National Institute of Standards and Technology Journal of Research, Special Publications, 18, 866: 41-51, 1993.
[9] Chandler, K.N. The distribution and frequency of record values. J. Roy. Statist. Soc. Ser. B, 14: 220-228, 1952.
[10] David, H.A. Order Statistics. Second edition, John Wiley & Sons, New York, 1981.
[11] Galambos, J. The Asymptotic Theory of Extreme Order Statistics. John Wiley & Sons, New York; second edition, Krieger, Florida, 1987.
[12] Fréchet, M. Sur la loi de probabilité de l'écart maximum. Annales de la Société Polonaise de Mathématique, Cracovie, 6: 93-116, 1927.
[13] Fisher, R.A., Tippett, L.H.C. Limiting forms of the frequency distribution of the largest or smallest member of a sample. Proceedings of the Cambridge Philosophical Society, 24: 180-190, 1928.
[14] Gilks, W.R., Richardson, S., Spiegelhalter, D.J. Markov Chain Monte Carlo in Practice. Chapman and Hall, London, 1996.
[15] Gamerman, D. Markov Chain Monte Carlo: Stochastic Simulation for Bayesian Inference. Chapman and Hall, London, 1997.
[16] Geman, S., Geman, D. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence 6, 721-741, 1984.
[17] Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E. Equation of state calculations by fast computing machines. Journal of Chemical Physics 21, 1087-1092, 1953.
[18] Hastings, W.K. Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57, 97-109, 1970.
[19] Gelfand, A.E., Smith, A.F.M. Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association 85, 398-409, 1990.
[20] El-Din, M.M., Riad, F.H., El-Sayed, M.A. Confidence Intervals for Parameters of IWD Based on MLE and Bootstrap. Journal of Statistics Applications & Probability, 3, 1-7, 2014.
[21] El-Din, M.M., Riad, F.H., El-Sayed, M.A. Statistical Inference and Prediction for the Inverse Weibull Distribution Based on Record Data. International Journal of Advanced Statistics and Probability, 3, 171-177, 2014.
[22] El-Sayed, M.A., Riad, F.H., Elsafty, M.A., Estaitia, Y.A. Algorithms of Confidence Intervals of WG Distribution Based on Progressive Type-II Censoring Samples. Journal of Computer and Communications, 5, 101-116, 2017.
[23] Elhag, A.A., Ibrahim, O.I., El-Sayed, M.A., Abd-Elmougod, G.A. Estimations of Weibull-Geometric Distribution under Progressive Type II Censoring Samples. Open Journal of Statistics, 5, 721-729, 2015.
[24] Efron, B. Censored data and the bootstrap. Journal of the American Statistical Association 76, 312-319, 1981.
