International Journal of Statistics and Applications

p-ISSN: 2168-5193    e-ISSN: 2168-5215

2017;  7(6): 289-297

doi:10.5923/j.statistics.20170706.03

 

Size-Biased Poisson-Akash Distribution and Its Applications

Rama Shanker

Department of Statistics, College of Science, Eritrea Institute of Technology, Asmara, Eritrea

Correspondence to: Rama Shanker, Department of Statistics, College of Science, Eritrea Institute of Technology, Asmara, Eritrea.

Copyright © 2017 Scientific & Academic Publishing. All Rights Reserved.

This work is licensed under the Creative Commons Attribution International License (CC BY).
http://creativecommons.org/licenses/by/4.0/

Abstract

A size-biased Poisson-Akash distribution (SBPAD) has been proposed by size-biasing the Poisson-Akash distribution (PAD) of Shanker (2017), a Poisson mixture of Akash distribution introduced by Shanker (2015). The first four moments about origin and moments about mean have been obtained and hence expressions for coefficient of variation (C.V.), skewness, kurtosis and index of dispersion have been given. The estimation of its parameter has been discussed using the method of moments and the method of maximum likelihood estimation. Two examples of observed real datasets have been presented to test the goodness of fit of SBPAD over size-biased Poisson distribution (SBPD) and size-biased Poisson-Lindley distribution (SBPLD).

Keywords: Size-biased distribution, Poisson-Akash distribution, Moments and moments based measures, Estimation of parameter, Goodness of fit

Cite this paper: Rama Shanker, Size-Biased Poisson-Akash Distribution and Its Applications, International Journal of Statistics and Applications, Vol. 7 No. 6, 2017, pp. 289-297. doi: 10.5923/j.statistics.20170706.03.

1. Introduction

Let a random variable $X$ have probability distribution $P_0(x;\theta)$; $x = 0, 1, 2, \ldots$; $\theta > 0$. If sample units are weighted or selected with probability proportional to $x^\alpha$, then the corresponding size-biased distribution of order $\alpha$ is given by its probability mass function (pmf)
$$P_\alpha(x;\theta) = \frac{x^\alpha\, P_0(x;\theta)}{\mu'_\alpha} \qquad (1.1)$$
where $\mu'_\alpha = E(X^\alpha) = \sum_{x=0}^{\infty} x^\alpha P_0(x;\theta)$. When $\alpha = 1$, the distribution is known as the simple size-biased distribution and is applicable for size-biased sampling; for $\alpha = 2$, the distribution is known as the area-biased distribution and is applicable for area-biased sampling. In many statistical sampling situations care must be taken so that one does not inadvertently sample from a size-biased distribution in place of the one intended.
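The weighting in (1.1) can be illustrated numerically. The following is a minimal sketch, assuming (1.1) has the standard form $P_\alpha(x) = x^\alpha P_0(x)/E(X^\alpha)$; the helper names `size_biased` and `poisson_pmf`, the Poisson example and the truncation point are illustrative choices, not part of the paper.

```python
import math

def size_biased(pmf, alpha, support):
    """Size-biased pmf of order alpha on a finite (truncated) support, as a dict."""
    weights = {x: (x ** alpha) * pmf(x) for x in support}
    norm = sum(weights.values())  # approximates E[X^alpha] on the truncated support
    return {x: w / norm for x, w in weights.items()}

def poisson_pmf(x, lam=2.0):
    """Ordinary Poisson pmf, used here only as an example of P_0."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

support = range(0, 60)                  # truncation point chosen for illustration
sb = size_biased(poisson_pmf, 1, support)

# Known property: simple size-biasing (alpha = 1) of a Poisson shifts it by one,
# so sb[x] should agree with poisson_pmf(x - 1) for x >= 1, and sb[0] is zero.
shift_gap = max(abs(sb[x] - poisson_pmf(x - 1)) for x in range(1, 30))
```

Setting `alpha=2` in the same helper gives the area-biased version.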
Size-biased distributions are a particular case of weighted distributions, which arise naturally in practice when observations from a sample are recorded with probability proportional to some measure of unit size. In field applications, size-biased distributions can arise either because individuals are sampled with unequal probability by design or because of unequal detection probability. They come into play when organisms occur in groups and group size influences the probability of detection. Fisher (1934) first introduced these distributions to model ascertainment biases, which were later formalized by Rao (1965) in a unifying theory for problems where the observations fall in non-experimental, non-replicated and non-random categories. Size-biased distributions have applications in environmental science, econometrics, social science, biomedical science, human demography, ecology, geology and forestry, among other fields. Further, size-biasing occurs in many unexpected contexts such as statistical estimation, renewal theory, infinite divisibility of distributions and number theory.
Van Deusen (1986) made a detailed study of the applications of size-biased distributions for fitting distributions of diameter at breast height (DBH) data arising from horizontal point sampling (HPS). Later, Lappi and Bailey (1987) applied size-biased distributions to analyze HPS diameter increment data. Applications of size-biased distributions to the analysis of data relating to human populations and ecology can be found in Patil and Rao (1977, 1978). Size-biased distributions and their applications in different fields have also been studied by many researchers, including Scheaffer (1972), Patil and Ord (1976), Singh and Maddala (1976), Patil (1981), McDonald (1984), Drummer and McDonald (1987), Gove (2000, 2003), Correa and Wolfson (2007), Ducey (2009), Alavi and Chinipardaz (2009), and Ducey and Gove (2015).
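Shanker (2015) introduced the one-parameter Akash distribution having probability density function (pdf) and cumulative distribution function (cdf)
$$f(x;\theta) = \frac{\theta^3}{\theta^2+2}\,(1+x^2)\,e^{-\theta x};\quad x>0,\ \theta>0 \qquad (1.2)$$
$$F(x;\theta) = 1-\left[1+\frac{\theta x(\theta x+2)}{\theta^2+2}\right]e^{-\theta x};\quad x>0,\ \theta>0 \qquad (1.3)$$
for modeling various lifetime data from biomedical science and engineering. Shanker (2015) has discussed its various interesting and important properties, the estimation of its parameter and its applications.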
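As a quick consistency check of (1.2) and (1.3), the pdf can be integrated numerically and compared with the cdf. This is a minimal sketch, assuming the Akash pdf and cdf take the forms $f(x;\theta)=\frac{\theta^3}{\theta^2+2}(1+x^2)e^{-\theta x}$ and $F(x;\theta)=1-\left[1+\frac{\theta x(\theta x+2)}{\theta^2+2}\right]e^{-\theta x}$; the function names, the trapezoidal helper and the value $\theta = 1.5$ are illustrative.

```python
import math

def akash_pdf(x, theta):
    """Akash density (assumed form): theta^3/(theta^2+2) * (1+x^2) * exp(-theta*x)."""
    return theta**3 / (theta**2 + 2) * (1 + x**2) * math.exp(-theta * x)

def akash_cdf(x, theta):
    """Akash distribution function (assumed form)."""
    return 1 - (1 + theta * x * (theta * x + 2) / (theta**2 + 2)) * math.exp(-theta * x)

def trapezoid(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

theta = 1.5                                   # illustrative parameter value
area = trapezoid(lambda t: akash_pdf(t, theta), 0.0, 2.0)
cdf_value = akash_cdf(2.0, theta)             # should agree with the integral above
```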
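Assuming the parameter of the Poisson distribution to follow the Akash distribution (1.2), Shanker (2017) introduced the Poisson-Akash distribution (PAD), a Poisson mixture of the Akash distribution, having pmf
$$P_0(x;\theta) = \frac{\theta^3}{\theta^2+2}\,\frac{x^2+3x+(\theta^2+2\theta+3)}{(\theta+1)^{x+3}};\quad x=0,1,2,\ldots,\ \theta>0 \qquad (1.4)$$
Various statistical properties of PAD, the estimation of its parameter and its applications to model count data have been studied by Shanker (2017), who observed that it gives a better fit than both the Poisson distribution and the Poisson-Lindley distribution, a Poisson mixture of the Lindley (1958) distribution introduced by Sankaran (1970). The first four moments about origin and the variance of PAD obtained by Shanker (2017) are given by
$$\mu_1' = \frac{\theta^2+6}{\theta(\theta^2+2)}, \qquad \mu_2' = \frac{\theta^3+2\theta^2+6\theta+24}{\theta^2(\theta^2+2)}$$
$$\mu_3' = \frac{\theta^4+6\theta^3+12\theta^2+72\theta+120}{\theta^3(\theta^2+2)}, \qquad \mu_4' = \frac{\theta^5+14\theta^4+42\theta^3+192\theta^2+720\theta+720}{\theta^4(\theta^2+2)}$$
$$\mu_2 = \frac{\theta^5+\theta^4+8\theta^3+16\theta^2+12\theta+12}{\theta^2(\theta^2+2)^2}$$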
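The PAD pmf and its first two moments can be checked by direct summation. The sketch below assumes the pmf (1.4) has the form $\frac{\theta^3}{\theta^2+2}\cdot\frac{x^2+3x+\theta^2+2\theta+3}{(\theta+1)^{x+3}}$ for $x = 0, 1, 2, \ldots$, with mean $\frac{\theta^2+6}{\theta(\theta^2+2)}$ and variance $\frac{\theta^5+\theta^4+8\theta^3+16\theta^2+12\theta+12}{\theta^2(\theta^2+2)^2}$; the names, $\theta = 1.2$ and the truncation at 400 are illustrative.

```python
def pad_pmf(x, theta):
    """Poisson-Akash pmf (assumed form of (1.4)) for x = 0, 1, 2, ..."""
    return (theta**3 / (theta**2 + 2)
            * (x**2 + 3 * x + theta**2 + 2 * theta + 3) / (theta + 1)**(x + 3))

theta = 1.2                      # illustrative parameter value
xs = range(0, 400)               # truncation; the neglected tail is negligible here

total = sum(pad_pmf(x, theta) for x in xs)
mean_numeric = sum(x * pad_pmf(x, theta) for x in xs)
var_numeric = sum(x**2 * pad_pmf(x, theta) for x in xs) - mean_numeric**2

# Closed forms under the stated assumptions
mean_closed = (theta**2 + 6) / (theta * (theta**2 + 2))
var_closed = ((theta**5 + theta**4 + 8 * theta**3 + 16 * theta**2 + 12 * theta + 12)
              / (theta**2 * (theta**2 + 2)**2))
```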

2. Size-Biased Poisson-Akash Distribution

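Using (1.1) and (1.4) and the expression for the mean of PAD, the size-biased Poisson-Akash distribution (SBPAD) with parameter $\theta$ can be defined by its pmf
$$P_1(x;\theta) = \frac{x\,P_0(x;\theta)}{\mu_1'} = \frac{\theta^4}{\theta^2+6}\,\frac{x\,(x^2+3x+\theta^2+2\theta+3)}{(\theta+1)^{x+3}};\quad x=1,2,3,\ldots,\ \theta>0 \qquad (2.1)$$
where $P_0(x;\theta)$ is the pmf (1.4) of PAD and $\mu_1' = (\theta^2+6)/\{\theta(\theta^2+2)\}$ is its mean. The pmf of SBPAD (2.1) can also be obtained from the size-biased Poisson distribution (SBPD) with pmf
$$g(x\mid\lambda) = \frac{e^{-\lambda}\lambda^{x-1}}{(x-1)!};\quad x=1,2,3,\ldots,\ \lambda>0 \qquad (2.2)$$
when its parameter $\lambda$ follows the size-biased Akash distribution (SBAD) with pdf
$$h(\lambda;\theta) = \frac{\theta^4}{\theta^2+6}\,\lambda\,(1+\lambda^2)\,e^{-\theta\lambda};\quad \lambda>0,\ \theta>0 \qquad (2.3)$$
Thus the pmf of SBPAD can be obtained as
$$P(X=x) = \int_0^\infty g(x\mid\lambda)\,h(\lambda;\theta)\,d\lambda = \frac{\theta^4}{(\theta^2+6)\,(x-1)!}\int_0^\infty e^{-(\theta+1)\lambda}\left(\lambda^{x}+\lambda^{x+2}\right)d\lambda$$
$$= \frac{\theta^4}{(\theta^2+6)\,(x-1)!}\left[\frac{x!}{(\theta+1)^{x+1}}+\frac{(x+2)!}{(\theta+1)^{x+3}}\right] = \frac{\theta^4}{\theta^2+6}\,\frac{x\,(x^2+3x+\theta^2+2\theta+3)}{(\theta+1)^{x+3}};\quad x=1,2,3,\ldots,\ \theta>0 \qquad (2.4)$$
which is the pmf of SBPAD obtained in (2.1).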
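The equivalence of the two constructions can be checked numerically. The sketch below assumes the forms $e^{-\lambda}\lambda^{x-1}/(x-1)!$ for the SBPD pmf (2.2), $\frac{\theta^4}{\theta^2+6}\lambda(1+\lambda^2)e^{-\theta\lambda}$ for the SBAD pdf (2.3) and $\frac{\theta^4}{\theta^2+6}\cdot\frac{x(x^2+3x+\theta^2+2\theta+3)}{(\theta+1)^{x+3}}$ for the SBPAD pmf (2.1). It integrates the mixture with a plain trapezoidal rule; the function names, the integration limit and $\theta = 0.8$ are illustrative.

```python
import math

def sbpd_pmf(x, lam):
    """Size-biased Poisson pmf (assumed form of (2.2)), x = 1, 2, ..."""
    return math.exp(-lam) * lam**(x - 1) / math.factorial(x - 1)

def sbad_pdf(lam, theta):
    """Size-biased Akash density (assumed form of (2.3))."""
    return theta**4 / (theta**2 + 6) * lam * (1 + lam**2) * math.exp(-theta * lam)

def sbpad_pmf(x, theta):
    """Closed-form SBPAD pmf (assumed form of (2.1)), x = 1, 2, ..."""
    return (theta**4 / (theta**2 + 6)
            * x * (x**2 + 3 * x + theta**2 + 2 * theta + 3) / (theta + 1)**(x + 3))

def mixture_pmf(x, theta, upper=60.0, n=60000):
    """Trapezoidal approximation of the mixing integral in (2.4) over [0, upper]."""
    h = upper / n
    g = lambda lam: sbpd_pmf(x, lam) * sbad_pdf(lam, theta)
    return h * (0.5 * g(0.0) + sum(g(i * h) for i in range(1, n)) + 0.5 * g(upper))

theta = 0.8   # illustrative parameter value
gaps = [abs(mixture_pmf(x, theta) - sbpad_pmf(x, theta)) for x in (1, 2, 5)]
total = sum(sbpad_pmf(x, theta) for x in range(1, 600))  # should be close to 1
```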
It can be easily verified that SBPAD is unimodal and has an increasing failure rate. Since
$$\frac{P_1(x+1;\theta)}{P_1(x;\theta)} = \frac{1}{\theta+1}\left(1+\frac{1}{x}\right)\left[1+\frac{2x+4}{x^2+3x+\theta^2+2\theta+3}\right]$$
is a decreasing function of $x$, $P_1(x;\theta)$ is log-concave. Therefore, SBPAD is unimodal, has an increasing failure rate (IFR), and hence an increasing failure rate average (IFRA). It is new better than used in expectation (NBUE) and has decreasing mean residual life (DMRL). The definitions, concepts and interrelationships between these aging concepts have been discussed in Barlow and Proschan (1981).
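The log-concavity argument rests on the ratio of successive probabilities being decreasing in $x$. A small numerical check follows, assuming the SBPAD pmf (2.1) has the form $\frac{\theta^4}{\theta^2+6}\cdot\frac{x(x^2+3x+\theta^2+2\theta+3)}{(\theta+1)^{x+3}}$, so that the ratio is $\frac{1}{\theta+1}\left(1+\frac{1}{x}\right)\left[1+\frac{2x+4}{x^2+3x+\theta^2+2\theta+3}\right]$; the names, $\theta = 1.0$ and the range of $x$ are illustrative, and a check at one parameter value is not a proof.

```python
def sbpad_pmf(x, theta):
    """SBPAD pmf (assumed form of (2.1)), x = 1, 2, ..."""
    return (theta**4 / (theta**2 + 6)
            * x * (x**2 + 3 * x + theta**2 + 2 * theta + 3) / (theta + 1)**(x + 3))

def ratio_closed(x, theta):
    """Closed-form ratio P(x+1)/P(x) under the assumed pmf."""
    q = x**2 + 3 * x + theta**2 + 2 * theta + 3
    return (1 / (theta + 1)) * (1 + 1 / x) * (1 + (2 * x + 4) / q)

theta = 1.0   # illustrative parameter value
ratios = [sbpad_pmf(x + 1, theta) / sbpad_pmf(x, theta) for x in range(1, 50)]

# Log-concavity on the checked range: each ratio is smaller than the previous one.
is_decreasing = all(a > b for a, b in zip(ratios, ratios[1:]))
closed_gap = max(abs(r - ratio_closed(x, theta)) for x, r in zip(range(1, 50), ratios))
```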
The graphs of the pmf of SBPAD (2.1) for varying values of the parameter have been drawn in figure 1. The graphs have been shown for both starting from and to see the difference in the nature. The graphs starting from are positively biased whereas graphs starting from are monotonically decreasing except for .
Figure 1. Graphs of pmf of SBPAD for varying values of the parameter θ
It may be recalled that the size-biased Poisson-Lindley distribution (SBPLD), with pmf
$$P(X=x) = \frac{\theta^3}{\theta+2}\,\frac{x\,(x+\theta+2)}{(\theta+1)^{x+2}};\quad x=1,2,3,\ldots,\ \theta>0 \qquad (2.5)$$
was introduced by Ghitany and Al-Mutairi (2008) as a size-biased version of the Poisson-Lindley distribution (PLD) introduced by Sankaran (1970). Ghitany and Al-Mutairi (2008) have discussed its various mathematical and statistical properties, the estimation of its parameter using maximum likelihood estimation and the method of moments, and its goodness of fit. Shanker et al. (2015) made a critical study of the applications of SBPLD for modeling data on thunderstorms and found that SBPLD is a better model for thunderstorms than the size-biased Poisson distribution (SBPD).

3. Moments

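Using (2.4), the $r$th factorial moment about origin of the SBPAD (2.1) can be obtained as
$$\mu_{(r)}' = E\left[E\left(X^{(r)}\mid\lambda\right)\right] = \int_0^\infty\left[\sum_{x=1}^{\infty} x^{(r)}\,\frac{e^{-\lambda}\lambda^{x-1}}{(x-1)!}\right]\frac{\theta^4}{\theta^2+6}\,\lambda\,(1+\lambda^2)\,e^{-\theta\lambda}\,d\lambda$$
$$= \int_0^\infty\left[\lambda^{r-1}\sum_{x=r}^{\infty} x\,\frac{e^{-\lambda}\lambda^{x-r}}{(x-r)!}\right]\frac{\theta^4}{\theta^2+6}\,\lambda\,(1+\lambda^2)\,e^{-\theta\lambda}\,d\lambda$$
where $X^{(r)} = X(X-1)(X-2)\cdots(X-r+1)$. Taking $x+r$ in place of $x$, we get
$$\mu_{(r)}' = \int_0^\infty\left[\lambda^{r-1}\sum_{x=0}^{\infty}(x+r)\,\frac{e^{-\lambda}\lambda^{x}}{x!}\right]\frac{\theta^4}{\theta^2+6}\,\lambda\,(1+\lambda^2)\,e^{-\theta\lambda}\,d\lambda = \int_0^\infty \lambda^{r-1}(\lambda+r)\,\frac{\theta^4}{\theta^2+6}\,\lambda\,(1+\lambda^2)\,e^{-\theta\lambda}\,d\lambda$$
Using the gamma integral and a little algebraic simplification, the $r$th factorial moment about origin of SBPAD (2.1) can be obtained as
$$\mu_{(r)}' = \frac{r!\left\{r\theta^3+(r+1)\theta^2+r(r+1)(r+2)\theta+(r+1)(r+2)(r+3)\right\}}{\theta^r(\theta^2+6)};\quad r=1,2,3,\ldots \qquad (3.1)$$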
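The closed form (3.1) can be compared with factorial moments computed by direct summation over the pmf. This is a minimal sketch, assuming (3.1) reads $\mu'_{(r)} = r!\{r\theta^3+(r+1)\theta^2+r(r+1)(r+2)\theta+(r+1)(r+2)(r+3)\}/\{\theta^r(\theta^2+6)\}$ and the pmf (2.1) is $\frac{\theta^4}{\theta^2+6}\cdot\frac{x(x^2+3x+\theta^2+2\theta+3)}{(\theta+1)^{x+3}}$; the function names, $\theta = 1.5$ and the truncation point are illustrative.

```python
import math

def sbpad_pmf(x, theta):
    """SBPAD pmf (assumed form of (2.1)), x = 1, 2, ..."""
    return (theta**4 / (theta**2 + 6)
            * x * (x**2 + 3 * x + theta**2 + 2 * theta + 3) / (theta + 1)**(x + 3))

def factorial_moment_closed(r, theta):
    """r-th factorial moment from the assumed closed form (3.1)."""
    poly = (r * theta**3 + (r + 1) * theta**2
            + r * (r + 1) * (r + 2) * theta + (r + 1) * (r + 2) * (r + 3))
    return math.factorial(r) * poly / (theta**r * (theta**2 + 6))

def factorial_moment_numeric(r, theta, x_max=400):
    """Direct sum of x(x-1)...(x-r+1) * P(x); math.perm(x, r) is 0 when r > x."""
    return sum(math.perm(x, r) * sbpad_pmf(x, theta) for x in range(1, x_max))

theta = 1.5   # illustrative parameter value
checks = [(factorial_moment_closed(r, theta), factorial_moment_numeric(r, theta))
          for r in (1, 2, 3, 4)]
```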
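Taking $r = 1, 2, 3, 4$ in (3.1), the first four factorial moments about origin can be obtained and, using the relationship between moments about origin and factorial moments about origin ($\mu_1' = \mu_{(1)}'$, $\mu_2' = \mu_{(2)}'+\mu_{(1)}'$, $\mu_3' = \mu_{(3)}'+3\mu_{(2)}'+\mu_{(1)}'$, $\mu_4' = \mu_{(4)}'+6\mu_{(3)}'+7\mu_{(2)}'+\mu_{(1)}'$), the first four moments about origin of the SBPAD (2.1) are thus obtained as
$$\mu_1' = \frac{\theta^3+2\theta^2+6\theta+24}{\theta(\theta^2+6)}, \qquad \mu_2' = \frac{\theta^4+6\theta^3+12\theta^2+72\theta+120}{\theta^2(\theta^2+6)}$$
$$\mu_3' = \frac{\theta^5+14\theta^4+42\theta^3+192\theta^2+720\theta+720}{\theta^3(\theta^2+6)}, \qquad \mu_4' = \frac{\theta^6+30\theta^5+156\theta^4+600\theta^3+3120\theta^2+7200\theta+5040}{\theta^4(\theta^2+6)}$$
Now, using the relationship between moments about mean and the moments about origin, the moments about mean of the SBPAD (2.1) can be obtained. The variance is
$$\mu_2 = \mu_2'-\left(\mu_1'\right)^2 = \frac{2\left(\theta^5+\theta^4+18\theta^3+30\theta^2+72\theta+72\right)}{\theta^2(\theta^2+6)^2}$$
and the third and fourth central moments follow from $\mu_3 = \mu_3'-3\mu_2'\mu_1'+2\left(\mu_1'\right)^3$ and $\mu_4 = \mu_4'-4\mu_3'\mu_1'+6\mu_2'\left(\mu_1'\right)^2-3\left(\mu_1'\right)^4$.
The coefficient of variation (C.V.), coefficient of skewness $(\sqrt{\beta_1})$, coefficient of kurtosis $(\beta_2)$ and index of dispersion $(\gamma)$ of the SBPAD (2.1) are thus obtained as
$$\text{C.V.} = \frac{\sigma}{\mu_1'} = \frac{\sqrt{2\left(\theta^5+\theta^4+18\theta^3+30\theta^2+72\theta+72\right)}}{\theta^3+2\theta^2+6\theta+24}, \qquad \sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}}, \qquad \beta_2 = \frac{\mu_4}{\mu_2^{2}}$$
$$\gamma = \frac{\sigma^2}{\mu_1'} = \frac{2\left(\theta^5+\theta^4+18\theta^3+30\theta^2+72\theta+72\right)}{\theta(\theta^2+6)\left(\theta^3+2\theta^2+6\theta+24\right)}$$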
The graphs of the coefficient of variation (C.V.), coefficient of skewness $(\sqrt{\beta_1})$, coefficient of kurtosis $(\beta_2)$ and index of dispersion $(\gamma)$ of the SBPAD are shown in Figure 2. From Figure 2, it can be seen that the C.V. and the index of dispersion are monotonically decreasing, whereas the coefficient of skewness and the coefficient of kurtosis are monotonically increasing, for increasing values of the parameter $\theta$.
Figure 2. Graphs of C.V, coefficient of Skewness, coefficient of Kurtosis and index of dispersion of the SBPAD for varying values of the parameter θ
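It can be easily verified that SBPAD is over-dispersed $(\mu < \sigma^2)$, equi-dispersed $(\mu = \sigma^2)$ and under-dispersed $(\mu > \sigma^2)$ according as $\theta$ is less than, equal to or greater than a critical value $\theta^*$. It should be noted that SBPLD is likewise over-dispersed, equi-dispersed and under-dispersed according as $\theta$ is less than, equal to or greater than its own critical value.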
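The moment-based measures and the dispersion boundary can be evaluated numerically. The sketch below assumes the factorial-moment formula (3.1) in the form $\mu'_{(r)} = r!\{r\theta^3+(r+1)\theta^2+r(r+1)(r+2)\theta+(r+1)(r+2)(r+3)\}/\{\theta^r(\theta^2+6)\}$, converts to raw and central moments with the standard relations, and locates the value of $\theta$ at which the index of dispersion equals one by bisection. The function names and the bracketing interval $[0.5, 5]$ are illustrative assumptions.

```python
import math

def factorial_moment(r, theta):
    """r-th factorial moment of SBPAD from the assumed closed form (3.1)."""
    poly = (r * theta**3 + (r + 1) * theta**2
            + r * (r + 1) * (r + 2) * theta + (r + 1) * (r + 2) * (r + 3))
    return math.factorial(r) * poly / (theta**r * (theta**2 + 6))

def measures(theta):
    """Mean, variance, C.V., skewness, kurtosis and index of dispersion."""
    f1, f2, f3, f4 = (factorial_moment(r, theta) for r in (1, 2, 3, 4))
    m1, m2 = f1, f2 + f1                       # raw moments from factorial moments
    m3, m4 = f3 + 3 * f2 + f1, f4 + 6 * f3 + 7 * f2 + f1
    mu2 = m2 - m1**2                           # central moments from raw moments
    mu3 = m3 - 3 * m2 * m1 + 2 * m1**3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1**2 - 3 * m1**4
    return {"mean": m1, "variance": mu2, "cv": math.sqrt(mu2) / m1,
            "skewness": mu3 / mu2**1.5, "kurtosis": mu4 / mu2**2,
            "dispersion": mu2 / m1}

def dispersion_threshold(lo=0.5, hi=5.0, tol=1e-10):
    """Bisection for the theta at which variance / mean = 1 (bracket is assumed)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if measures(mid)["dispersion"] > 1.0:  # still over-dispersed: move right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

theta_star = dispersion_threshold()
```

Values of $\theta$ below `theta_star` give a variance larger than the mean under these assumptions, and values above it give a variance smaller than the mean.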

4. Estimation of Parameter

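4.1. Method of Moment Estimate (MOME): Equating the population mean to the corresponding sample mean, the method of moment estimate (MOME) $\tilde{\theta}$ of $\theta$ of SBPAD (2.1) is the solution of the following cubic equation in $\theta$:
$$(\bar{x}-1)\,\theta^3-2\,\theta^2+6\,(\bar{x}-1)\,\theta-24 = 0;\quad \bar{x}>1,$$ where $\bar{x}$ is the sample mean.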
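The moment equation can be solved numerically. The following is a minimal sketch, assuming the SBPAD mean is $\frac{\theta^3+2\theta^2+6\theta+24}{\theta(\theta^2+6)}$, so that the moment equation is the cubic $(\bar{x}-1)\theta^3-2\theta^2+6(\bar{x}-1)\theta-24=0$. Bisection is used here for simplicity; the function names, the bracket and the round-trip value $\theta = 1.3$ are illustrative.

```python
def sbpad_mean(theta):
    """Mean of SBPAD under the assumed closed form."""
    return (theta**3 + 2 * theta**2 + 6 * theta + 24) / (theta * (theta**2 + 6))

def mome(xbar, lo=1e-6, hi=1e6, iterations=200):
    """Method-of-moments estimate: positive root of the cubic in theta."""
    if xbar <= 1:
        raise ValueError("the sample mean must exceed 1 for SBPAD data")

    def cubic(t):
        return (xbar - 1) * t**3 - 2 * t**2 + 6 * (xbar - 1) * t - 24

    # cubic(lo) < 0 near zero and cubic(hi) > 0 for large theta, so bisect.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cubic(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip: the mean implied by theta = 1.3 should map back to theta = 1.3.
theta_true = 1.3
theta_tilde = mome(sbpad_mean(theta_true))
```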
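4.2. Maximum Likelihood Estimate (MLE): Let $x_1, x_2, \ldots, x_n$ be a random sample of size $n$ from the SBPAD (2.1) and let $f_x$ be the observed frequency in the sample corresponding to $X = x$ $(x = 1, 2, \ldots, k)$ such that $\sum_{x=1}^{k} f_x = n$, where $k$ is the largest observed value having non-zero frequency. The likelihood function $L$ of the SBPAD (2.1) is given by
$$L = \left(\frac{\theta^4}{\theta^2+6}\right)^{n}\frac{1}{(\theta+1)^{\sum_{x=1}^{k} f_x (x+3)}}\prod_{x=1}^{k}\left[x\left(x^2+3x+\theta^2+2\theta+3\right)\right]^{f_x}$$
The log likelihood function can be obtained as
$$\log L = n\log\left(\frac{\theta^4}{\theta^2+6}\right)-\sum_{x=1}^{k} f_x (x+3)\log(\theta+1)+\sum_{x=1}^{k} f_x\log\left[x\left(x^2+3x+\theta^2+2\theta+3\right)\right]$$
The first derivative of the log likelihood function is thus given by
$$\frac{d\log L}{d\theta} = \frac{4n}{\theta}-\frac{2n\theta}{\theta^2+6}-\frac{n(\bar{x}+3)}{\theta+1}+\sum_{x=1}^{k}\frac{2(\theta+1)\,f_x}{x^2+3x+\theta^2+2\theta+3}$$
where $\bar{x}$ is the sample mean.
The maximum likelihood estimate (MLE) $\hat{\theta}$ of $\theta$ of SBPAD (2.1) is the solution of the equation $\frac{d\log L}{d\theta} = 0$ and is given by the solution of the following non-linear equation
$$\frac{4n}{\theta}-\frac{2n\theta}{\theta^2+6}-\frac{n(\bar{x}+3)}{\theta+1}+\sum_{x=1}^{k}\frac{2(\theta+1)\,f_x}{x^2+3x+\theta^2+2\theta+3} = 0$$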
This non-linear equation can be solved by any numerical iteration method such as the Newton-Raphson method, the bisection method or the regula falsi method. In the present paper, the Newton-Raphson method has been used to solve the above non-linear equation and find the MLE of the parameter. Note that the MLE of the parameter θ obtained in this way is a local solution; this does not matter much because the accuracy of the estimate has been considered.
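A Newton-Raphson iteration for the likelihood equation might look as follows. This is a sketch only: it assumes the score function has the form $\frac{4n}{\theta}-\frac{2n\theta}{\theta^2+6}-\frac{n(\bar{x}+3)}{\theta+1}+\sum_x \frac{2(\theta+1)f_x}{x^2+3x+\theta^2+2\theta+3}$, the frequency table `freq` is hypothetical (it is not one of the paper's datasets), and the starting value, tolerance and function names are arbitrary choices.

```python
def score(theta, freq):
    """First derivative of the SBPAD log likelihood (assumed form)."""
    n = sum(freq.values())
    xbar = sum(x * f for x, f in freq.items()) / n
    tail = sum(2 * (theta + 1) * f / (x**2 + 3 * x + theta**2 + 2 * theta + 3)
               for x, f in freq.items())
    return (4 * n / theta - 2 * n * theta / (theta**2 + 6)
            - n * (xbar + 3) / (theta + 1) + tail)

def score_derivative(theta, freq):
    """Second derivative of the log likelihood, obtained by differentiating score()."""
    n = sum(freq.values())
    xbar = sum(x * f for x, f in freq.items()) / n
    tail = 0.0
    for x, f in freq.items():
        q = x**2 + 3 * x + theta**2 + 2 * theta + 3
        tail += f * (2 * q - 4 * (theta + 1)**2) / q**2
    return (-4 * n / theta**2 - 2 * n * (6 - theta**2) / (theta**2 + 6)**2
            + n * (xbar + 3) / (theta + 1)**2 + tail)

def mle_newton(freq, theta0=1.0, tol=1e-10, max_iter=100):
    """Newton-Raphson iteration theta <- theta - score / score'."""
    theta = theta0
    for _ in range(max_iter):
        new_theta = theta - score(theta, freq) / score_derivative(theta, freq)
        if new_theta <= 0:            # safeguard: keep the iterate in (0, inf)
            new_theta = theta / 2
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    raise RuntimeError("Newton-Raphson did not converge")

# Hypothetical frequency table {x: f_x}, for illustration only.
freq = {1: 120, 2: 70, 3: 30, 4: 12, 5: 5, 6: 2}
theta_hat = mle_newton(freq)
```

Because Newton-Raphson finds a local root of the score equation, the result can depend on the starting value; starting from the method-of-moments estimate is one common choice.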

5. Goodness of Fit

In this section the goodness of fit of SBPAD, SBPLD and SBPD is presented for two count datasets. The fits of these distributions are based on the maximum likelihood estimates of the parameter. The first dataset is the immunogold assay data of Cullen et al. (1990) on the number of counts of sites with particles; the second dataset is the animal abundance data of Keith and Meslow (1968) on the distribution of snowshoe hares captured over 7 days.
The fitted plots of the SBPD, SBPLD and SBPAD for the datasets in Tables 1 and 2 are shown in Figure 3. From the fitted plots of the distributions, it can also be seen that SBPAD is closer to the observed values than SBPD and SBPLD, and hence SBPAD has an advantage over these distributions for modeling count data excluding zero counts.
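Expected frequencies and a Pearson chi-square statistic for a fitted SBPAD can be computed along the following lines. This is a sketch under stated assumptions: the pmf (2.1) is taken as $\frac{\theta^4}{\theta^2+6}\cdot\frac{x(x^2+3x+\theta^2+2\theta+3)}{(\theta+1)^{x+3}}$, the observed counts and the parameter value are hypothetical (they are not the data of Tables 1 and 2), and the last class is treated as an open upper class.

```python
def sbpad_pmf(x, theta):
    """SBPAD pmf (assumed form of (2.1)), x = 1, 2, ..."""
    return (theta**4 / (theta**2 + 6)
            * x * (x**2 + 3 * x + theta**2 + 2 * theta + 3) / (theta + 1)**(x + 3))

def expected_frequencies(observed, theta):
    """Expected counts for classes x = 1, ..., k-1 and a pooled class 'k or more'."""
    n = sum(observed)
    k = len(observed)
    probs = [sbpad_pmf(x, theta) for x in range(1, k)]
    probs.append(1.0 - sum(probs))        # probability of the pooled upper tail
    return [n * p for p in probs]

def pearson_chi_square(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over the classes."""
    return sum((o - e)**2 / e for o, e in zip(observed, expected))

observed = [120, 70, 30, 12, 7]   # hypothetical counts for x = 1, 2, 3, 4 and 5+
theta_hat = 3.3                   # hypothetical fitted parameter value
expected = expected_frequencies(observed, theta_hat)
chi2 = pearson_chi_square(observed, expected)
```

In practice, classes with small expected counts are usually pooled before the statistic is compared with a chi-square critical value.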
Table 1. Distribution of number of counts of sites with particles from Immunogold data
     
Table 2. Distribution of snowshoe hares captured over 7 days
     
Figure 3. Fitted plots of the SBPD, SBPLD and SBPAD for datasets in table 1 and 2

6. Concluding Remarks

In the present paper a size-biased Poisson-Akash distribution (SBPAD), a simple size-biased version of the Poisson-Akash distribution (PAD) of Shanker (2017), has been proposed and studied. Its raw moments and central moments have been obtained, and hence expressions for the coefficient of variation (C.V.), skewness, kurtosis and index of dispersion have been presented and their behavior has been discussed graphically. The estimation of its parameter has been discussed using the method of moments and the method of maximum likelihood estimation. The goodness of fit of the SBPAD has been compared with that of SBPD and SBPLD on two observed real datasets, and SBPAD gives a quite satisfactory fit. Therefore, SBPAD can be considered an important distribution, in preference to SBPD and SBPLD, for modeling count data excluding zero counts.

ACKNOWLEDGEMENTS

The author is grateful to the Editor-In-Chief of the journal and the anonymous reviewer for their constructive suggestions which improved the presentation and the quality of the paper.

References

[1]  Alavi, S.M.R. and Chinipardaz, R. (2009): Form-invariance under weighted sampling, Statistics, 43, 81-90.
[2]  Barlow, R.E. and Proschan, F. (1981): Statistical Theory of Reliability and Life Testing, Silver Spring, MD.
[3]  Cullen, M.J., Walsh, J., Nicholson, L.V., and Harris, J.B. (1990): Ultrastructural localization of dystrophin in human muscle by using gold immunolabelling, Proceedings of the Royal Society of London, 20, 197-210.
[4]  Correa, J.A. and Wolfson, D.B. (2007): Length-bias: some Characterizations and applications, Journal of Statistical Computation and Simulation, 64, 209-219.
[5]  Drummer, T.D. and McDonald, L.L. (1987): Size bias in line transect sampling, Biometrics, 43, 13-21.
[6]  Ducey, M.J. (2009): Sampling trees with probability nearly proportional to biomass, For. Ecol. Manage., 258, 2110-2116.
[7]  Ducey, M.J. and Gove, J.H. (2015): Size-biased distributions in the generalized beta distribution family, with applications to forestry, Forestry- An International Journal of Forest Research. 88, 143-151.
[8]  Fisher, R.A. (1934): The effects of methods of ascertainment upon the estimation of frequencies, Ann. Eugenics, 6, 13-25.
[9]  Ghitany, M.E. and Al-Mutairi, D.K. (2008): Size-biased Poisson-Lindley distribution and Its Applications, Metron - International Journal of Statistics, LXVI (3), 299-311.
[10]  Gove, J.H. (2000): Some observations on fitting assumed diameter distributions to horizontal point sampling data, Can. J. For. Res., 30, 521-533.
[11]  Gove, J.H. (2003): Estimation and applications of size-biased distributions in forestry. In Modeling Forest Systems. A Amaro, D. Reed and P. Soares (Eds), CABI Publishing, pp. 201-212.
[12]  Keith, L.B. and Meslow, E.C. (1968): Trap Response by snowshoe hares, Journal of Wildlife Management, 20, 795-801.
[13]  Sankaran, M. (1970): The discrete Poisson-Lindley distribution, Biometrics, 26, 145-149.
[14]  Lappi, J. and Bailey, R.L. (1987): Estimation of diameter increment function or other tree relations using angle-count samples, Forest Science, 33, 725-739.
[15]  Lindley, D.V. (1958): Fiducial distributions and Bayes theorem, Journal of the Royal Statistical Society, 20 (1), 102-107.
[16]  McDonald, J.B. (1984): Some generalized functions for the size distribution of income, Econometrica, 52, 647-664.
[17]  Patil, G.P. (1981): Studies in statistical ecology involving weighted distributions. In Applications and New Directions, J.K. Ghosh and J. Roy (eds). Proceedings of Indian Statistical Institute Golden Jubilee, Statistical Publishing Society, pp. 478-503.
[18]  Patil, G.P. and Ord, J.K. (1976): On size-biased sampling and related form-invariant weighted distributions, Sankhya Ser. B, 38, 48-61.
[19]  Patil, G.P. and Rao, C.R. (1977): The Weighted distributions: A survey and their applications. In Applications of Statistics (Ed. P.R. Krishnaiah), 383-405, North Holland Publications Co., Amsterdam.
[20]  Patil, G.P. and Rao, C.R. (1978): Weighted distributions and size-biased sampling with applications to wild-life populations and human families, Biometrics, 34, 179-189.
[21]  Rao, C.R. (1965): On discrete distributions arising out of methods of ascertainment In: Patil, G.P.(eds) Classical and Contagious Discrete Distributions. Statistical Publishing Society, Calcutta, 320-332.
[22]  Scheaffer, R.L. (1972): Size-biased sampling, Technometrics, 14, 635-644.
[23]  Singh, S.K. and Maddala, G.S. (1976): A function for the size distribution of incomes, Econometrica, 44, 963-970.
[24]  Shanker, R. (2015): Akash distribution and its applications, International Journal of Probability and Statistics, 4(3), 65-75.
[25]  Shanker, R. (2017): The discrete Poisson-Akash distribution, International Journal of Probability and Statistics, 6(1), 1-10.
[26]  Shanker, R., Hagos, F. and Abrehe, Y. (2015): On Size-Biased Poisson-Lindley Distribution and Its Applications to Model Thunderstorms, American Journal of Mathematics and Statistics, 5 (6), 354-360.
[27]  Van Deusen, P.C. (1986): Fitting assumed distributions to horizontal point sample diameters, For. Sci., 32, 146-148.