American Journal of Mathematics and Statistics

p-ISSN: 2162-948X    e-ISSN: 2162-8475

2016;  6(4): 145-154

doi:10.5923/j.ajms.20160604.02

 

Size-Biased Poisson-Sujatha Distribution with Applications

Rama Shanker¹, Hagos Fesshaye²

¹Department of Statistics, Eritrea Institute of Technology, Asmara, Eritrea

²Department of Economics, College of Business and Economics, Halhale, Eritrea

Correspondence to: Rama Shanker, Department of Statistics, Eritrea Institute of Technology, Asmara, Eritrea.

Copyright © 2016 Scientific & Academic Publishing. All Rights Reserved.

This work is licensed under the Creative Commons Attribution International License (CC BY).
http://creativecommons.org/licenses/by/4.0/

Abstract

A size-biased Poisson-Sujatha distribution (SBPSD) has been proposed by size-biasing the Poisson-Sujatha distribution (PSD) of Shanker (2016 b), a Poisson mixture of Sujatha distribution introduced by Shanker (2016 a). The first four moments about origin and moments about mean have been obtained and hence expressions for coefficient of variation (C.V.), skewness, kurtosis and index of dispersion have been given. The estimation of its parameter has been discussed using maximum likelihood estimation and method of moments. Three examples of real data-sets have been presented to test the goodness of fit of SBPSD over size-biased Poisson distribution (SBPD) and size-biased Poisson-Lindley distribution (SBPLD).

Keywords: Sujatha distribution, Poisson-Sujatha distribution, Size-Biasing, Moments, Estimation of parameter, Goodness of fit

Cite this paper: Rama Shanker, Hagos Fesshaye, Size-Biased Poisson-Sujatha Distribution with Applications, American Journal of Mathematics and Statistics, Vol. 6 No. 4, 2016, pp. 145-154. doi: 10.5923/j.ajms.20160604.02.

1. Introduction

Size-biased distributions are a particular case of weighted distributions, which arise naturally in practice when observations from a sample are recorded with probability proportional to some measure of unit size. In field applications, size-biased distributions can arise either because individuals are sampled with unequal probability by design or because of unequal detection probability. They come into play when organisms occur in groups and group size influences the probability of detection. Fisher (1934) first introduced these distributions to model ascertainment bias; they were later formalized by Rao (1965) in a unifying theory for problems where the observations fall in non-experimental, non-replicated and non-random categories. Size-biased distributions have applications in environmental science, econometrics, social science, biomedical science, human demography, ecology, geology, forestry, etc. Van Deusen (1986) gave a detailed study of the application of size-biased distributions to fitting distributions of diameter at breast height (DBH) data arising from horizontal point sampling (HPS). Later, Lappi and Bailey (1987) applied size-biased distributions to analyze HPS diameter increment data. Applications of size-biased distributions to the analysis of data relating to human populations and ecology can be found in Patil and Rao (1977, 1978). Size-biased distributions and their applications in different fields of knowledge have been studied by many researchers, including Scheaffer (1972), Patil and Ord (1976), Singh and Maddala (1976), Patil (1981), McDonald (1984), Gove (2000, 2003), Correa and Wolfson (2007), Drummer and McDonald (1987), Ducey (2009), Alavi and Chinipardaz (2009), Mir and Ahmad (2009), and Ducey and Gove (2015), among others.
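Let a random variable $X$ have probability distribution $P_0(x;\theta)$; $x = 0, 1, 2, \ldots$; $\theta > 0$. If sample units are weighted or selected with probability proportional to $x^{\alpha}$, then the corresponding size-biased distribution of order $\alpha$ is given by its probability mass function
$$P(x;\theta) = \frac{x^{\alpha}\,P_0(x;\theta)}{\mu'_{\alpha}}$$
where $\mu'_{\alpha} = E(X^{\alpha}) = \sum_{x=0}^{\infty} x^{\alpha} P_0(x;\theta)$. When $\alpha = 1$, the distribution is known as the simple size-biased distribution and is applicable for size-biased sampling; for $\alpha = 2$, the distribution is known as the area-biased distribution and is applicable for area-biased sampling.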
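The definition above can be illustrated with a short Python sketch that size-biases an arbitrary p.m.f. on a truncated support. The helper names `size_biased_pmf` and `poisson_pmf` and the truncation point are illustrative assumptions, not part of the paper.

```python
import math

def poisson_pmf(lam):
    """Return the Poisson(lam) p.m.f. as a function of x = 0, 1, 2, ..."""
    return lambda x: math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

def size_biased_pmf(pmf, alpha=1, support=range(0, 100)):
    """Size-biased p.m.f. of order alpha: x**alpha * pmf(x) / E[X**alpha].

    The support is truncated, so the normalizing sum only approximates
    E[X**alpha]; the truncation point must be large enough for the tail
    to be negligible.
    """
    weights = {x: (x ** alpha) * pmf(x) for x in support}
    norm = sum(weights.values())
    return {x: w / norm for x, w in weights.items()}

# Simple size-biasing (alpha = 1) of Poisson(2): the result is Poisson(2) shifted by one.
sb = size_biased_pmf(poisson_pmf(2.0), alpha=1)
print(sb[1], math.exp(-2.0))
```

For the Poisson case the size-biased variable minus one is again Poisson with the same parameter, which is why the two printed numbers should agree up to rounding.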
The Poisson-Sujatha distribution (PSD) having probability mass function (p.m.f.)
$$P_0(x;\theta) = \frac{\theta^3}{\theta^2+\theta+2}\cdot\frac{x^2+(\theta+4)x+(\theta^2+3\theta+4)}{(\theta+1)^{x+3}}; \quad x = 0, 1, 2, \ldots,\ \theta > 0 \qquad (1.1)$$
has been introduced by Shanker (2016 b) for modeling count data in various fields of knowledge. Its properties, the estimation of its parameter and its applications have been discussed in detail by Shanker (2016 b), who showed that it gives a better fit than the Poisson and Poisson-Lindley distributions. Shanker and Hagos (2016) made a detailed study of the applications of PSD for modeling count data-sets from ecology, genetics and other areas of biological sciences and concluded that in most of the data-sets it gives a better fit than the Poisson and Poisson-Lindley distributions.
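The PSD arises from the Poisson distribution when its parameter $\lambda$ follows the Sujatha distribution introduced by Shanker (2016 a) with probability density function (p.d.f.)
$$f(\lambda;\theta) = \frac{\theta^3}{\theta^2+\theta+2}\,(1+\lambda+\lambda^2)\,e^{-\theta\lambda}; \quad \lambda > 0,\ \theta > 0 \qquad (1.2)$$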
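The p.m.f. of the size-biased Poisson-Sujatha distribution (SBPSD) with parameter $\theta$ can be obtained as
$$P_1(x;\theta) = \frac{x\,P_0(x;\theta)}{\mu'_1} = \frac{\theta^4}{\theta^2+2\theta+6}\cdot\frac{x\left[x^2+(\theta+4)x+(\theta^2+3\theta+4)\right]}{(\theta+1)^{x+3}}; \quad x = 1, 2, 3, \ldots,\ \theta > 0 \qquad (1.3)$$
where $P_0(x;\theta)$ denotes the p.m.f. (1.1) of the PSD and $\mu'_1 = \dfrac{\theta^2+2\theta+6}{\theta(\theta^2+\theta+2)}$ is the mean of the PSD with p.m.f. (1.1).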
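A minimal Python sketch of the two p.m.f.s follows. It assumes the closed forms obtained by mixing the Poisson distribution with the Sujatha distribution and then size-biasing; the function names are illustrative. It checks numerically that the SBPSD probabilities sum to one and agree with $x\,P_0(x;\theta)/\mu'_1$.

```python
import math

def psd_pmf(x, theta):
    """Assumed Poisson-Sujatha p.m.f. for x = 0, 1, 2, ... and theta > 0."""
    c = theta ** 3 / (theta ** 2 + theta + 2)
    q = x ** 2 + (theta + 4) * x + (theta ** 2 + 3 * theta + 4)
    # exp/log1p form avoids overflow of (theta + 1) ** (x + 3) for large x
    return c * q * math.exp(-(x + 3) * math.log1p(theta))

def psd_mean(theta):
    """Mean of the PSD."""
    return (theta ** 2 + 2 * theta + 6) / (theta * (theta ** 2 + theta + 2))

def sbpsd_pmf(x, theta):
    """Assumed size-biased Poisson-Sujatha p.m.f. for x = 1, 2, 3, ..."""
    c = theta ** 4 / (theta ** 2 + 2 * theta + 6)
    q = x ** 2 + (theta + 4) * x + (theta ** 2 + 3 * theta + 4)
    return c * x * q * math.exp(-(x + 3) * math.log1p(theta))

theta = 1.5
# Total probability over a truncated support (the tail beyond 400 is negligible here).
total = sum(sbpsd_pmf(x, theta) for x in range(1, 400))
# Largest discrepancy between the closed form and x * P0(x) / mean.
max_gap = max(abs(sbpsd_pmf(x, theta) - x * psd_pmf(x, theta) / psd_mean(theta))
              for x in range(1, 50))
print(total, max_gap)
```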
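The p.m.f. of SBPSD can also be obtained from the size-biased Poisson distribution (SBPD) with p.m.f.
$$g(x \mid \lambda) = \frac{e^{-\lambda}\lambda^{x-1}}{(x-1)!}; \quad x = 1, 2, 3, \ldots,\ \lambda > 0 \qquad (1.4)$$
when its parameter $\lambda$ follows the size-biased Sujatha distribution (SBSD) with p.d.f.
$$h(\lambda;\theta) = \frac{\theta^4}{\theta^2+2\theta+6}\,\lambda(1+\lambda+\lambda^2)\,e^{-\theta\lambda}; \quad \lambda > 0,\ \theta > 0 \qquad (1.5)$$
We have
$$P(X = x) = \int_0^{\infty} g(x \mid \lambda)\,h(\lambda;\theta)\,d\lambda = \int_0^{\infty} \frac{e^{-\lambda}\lambda^{x-1}}{(x-1)!}\cdot\frac{\theta^4}{\theta^2+2\theta+6}\,\lambda(1+\lambda+\lambda^2)\,e^{-\theta\lambda}\,d\lambda \qquad (1.6)$$
$$= \frac{\theta^4}{(\theta^2+2\theta+6)\,(x-1)!}\int_0^{\infty} e^{-(\theta+1)\lambda}\left(\lambda^{x}+\lambda^{x+1}+\lambda^{x+2}\right)d\lambda = \frac{\theta^4}{\theta^2+2\theta+6}\cdot\frac{x\left[x^2+(\theta+4)x+(\theta^2+3\theta+4)\right]}{(\theta+1)^{x+3}}; \quad x = 1, 2, 3, \ldots,\ \theta > 0$$
which is the p.m.f. of SBPSD.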
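The mixture representation can be checked numerically. The sketch below assumes the SBPD p.m.f. $e^{-\lambda}\lambda^{x-1}/(x-1)!$, the SBSD p.d.f. proportional to $\lambda(1+\lambda+\lambda^2)e^{-\theta\lambda}$ and the closed-form SBPSD p.m.f.; the composite Simpson rule, the upper integration limit and all function names are illustrative choices.

```python
import math

def sbpsd_pmf(x, theta):
    """Assumed closed-form SBPSD p.m.f. for x = 1, 2, 3, ..."""
    c = theta ** 4 / (theta ** 2 + 2 * theta + 6)
    q = x ** 2 + (theta + 4) * x + (theta ** 2 + 3 * theta + 4)
    return c * x * q * math.exp(-(x + 3) * math.log1p(theta))

def sb_poisson_pmf(x, lam):
    """Size-biased Poisson p.m.f. for x = 1, 2, 3, ..."""
    return math.exp(-lam) * lam ** (x - 1) / math.factorial(x - 1)

def sb_sujatha_pdf(lam, theta):
    """Size-biased Sujatha p.d.f. for lam > 0."""
    c = theta ** 4 / (theta ** 2 + 2 * theta + 6)
    return c * lam * (1 + lam + lam ** 2) * math.exp(-theta * lam)

def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

def mixture_pmf(x, theta, upper=60.0):
    """Integrate SBPD(x | lam) against the SBSD density over (0, upper)."""
    return simpson(lambda lam: sb_poisson_pmf(x, lam) * sb_sujatha_pdf(lam, theta),
                   0.0, upper)

for x in range(1, 6):
    print(x, mixture_pmf(x, 1.5), sbpsd_pmf(x, 1.5))
```

The two printed columns should agree to within the quadrature error, which supports the equivalence of the two routes to the SBPSD.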
Recall that the size-biased Poisson-Lindley distribution (SBPLD), with p.m.f.
$$P_2(x;\theta) = \frac{\theta^3}{\theta+2}\cdot\frac{x\,(x+\theta+2)}{(\theta+1)^{x+2}}; \quad x = 1, 2, 3, \ldots,\ \theta > 0 \qquad (1.7)$$
was introduced by Ghitany and Al-Mutairi (2008) as a size-biased version of the Poisson-Lindley distribution of Sankaran (1970). Ghitany and Al-Mutairi (2008) discussed its mathematical and statistical properties, estimation of the parameter using maximum likelihood estimation and the method of moments, and goodness of fit. Shanker et al. (2015) made a detailed study of the applications of SBPLD for modeling data on thunderstorms and observed that in most data-sets SBPLD gives a better fit than the size-biased Poisson distribution (SBPD).
The graphs of the probability mass functions of SBPSD and SBPLD for selected values of the parameter θ are shown in Figure 1.
Figure 1. Graphs of pmf of SBPSD and SBPLD for selected values of the parameter θ

2. Moments

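Using (1.6), the $r$th factorial moment about origin of the SBPSD (1.3) can be obtained as
$$\mu'_{(r)} = E\left[E\left(X^{(r)} \mid \lambda\right)\right], \quad \text{where } X^{(r)} = X(X-1)(X-2)\cdots(X-r+1)$$
$$= \int_0^{\infty}\left[\sum_{x=1}^{\infty} x^{(r)}\,\frac{e^{-\lambda}\lambda^{x-1}}{(x-1)!}\right]\frac{\theta^4}{\theta^2+2\theta+6}\,\lambda(1+\lambda+\lambda^2)\,e^{-\theta\lambda}\,d\lambda$$
Taking $x + r$ in place of $x$, we get
$$\mu'_{(r)} = \int_0^{\infty}\lambda^{r-1}\left[\sum_{x=0}^{\infty}(x+r)\,\frac{e^{-\lambda}\lambda^{x}}{x!}\right]\frac{\theta^4}{\theta^2+2\theta+6}\,\lambda(1+\lambda+\lambda^2)\,e^{-\theta\lambda}\,d\lambda = \frac{\theta^4}{\theta^2+2\theta+6}\int_0^{\infty}\lambda^{r}(\lambda+r)(1+\lambda+\lambda^2)\,e^{-\theta\lambda}\,d\lambda$$
Using the gamma integral and a little algebraic simplification, the $r$th factorial moment about origin of SBPSD (1.3) can be obtained as
$$\mu'_{(r)} = \frac{r!\left[r\theta^3+(r+1)^2\theta^2+(r+1)^2(r+2)\theta+(r+1)(r+2)(r+3)\right]}{\theta^{r}\,(\theta^2+2\theta+6)}; \quad r = 1, 2, 3, \ldots \qquad (2.1)$$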
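Taking $r = 1, 2, 3$ and $4$ in (2.1), the first four factorial moments about origin can be obtained and, using the relationship between moments about origin and factorial moments about origin, the first four moments about origin of the SBPSD (1.3) are thus obtained as
$$\mu'_1 = \frac{\theta^3+4\theta^2+12\theta+24}{\theta\,(\theta^2+2\theta+6)}$$
$$\mu'_2 = \frac{\theta^4+8\theta^3+30\theta^2+96\theta+120}{\theta^2(\theta^2+2\theta+6)}$$
$$\mu'_3 = \frac{\theta^5+16\theta^4+84\theta^3+336\theta^2+840\theta+720}{\theta^3(\theta^2+2\theta+6)}$$
$$\mu'_4 = \frac{\theta^6+32\theta^5+246\theta^4+1200\theta^3+4320\theta^2+7920\theta+5040}{\theta^4(\theta^2+2\theta+6)}$$
Now, using the relationship between moments about mean and the moments about origin, the variance of the SBPSD (1.3) is obtained as
$$\mu_2 = \mu'_2 - (\mu'_1)^2 = \frac{2\left(\theta^5+6\theta^4+30\theta^3+78\theta^2+120\theta+72\right)}{\theta^2(\theta^2+2\theta+6)^2}$$
and the third and fourth moments about mean follow from $\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3$ and $\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4$.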
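The coefficient of variation (C.V.), coefficient of skewness $\left(\sqrt{\beta_1}\right)$, coefficient of kurtosis $(\beta_2)$ and index of dispersion $(\gamma)$ of the SBPSD (1.3) are thus obtained as
$$C.V. = \frac{\sigma}{\mu'_1} = \frac{\sqrt{2\left(\theta^5+6\theta^4+30\theta^3+78\theta^2+120\theta+72\right)}}{\theta^3+4\theta^2+12\theta+24}$$
$$\sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}}, \qquad \beta_2 = \frac{\mu_4}{\mu_2^{2}}$$
$$\gamma = \frac{\sigma^2}{\mu'_1} = \frac{2\left(\theta^5+6\theta^4+30\theta^3+78\theta^2+120\theta+72\right)}{\theta\,(\theta^2+2\theta+6)(\theta^3+4\theta^2+12\theta+24)}$$
where $\mu'_1$ is the mean, $\sigma^2 = \mu_2$ the variance, and $\mu_3$, $\mu_4$ the third and fourth moments about mean of the SBPSD.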
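These characteristics can also be computed directly from the p.m.f. by summation, which gives an independent check on any closed-form expression. The sketch below assumes the closed-form SBPSD p.m.f. and the usual definitions C.V. $= \sigma/\mu'_1$, $\sqrt{\beta_1} = \mu_3/\mu_2^{3/2}$, $\beta_2 = \mu_4/\mu_2^2$ and $\gamma = \sigma^2/\mu'_1$; the function names and the truncation point `x_max` are illustrative.

```python
import math

def sbpsd_pmf(x, theta):
    """Assumed closed-form SBPSD p.m.f. for x = 1, 2, 3, ..."""
    c = theta ** 4 / (theta ** 2 + 2 * theta + 6)
    q = x ** 2 + (theta + 4) * x + (theta ** 2 + 3 * theta + 4)
    return c * x * q * math.exp(-(x + 3) * math.log1p(theta))

def sbpsd_characteristics(theta, x_max=2000):
    """Mean, variance, C.V., skewness, kurtosis and index of dispersion by summation."""
    xs = range(1, x_max + 1)
    probs = [sbpsd_pmf(x, theta) for x in xs]
    mean = sum(x * p for x, p in zip(xs, probs))
    # Central moments of order 2, 3 and 4.
    mu2, mu3, mu4 = (sum((x - mean) ** k * p for x, p in zip(xs, probs))
                     for k in (2, 3, 4))
    return {
        "mean": mean,
        "variance": mu2,
        "cv": math.sqrt(mu2) / mean,
        "skewness": mu3 / mu2 ** 1.5,
        "kurtosis": mu4 / mu2 ** 2,
        "dispersion": mu2 / mean,
    }

stats = sbpsd_characteristics(1.5)
for name, value in stats.items():
    print(name, round(value, 6))
```

For small values of θ the tail is heavier, so `x_max` may need to be increased before the sums stabilize.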
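It can be easily verified that SBPSD is over-dispersed $(\mu'_1 < \sigma^2)$, equi-dispersed $(\mu'_1 = \sigma^2)$ and under-dispersed $(\mu'_1 > \sigma^2)$ for $\theta < (=) (>)\ \theta^{*}$, where $\theta^{*}$, approximately 1.96, is the value of $\theta$ at which the index of dispersion equals one. It should be noted that SBPLD is over-dispersed, equi-dispersed and under-dispersed for $\theta < (=) (>)\ \theta^{*}$ with $\theta^{*}$ approximately 1.67.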
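The dispersion thresholds can be located numerically as the root of $\gamma(\theta) = 1$. The sketch below is a minimal illustration that assumes the mean and variance expressions implied by the mixture representations of SBPSD and SBPLD; the function names, the bracketing interval and the use of plain bisection are illustrative choices.

```python
def sbpsd_mean(theta):
    return (theta ** 3 + 4 * theta ** 2 + 12 * theta + 24) / (theta * (theta ** 2 + 2 * theta + 6))

def sbpsd_var(theta):
    num = 2 * (theta ** 5 + 6 * theta ** 4 + 30 * theta ** 3
               + 78 * theta ** 2 + 120 * theta + 72)
    return num / (theta ** 2 * (theta ** 2 + 2 * theta + 6) ** 2)

def sbpld_mean(theta):
    return (theta ** 2 + 4 * theta + 6) / (theta * (theta + 2))

def sbpld_var(theta):
    return 2 * (theta ** 3 + 6 * theta ** 2 + 12 * theta + 6) / (theta ** 2 * (theta + 2) ** 2)

def bisect(f, lo, hi, tol=1e-12):
    """Bisection for a root of f in [lo, hi]; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

def equi_dispersion_point(mean_fn, var_fn, lo=0.5, hi=5.0):
    """Value of theta at which the index of dispersion var/mean equals one."""
    return bisect(lambda t: var_fn(t) / mean_fn(t) - 1.0, lo, hi)

sbpsd_star = equi_dispersion_point(sbpsd_mean, sbpsd_var)
sbpld_star = equi_dispersion_point(sbpld_mean, sbpld_var)
print(sbpsd_star, sbpld_star)
```

Below the returned value the distribution is over-dispersed and above it under-dispersed, since the index of dispersion exceeds one at the lower end of the bracket and falls below one at the upper end.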
To study the behaviour of the C.V., $\sqrt{\beta_1}$, $\beta_2$ and $\gamma$ of SBPSD and SBPLD for various values of the parameter θ, the numerical values of these characteristics have been presented in Table 1.
Table 1. Characteristics of SBPSD and SBPLD for selected values of the parameter θ
The graphs of the coefficient of variation (C.V.), coefficient of skewness $\left(\sqrt{\beta_1}\right)$, coefficient of kurtosis $(\beta_2)$ and index of dispersion $(\gamma)$ of SBPSD and SBPLD are shown in Figure 2.
Figure 2. Graphs of the coefficient of variation (C.V.), coefficient of skewness, coefficient of kurtosis and index of dispersion for SBPSD and SBPLD for selected values of their parameter θ

3. Properties of SBPSD

3.1. Unimodality and Increasing Failure Rate

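Since the ratio of successive probabilities of the SBPSD p.m.f. $P_1(x;\theta)$ in (1.3),
$$\frac{P_1(x+1;\theta)}{P_1(x;\theta)} = \frac{1}{\theta+1}\left(1+\frac{1}{x}\right)\left[1+\frac{2x+\theta+5}{x^2+(\theta+4)x+(\theta^2+3\theta+4)}\right]$$
is a decreasing function of $x$, $P_1(x;\theta)$ is log-concave. Therefore, SBPSD is unimodal, has an increasing failure rate (IFR), and hence increasing failure rate average (IFRA). It is new better than used in expectation (NBUE) and has decreasing mean residual life (DMRL). The definitions, concepts and interrelationships between these aging concepts have been discussed in Barlow and Proschan (1981).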
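Log-concavity can be spot-checked numerically through the equivalent condition $P(x)^2 \ge P(x-1)\,P(x+1)$. The sketch below assumes the closed-form SBPSD p.m.f. and works on the log scale to avoid underflow; the function names, the grid of θ values and the range of $x$ are illustrative, and a numerical check over a finite range is of course not a proof.

```python
import math

def sbpsd_log_pmf(x, theta):
    """Logarithm of the assumed SBPSD p.m.f. for x = 1, 2, 3, ..."""
    c = theta ** 4 / (theta ** 2 + 2 * theta + 6)
    q = x ** 2 + (theta + 4) * x + (theta ** 2 + 3 * theta + 4)
    return math.log(c) + math.log(x) + math.log(q) - (x + 3) * math.log1p(theta)

def is_log_concave(theta, x_max=200):
    """Check 2 log P(x) >= log P(x-1) + log P(x+1) for x = 2, ..., x_max - 1."""
    lp = [sbpsd_log_pmf(x, theta) for x in range(1, x_max + 1)]
    return all(2 * lp[i] + 1e-12 >= lp[i - 1] + lp[i + 1]
               for i in range(1, len(lp) - 1))

for theta in (0.1, 0.5, 1.0, 2.0, 5.0, 20.0):
    print(theta, is_log_concave(theta))
```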

3.2. Generating Function

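Probability Generating Function: The probability generating function of the SBPSD (1.3) can be obtained as
$$P_X(t) = E\left(t^X\right) = \frac{\theta^4}{\theta^2+2\theta+6}\sum_{x=1}^{\infty}\frac{x\left[x^2+(\theta+4)x+(\theta^2+3\theta+4)\right]t^x}{(\theta+1)^{x+3}} = \frac{\theta^4\,t}{\theta^2+2\theta+6}\left[\frac{1}{(\theta+1-t)^2}+\frac{2}{(\theta+1-t)^3}+\frac{6}{(\theta+1-t)^4}\right], \quad |t| < \theta+1$$
Moment Generating Function: The moment generating function of the SBPSD (1.3) is thus given by
$$M_X(t) = E\left(e^{tX}\right) = \frac{\theta^4\,e^{t}}{\theta^2+2\theta+6}\left[\frac{1}{(\theta+1-e^{t})^2}+\frac{2}{(\theta+1-e^{t})^3}+\frac{6}{(\theta+1-e^{t})^4}\right], \quad t < \log(\theta+1)$$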
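A closed form for the probability generating function can be compared against direct summation of $t^x P(x;\theta)$. The sketch below assumes the form that follows from the mixture representation, namely $E(t^X \mid \lambda) = t\,e^{-\lambda(1-t)}$ integrated against the size-biased Sujatha density; the function names and truncation point are illustrative.

```python
import math

def pgf_closed(t, theta):
    """Assumed closed-form p.g.f. of SBPSD, valid for |t| < theta + 1."""
    s = theta + 1 - t
    c = theta ** 4 / (theta ** 2 + 2 * theta + 6)
    return c * t * (1 / s ** 2 + 2 / s ** 3 + 6 / s ** 4)

def pgf_series(t, theta, x_max=2000):
    """Direct summation of t**x * P(x; theta) for 0 < t < theta + 1."""
    c = theta ** 4 / (theta ** 2 + 2 * theta + 6)
    total = 0.0
    for x in range(1, x_max + 1):
        q = x ** 2 + (theta + 4) * x + (theta ** 2 + 3 * theta + 4)
        # Combine t**x and (theta + 1)**-(x + 3) on the log scale to avoid overflow.
        total += c * x * q * math.exp(x * math.log(t) - (x + 3) * math.log1p(theta))
    return total

theta = 2.0
for t in (0.3, 1.0, 1.5):
    print(t, pgf_closed(t, theta), pgf_series(t, theta))
```

The moment generating function is obtained by evaluating the same expression at $e^{t}$ in place of $t$.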

4. Estimation of the Parameter

4.1. Maximum Likelihood Estimate (MLE)

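Let $x_1, x_2, \ldots, x_n$ be a random sample of size $n$ from the SBPSD (1.3) and let $f_x$ be the observed frequency in the sample corresponding to $X = x$ $(x = 1, 2, 3, \ldots, k)$ such that $\sum_{x=1}^{k} f_x = n$, where $k$ is the largest observed value having non-zero frequency. The likelihood function $L$ of the SBPSD (1.3) is given by
$$L = \left(\frac{\theta^4}{\theta^2+2\theta+6}\right)^{n}\frac{1}{(\theta+1)^{\sum_{x=1}^{k} f_x (x+3)}}\prod_{x=1}^{k}\left\{x\left[x^2+(\theta+4)x+(\theta^2+3\theta+4)\right]\right\}^{f_x}$$
The log likelihood function can be obtained as
$$\log L = n\log\left(\frac{\theta^4}{\theta^2+2\theta+6}\right) - \sum_{x=1}^{k} f_x (x+3)\log(\theta+1) + \sum_{x=1}^{k} f_x \log\left\{x\left[x^2+(\theta+4)x+(\theta^2+3\theta+4)\right]\right\}$$
The first derivative of the log likelihood function is thus given by
$$\frac{d\log L}{d\theta} = \frac{4n}{\theta} - \frac{2n(\theta+1)}{\theta^2+2\theta+6} - \frac{n(\bar{x}+3)}{\theta+1} + \sum_{x=1}^{k}\frac{(x+2\theta+3)\,f_x}{x^2+(\theta+4)x+(\theta^2+3\theta+4)}$$
where $\bar{x}$ is the sample mean.
The maximum likelihood estimate (MLE) $\hat{\theta}$ of $\theta$ of SBPSD (1.3) is the solution of the equation $\dfrac{d\log L}{d\theta} = 0$ and is given by the solution of the following non-linear equation
$$\frac{4n}{\theta} - \frac{2n(\theta+1)}{\theta^2+2\theta+6} - \frac{n(\bar{x}+3)}{\theta+1} + \sum_{x=1}^{k}\frac{(x+2\theta+3)\,f_x}{x^2+(\theta+4)x+(\theta^2+3\theta+4)} = 0 \qquad (4.1.1)$$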
Table 2. Distribution of number of counts of sites with particles from Immunogold data
     
Table 3. Distribution of snowshoe hares captured over 7 days
     
Table 4. Number of counts of pairs of running shoes owned by 60 members of an athletics club, reported by Simonoff (2003, p. 100)
     
The non-linear equation (4.1.1) can be solved by any numerical iteration method such as the Newton-Raphson method, the bisection method or the Regula-Falsi method. In the present paper, the Newton-Raphson method has been used to solve the above non-linear equation to find the MLE of the parameter.
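A minimal Newton-Raphson sketch for this equation is given below. It assumes that the score function is the derivative of the SBPSD log likelihood for grouped data $\{x: f_x\}$, with its derivative worked out analytically; the frequency table is hypothetical and is not one of the paper's data-sets, and the function names, starting value and step-halving safeguard are illustrative choices.

```python
def score(theta, freqs):
    """Assumed d log L / d theta for grouped data freqs = {x: f_x}."""
    n = sum(freqs.values())
    xbar = sum(x * f for x, f in freqs.items()) / n
    d = theta ** 2 + 2 * theta + 6
    tail = sum(f * (x + 2 * theta + 3)
               / (x ** 2 + (theta + 4) * x + theta ** 2 + 3 * theta + 4)
               for x, f in freqs.items())
    return 4 * n / theta - 2 * n * (theta + 1) / d - n * (xbar + 3) / (theta + 1) + tail

def score_derivative(theta, freqs):
    """Analytic derivative of the score with respect to theta."""
    n = sum(freqs.values())
    xbar = sum(x * f for x, f in freqs.items()) / n
    d = theta ** 2 + 2 * theta + 6
    tail = 0.0
    for x, f in freqs.items():
        q = x ** 2 + (theta + 4) * x + theta ** 2 + 3 * theta + 4
        dq = x + 2 * theta + 3          # derivative of q with respect to theta
        tail += f * (2 * q - dq ** 2) / q ** 2
    return (-4 * n / theta ** 2
            + 2 * n * (theta ** 2 + 2 * theta - 4) / d ** 2
            + n * (xbar + 3) / (theta + 1) ** 2
            + tail)

def newton_raphson_mle(freqs, theta0=1.0, tol=1e-10, max_iter=100):
    """Solve score(theta) = 0 by Newton-Raphson, keeping theta positive."""
    theta = theta0
    for _ in range(max_iter):
        step = score(theta, freqs) / score_derivative(theta, freqs)
        new_theta = theta - step
        while new_theta <= 0:           # halve the step if it leaves the parameter space
            step /= 2
            new_theta = theta - step
        if abs(new_theta - theta) < tol:
            return new_theta
        theta = new_theta
    raise RuntimeError("Newton-Raphson did not converge")

# Hypothetical frequency table (illustration only, not data from the paper).
hypothetical_freqs = {1: 30, 2: 25, 3: 18, 4: 12, 5: 8, 6: 4, 7: 3}
theta_hat = newton_raphson_mle(hypothetical_freqs)
print(theta_hat, score(theta_hat, hypothetical_freqs))
```

In practice the method of moments estimate is a natural starting value, and convergence should be confirmed by checking that the score is close to zero at the returned value.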

4.2. Method of Moment Estimate (MOME)

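Equating the population mean to the corresponding sample mean, the method of moment estimate (MOME) $\tilde{\theta}$ of $\theta$ of SBPSD (1.3) is the solution of the following cubic equation in $\theta$
$$(\bar{x}-1)\,\theta^3 + 2(\bar{x}-2)\,\theta^2 + 6(\bar{x}-2)\,\theta - 24 = 0$$
where $\bar{x}$ is the sample mean.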
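The moment equation can be solved numerically. The sketch below assumes that the SBPSD mean is $(\theta^3+4\theta^2+12\theta+24)/[\theta(\theta^2+2\theta+6)]$, so that equating it to $\bar{x}$ gives the cubic $(\bar{x}-1)\theta^3 + 2(\bar{x}-2)\theta^2 + 6(\bar{x}-2)\theta - 24 = 0$; the function names and the use of bisection are illustrative, and the sample mean used is hypothetical.

```python
def sbpsd_mean(theta):
    """Assumed mean of the SBPSD."""
    return (theta ** 3 + 4 * theta ** 2 + 12 * theta + 24) / (theta * (theta ** 2 + 2 * theta + 6))

def mome_cubic(theta, xbar):
    """Left-hand side of the assumed cubic moment equation."""
    return (xbar - 1) * theta ** 3 + 2 * (xbar - 2) * theta ** 2 + 6 * (xbar - 2) * theta - 24

def mome(xbar, tol=1e-12):
    """Positive root of the cubic by bisection; requires xbar > 1."""
    if xbar <= 1:
        raise ValueError("the sample mean must exceed 1")
    lo, hi = 0.0, 1.0
    while mome_cubic(hi, xbar) < 0:     # expand until the root is bracketed
        hi *= 2
    for _ in range(200):                # fixed cap guards against stalling at float precision
        if hi - lo <= tol:
            break
        mid = (lo + hi) / 2
        if mome_cubic(mid, xbar) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

theta_tilde = mome(2.67)                # 2.67 is a hypothetical sample mean
print(theta_tilde, sbpsd_mean(theta_tilde))
```

The cubic equals -24 at $\theta = 0$ and has a positive leading coefficient when $\bar{x} > 1$, and its coefficients change sign only once, so it has exactly one positive root and bisection from zero is safe.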

5. Goodness of Fit

In this section the goodness of fit of SBPSD, SBPLD and SBPD is presented for three count data-sets. The fitting of these distributions is based on maximum likelihood estimates of the parameter. The first data-set is the immunogold assay data of Cullen et al. (1990) on the distribution of the number of counts of sites with particles, the second data-set is the animal abundance data of Keith and Meslow (1968) on the distribution of snowshoe hares captured over 7 days, and the third data-set is the number of pairs of running shoes owned by 60 members of an athletics club, reported by Simonoff (2003).
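A fit of this kind is typically summarized by expected frequencies and a Pearson chi-square statistic. The sketch below shows one way to compute both for the SBPSD, assuming the closed-form p.m.f.; the frequency table and the value of θ are hypothetical and are not taken from the paper's data-sets, and the function names and the pooling of the last class are illustrative choices.

```python
import math

def sbpsd_pmf(x, theta):
    """Assumed closed-form SBPSD p.m.f. for x = 1, 2, 3, ..."""
    c = theta ** 4 / (theta ** 2 + 2 * theta + 6)
    q = x ** 2 + (theta + 4) * x + (theta ** 2 + 3 * theta + 4)
    return c * x * q * math.exp(-(x + 3) * math.log1p(theta))

def expected_frequencies(freqs, theta):
    """Expected counts for x = 1, ..., k-1 plus a pooled 'k or more' class."""
    n = sum(freqs.values())
    k = max(freqs)
    probs = {x: sbpsd_pmf(x, theta) for x in range(1, k)}
    probs[k] = 1.0 - sum(probs.values())    # P(X >= k)
    return {x: n * p for x, p in probs.items()}

def pearson_chi_square(freqs, expected):
    """Sum of (observed - expected)**2 / expected over the classes."""
    return sum((freqs.get(x, 0) - e) ** 2 / e for x, e in expected.items())

# Hypothetical data and parameter value (illustration only).
hypothetical_freqs = {1: 30, 2: 25, 3: 18, 4: 12, 5: 8, 6: 4, 7: 3}
hypothetical_theta = 1.97
expected = expected_frequencies(hypothetical_freqs, hypothetical_theta)
chi_square = pearson_chi_square(hypothetical_freqs, expected)
print({x: round(e, 2) for x, e in expected.items()}, round(chi_square, 3))
```

In a full analysis, classes with small expected frequencies are usually pooled before computing the statistic, and the degrees of freedom are reduced by one for the estimated parameter.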

6. Concluding Remarks

A size-biased Poisson mixture of the size-biased Sujatha distribution, named the “size-biased Poisson-Sujatha distribution (SBPSD)”, has been proposed by size-biasing the Poisson-Sujatha distribution (PSD) of Shanker (2016 b), a Poisson mixture of the Sujatha distribution introduced by Shanker (2016 a). Its moments and other distributional properties, including the coefficient of variation, skewness, kurtosis and index of dispersion, have been studied. The estimation of its parameter has been discussed using maximum likelihood estimation and the method of moments. Three examples of real data-sets have been presented to test the goodness of fit of SBPSD over the size-biased Poisson distribution (SBPD) and the size-biased Poisson-Lindley distribution (SBPLD).

ACKNOWLEDGEMENTS

The authors would like to thank the Editor-In-Chief and the referee for careful reading and for their comments which improved the quality of the paper.

References

[1]  Alavi, S.M.R. and Chinipardaz, R. (2009): Form-invariance under weighted sampling, Statistics, 43, 81 – 90.
[2]  Barlow, R.E. and Proschan, F. (1981): Statistical Theory of Reliability and Life Testing, Silver Spring, MD.
[3]  Correa, J.A. and Wolfson, D.B. (2007): Length-bias: some characterizations and applications, Journal of Statistical Computation and Simulation, 64, 209 – 219.
[4]  Drummer, T.D. and McDonald, L.L. (1987): Size bias in line transect sampling, Biometrics, 43, 13 – 21.
[5]  Cullen, M.J., Walsh, J., Nicholson, L.V., and Harris, J.B. (1990): Ultrastructural localization of dystrophin in human muscle by using gold immunolabelling, Proceedings of the Royal Society of London, 20, 197 – 210.
[6]  Ducey, M.J. (2009): Sampling trees with probability nearly proportional to biomass, For. Ecol. Manage., 258, 2110 – 2116.
[7]  Ducey, M.J. and Gove, J.H. (2015): Size-biased distributions in the generalized beta distribution family, with applications to forestry, Forestry: An International Journal of Forest Research, 88, 143 – 151.
[8]  Fisher, R.A. (1934): The effects of methods of ascertainment upon the estimation of frequencies, Ann. Eugenics, 6, 13 – 25.
[9]  Ghitany, M.E. and Al-Mutairi, D.K. (2008): Size-biased Poisson-Lindley distribution and Its Applications, Metron - International Journal of Statistics, LXVI (3), 299 – 311.
[10]  Gove, J.H. (2000): Some observations on fitting assumed diameter distributions to horizontal point sampling data, Can. J. For. Res., 30, 521 – 533.
[11]  Gove, J.H. (2003): Estimation and applications of size-biased distributions in forestry. In Modeling Forest Systems. A Amaro, D. Reed and P. Soares (Eds), CABI Publishing, pp. 201 – 212.
[12]  Keith, L.B. and Meslow, E.C. (1968): Trap response by snowshoe hares, Journal of Wildlife Management, 20, 795 – 801.
[13]  Lappi, J. and Bailey, R.L. (1987): Estimation of diameter increment function or other tree relations using angle-count samples, Forest Science, 33, 725 – 739.
[14]  Lindley, D.V. (1958): Fiducial distributions and Bayes theorem, Journal of the Royal Statistical Society, 20 (1), 102 – 107.
[15]  McDonald, J.B. (1984): Some generalized functions for the size distribution of income, Econometrica, 52, 647 – 664.
[16]  Mir, K.H. and Ahmad, M. (2009): Size-biased distributions and their applications, Pakistan Journal of Statistics, 25 (3), 283 – 294.
[17]  Patil, G.P. (1981): Studies in statistical ecology involving weighted distributions. In Applications and New Directions, J.K. Ghosh and J. Roy (eds), Proceedings of the Indian Statistical Institute Golden Jubilee, Statistical Publishing Society, pp. 478 – 503.
[18]  Patil, G.P. and Ord, J.K. (1976): On size-biased sampling and related form-invariant weighted distributions, Sankhya Ser. B, 38, 48 – 61.
[19]  Patil, G.P. and Rao, C.R. (1977): The weighted distributions: A survey and their applications. In Applications of Statistics (Ed. P.R. Krishnaiah), 383 – 405, North Holland Publications Co., Amsterdam.
[20]  Patil, G.P. and Rao, C.R. (1978): Weighted distributions and size-biased sampling with applications to wild-life populations and human families, Biometrics, 34, 179 – 189.
[21]  Rao, C.R. (1965): On discrete distributions arising out of methods of ascertainment. In: Patil, G.P. (ed.) Classical and Contagious Discrete Distributions, Statistical Publishing Society, Calcutta, 320 – 332.
[22]  Sankaran, M. (1970): The discrete Poisson-Lindley distribution, Biometrics, 26, 145 – 149.
[23]  Scheaffer, R.L. (1972): Size-biased sampling, Technometrics, 14, 635 – 644.
[24]  Shanker, R. (2016 a): Sujatha distribution and Its Applications, Accepted for publication in “Statistics in Transition-new Series”, 17 (3).
[25]  Shanker, R. (2016 b): The discrete Poisson-Sujatha distribution, International Journal of Probability and Statistics, 5(1), 1 – 9.
[26]  Shanker, R., Hagos, F. and Abrehe, Y. (2015): On Size-Biased Poisson-Lindley Distribution and Its Applications to Model Thunderstorms, American Journal of Mathematics and Statistics, 5 (6), 354 – 360.
[27]  Shanker, R. and Hagos, F. (2016): On Poisson-Sujatha distribution and its Applications to Model count data from Biological Science, Biometrics and Biostatistics International Journal, 3(4), 1 – 7.
[28]  Singh, S.K. and Maddala, G.S. (1976): A function for the size distribution of incomes, Econometrica, 44, 963 – 970.
[29]  Simonoff, J.S. (2003): Analyzing Categorical Data, Springer, New York.
[30]  Van Deusen, P.C. (1986): Fitting assumed distributions to horizontal point sample diameters, For. Sci., 32, 146 – 148.