Bayesian Estimation and Inference for the Generalized Partial Linear Model

Haitham M. Yousof; Ahmed M. Gad

International Journal of Probability and Statistics

p-ISSN: 2168-4871 e-ISSN: 2168-4863

2015; 4(2): 51-64

doi:10.5923/j.ijps.20150402.03

Haitham M. Yousof¹, Ahmed M. Gad²

¹Department of Statistics, Mathematics and Insurance, Benha University, Egypt

²Department of Statistics, Cairo University, Egypt

Correspondence to: Ahmed M. Gad, Department of Statistics, Cairo University, Egypt.

Abstract

In this article we propose a Bayesian regression model, called the Bayesian generalized partial linear model, which extends the generalized partial linear model (GPLM). We consider Bayesian estimation and inference for the GPLM parameters using multivariate conjugate prior distributions under the squared error loss function. We propose an algorithm for estimating the GPLM parameters based on Bayes' theorem. Finally, the Bayesian and classical GPLM estimators are compared via a simulation study.

Keywords: Generalized Partial Linear Model, Profile Likelihood Method, Generalized Speckman Method, Back-Fitting Method, Bayesian Estimation

Cite this paper: Haitham M. Yousof, Ahmed M. Gad, Bayesian Estimation and Inference for the Generalized Partial Linear Model, International Journal of Probability and Statistics, Vol. 4 No. 2, 2015, pp. 51-64. doi: 10.5923/j.ijps.20150402.03.

Article Outline

1. Introduction

2. Generalized Partial Linear Model (GPLM)

2.1. Profile Likelihood Method

2.2. Generalized Speckman Method

2.3. Back-fitting Method

2.4. Some Statistical Properties of the GPLM Estimators.

3. Bayesian Estimation and Inference for the GPLM

3.1. A proposed Algorithm for Estimating the GPLM Parameters

3.2. Bayesian approach for Estimating the GPLM Using Bayesian Estimator

3.2.1.

3.2.2.

3.2.3.

1. Introduction

Semi-parametric regression models are an intermediate step between fully parametric and nonparametric models. Many definitions of semi-parametric models are available in the literature. The definition adopted in this article is that a model is semi-parametric if it contains a nonparametric component in addition to a parametric component, both of which need to be estimated. Semi-parametric models are thus characterized by a finite-dimensional component and an infinite-dimensional component.

Semi-parametric models try to combine the flexibility of a nonparametric model with the advantages of a parametric model. A fully nonparametric model is more robust than semi-parametric and parametric models since it does not suffer from the risk of misspecification. On the other hand, nonparametric estimators suffer from low convergence rates, which deteriorate further when considering higher order derivatives and multidimensional random variables. In contrast, the parametric model carries a risk of misspecification, but if it is correctly specified it will normally enjoy a fast parametric rate of convergence, with no deterioration caused by derivatives and multivariate data. The basic idea of a semi-parametric model is to take the best of both models. The semi-parametric generalized linear model, known as the generalized partial linear model (GPLM), is one of the semi-parametric regression models; see Powell (1994); Ruppert et al. (2003); and Sperlich et al. (2006).

Many authors have tried to introduce new algorithms for estimating the semi-parametric regression models. Meyer et al. (2011) have introduced Bayesian estimation and inference for generalized partial linear models using shape-restricted splines. Zhang et al. (2014) have studied estimation and variable selection in partial linear single index models with error-prone linear covariates. Guo et al. (2015) have studied the empirical likelihood for single index model with missing covariates at random. Bouaziz et al. (2015) have studied semi-parametric inference for the recurrent events process by means of a single-index model.

The curse of dimensionality (COD) associated with nonparametric estimation of density and conditional mean functions makes nonparametric methods impractical in applications with many regressors and modest sample sizes. This problem limits the ability to examine data in a very flexible way in higher dimensional problems. As a result, the need for other methods became important. Semi-parametric regression models have been shown to be of substantial value in the solution of such complex problems.

Bayesian methods provide a joint posterior distribution for the parameters and hence allow for inference through various sampling methods. A number of methods for Bayesian monotone regression have been developed. Ramgopal et al. (1993) introduced a Bayesian monotone regression approach using Dirichlet process priors. Perron and Mengersen (2001) proposed a mixture of triangular distributions where the dimension is estimated as part of the Bayesian analysis. Both Holmes and Heard (2003) and Wang (2008) model functions where the knot locations are free parameters, with the former using a piecewise constant model and the latter imposing the monotone shape restriction using cubic splines and second-order cone programming with a truncated normal prior. Johnson (2007) estimates item response functions with free-knot regression splines restricted to be monotone by requiring the spline coefficients to be monotonically increasing. Neelon and Dunson (2004) proposed a piecewise linear model where the monotonicity is enforced via prior distributions; their model allows for flat spots in the regression function by using a prior that is a mixture of a continuous distribution and a point mass at the origin. Bornkamp and Ickstadt (2009) applied their Bayesian monotonic regression model to dose-response curves. Lang and Brezger (2004) introduced Bayesian penalized splines for the additive regression model, and Brezger and Steiner (2008) applied the Bayesian penalized splines model to monotone regression by imposing linear inequality constraints via truncated normal priors on the basis function coefficients to ensure monotonicity. Shively et al. (2009) proposed two Bayesian approaches to monotone function estimation, one involving piecewise linear approximation and a Wiener process prior, and the other involving regression spline estimation and a prior that is a mixture of constrained normal distributions for the regression coefficients.

In this article, we propose a new method for estimating the GPLM based on Bayes' theorem, using a new algorithm for estimation. The rest of the paper is organized as follows. In Section 2, we define the generalized partial linear model (GPLM). In Section 3, we present the Bayesian estimation and inference for the GPLM. In Section 4, we provide a simulation study. Finally, some concluding remarks and discussion are presented in Section 5.

2. Generalized Partial Linear Model (GPLM)

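The GPLM has the form

$$E[Y \mid X, W] = G\{X^{T}\beta + m(W)\}, \qquad (1)$$

where $G(\cdot)$ is a known link function, $\beta$ is a finite-dimensional parameter vector and $m(\cdot)$ is a smooth function. It is a semi-parametric model since it contains both parametric and nonparametric components. When $G$ is the identity function, this model reduces to

$$E[Y \mid X, W] = X^{T}\beta + m(W), \qquad (2)$$

which is called the partial linear model (PLM).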

The GPLM is used by Severini and Staniswalis (1994); Chen (1995); Burda et al. (1998); Muller (2001); Peng (2003); and Peng and Wang (2004).

The estimation methods for the GPLM in Eq. (1) are based on the idea that an estimator $\hat{\beta}$ can be found for known $m(\cdot)$, and an estimator $\hat{m}(\cdot)$ can be found for known $\beta$. The estimation methods considered here use kernel smoothing to estimate the nonparametric component of the model; they are presented in the sequel.

2.1. Profile Likelihood Method

The profile likelihood method was introduced by Severini and Wong (1992). It is based on assuming a parametric model for the conditional distribution of Y given X and W. The idea of this method is as follows:

First: Fix the parametric component of the model, i.e., the parameter vector $\beta$.

Second: Estimate the nonparametric component of the model, which depends on this fixed $\beta$, i.e. $m_{\beta}(\cdot)$, by some type of smoothing method to obtain the estimator $\hat{m}_{\beta}(\cdot)$.

Third: Use the estimator $\hat{m}_{\beta}(\cdot)$ to construct the profile likelihood for the parametric component, using either a true likelihood or a quasi-likelihood function.

Fourth: Use the profile likelihood function to obtain an estimator of the parametric component of the model by the maximum likelihood method.

Thus the profile likelihood method aims to separate the estimation problem into two parts, the parametric part which is estimated by a parametric method and the nonparametric part which is estimated by a nonparametric method.

Murphy and Van der Vaart (2000) showed that the full likelihood method fails in semi-parametric models. In semi-parametric models the observed information, if it exists, would be an infinite-dimensional operator. They used the profile likelihood rather than the full likelihood to overcome this problem. The algorithm for the profile likelihood method is derived as follows:

Derivation of different likelihood functions

For the parametric component of the model, the objective function is the parametric profile likelihood function, which is maximized to obtain an estimator for $\beta$. It is given by

(3)

where

denotes the log-likelihood or quasi-likelihood function,

and

For the nonparametric component of the model, the objective function is a smoothed or a local likelihood function which is given by

(4)

where

and the local weight

is the kernel weight with

denoting a multidimensional kernel function and H is a bandwidth matrix. The function in Eq. (4) is maximized to obtain an estimator for the smooth function

at a point w.
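As an illustration of the local likelihood step, the following is a minimal sketch, not the authors' implementation. It assumes a Bernoulli response with logit link, a scalar covariate w, a Gaussian kernel with scalar bandwidth h (instead of a general bandwidth matrix H), and a known linear index passed in as `offset`. Under these assumptions the kernel-weighted log-likelihood is maximized in the single value m(w0) by Newton steps. All function names are hypothetical.

```python
import numpy as np

def gaussian_kernel_weights(w_obs, w0, h):
    """Kernel weights K_h(w0 - w_i) for a scalar covariate and bandwidth h."""
    u = (w_obs - w0) / h
    return np.exp(-0.5 * u**2) / (h * np.sqrt(2.0 * np.pi))

def local_logit_fit(y, offset, w_obs, w0, h, n_iter=50, tol=1e-10):
    """Maximize the kernel-weighted Bernoulli log-likelihood in the scalar m(w0).

    `offset` holds the fixed parametric index x_i' beta (assumed known here).
    """
    k = gaussian_kernel_weights(w_obs, w0, h)
    m = 0.0
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(offset + m)))
        score = np.sum(k * (y - mu))          # weighted first derivative
        info = np.sum(k * mu * (1.0 - mu))    # minus the weighted second derivative
        step = score / info
        m += step                             # Newton update
        if abs(step) < tol:
            break
    return m

# With a very large bandwidth all weights are (almost) equal, so the local
# fit reduces to the logit of the sample mean of y.
y = np.array([1.0, 1.0, 1.0, 0.0])
w_obs = np.array([0.1, 0.2, 0.3, 0.4])
m_hat = local_logit_fit(y, offset=np.zeros(4), w_obs=w_obs, w0=0.25, h=1e6)
print(m_hat)
```

Smaller bandwidths concentrate the weights near w0, so the same routine then returns a genuinely local estimate.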

Maximization of the likelihood functions

The maximization of the local likelihood in Eq. (4) requires solving

(5)

with respect to

Note that

denotes the first derivative of the likelihood function

The maximization of the profile likelihood in Eq. (3) requires solving

(6)

with respect to the coefficient vector

The vector

denotes the vector of all partial derivatives of

with respect to

A further differentiation of Eq. (5) with respect to $\beta$ leads to an explicit expression for

as follows:

(7)

where

denotes the second derivative of

Equations (5) and (6) can only be solved iteratively. Severini and Staniswalis (1994) presented a Newton-Raphson type algorithm for this problem as follows:

Let

Further let

and

will be the first and second derivatives of

and

with respect to their first argument. All values will be calculated at the observations

instead of the free parameter w. Then Equations (5) and (6) are transformed to

(8)

and

(9)

respectively.

The estimator of

based on Eq. (7) is necessary to estimate

(10)

Equations (8) and (10) imply the following iterative Newton-Raphson type algorithm:

First: Initial values

Different strategies are found for obtaining start values of the estimators:

• Using

and

obtained from fitting a parametric generalized linear model (GLM).

• Using

and

with the adjustment for Binomial responses as

• Using

and

with the adjustment for binomial responses as

. (See Severini and Staniswalis, 1994).

Second: The updating step for

where B is a Hessian type matrix defined as

and

The updating step for $\beta$ can be summarized in a closed matrix form as follows:

where

The matrix X is the design matrix with rows

I is an

identity matrix,

(11)

(12)

and

is a smoother matrix with elements

(13)

Third: The updating step for

The function

is updated by

where $k = 0, 1, 2, \ldots$ is the iteration number.

It is noted that the function

can be replaced by its expectation with respect to Y to obtain a Fisher scoring type algorithm (Severini and Staniswalis, 1994).

The previous procedure can be summarized as follows.

Updating step for

Notes on the procedure:

1. The variable

which is defined here, is a vector of adjusted dependent variables.

2. The parameter

is updated by a parametric method with a nonparametrically modified design matrix

3. The function

can be replaced by its expectation, with respect to y, to obtain a Fisher scoring type procedure.

4. The updating step for

is of quite complex structure and can be simplified in some models for identity and exponential link functions G.

2.2. Generalized Speckman Method

The generalized Speckman estimation method goes back to Speckman (1988). In the case of an identity link function G and normally distributed Y, the generalized Speckman and profile likelihood methods coincide, and no updating steps are needed for the estimation of both $\beta$ and m. This method can be summarized in the case of the identity link (PLM) as follows:

(1) Estimate $\beta$ by:

(2) Estimate m by:

where

and a smoother matrix S is defined by its elements as

This matrix is a simpler form of a smoother matrix, and differs from the one used in Eq. (13) where the matrix S yields

and

in the case of normally distributed Y.
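The two-step Speckman procedure for the PLM can be sketched as follows. This is a hedged illustration that assumes the usual textbook form of the estimator: the parametric part is obtained by least squares of (I - S)Y on (I - S)X, and the nonparametric part by smoothing the residuals Y - X beta_hat. The smoother S is taken to be a Nadaraya-Watson matrix with a Gaussian kernel and scalar bandwidth h. The function names, kernel choice and simulated data are assumptions for illustration only.

```python
import numpy as np

def smoother_matrix(w, h):
    """Nadaraya-Watson smoother: S[i, j] = K_h(w_i - w_j) / sum_k K_h(w_i - w_k)."""
    u = (w[:, None] - w[None, :]) / h
    k = np.exp(-0.5 * u**2)
    return k / k.sum(axis=1, keepdims=True)

def speckman_plm(X, y, w, h):
    """Speckman-type estimator for the partial linear model (identity link)."""
    S = smoother_matrix(w, h)
    X_tilde = X - S @ X                  # (I - S) X
    y_tilde = y - S @ y                  # (I - S) y
    beta_hat, *_ = np.linalg.lstsq(X_tilde, y_tilde, rcond=None)
    m_hat = S @ (y - X @ beta_hat)       # smooth the partial residuals
    return beta_hat, m_hat

# Simulated example (all settings are illustrative assumptions).
rng = np.random.default_rng(0)
n = 400
X = rng.normal(size=(n, 2))
w = rng.uniform(0.0, 1.0, size=n)
beta_true = np.array([1.0, -2.0])
y = X @ beta_true + np.sin(2 * np.pi * w) + 0.1 * rng.normal(size=n)
beta_hat, m_hat = speckman_plm(X, y, w, h=0.05)
print(beta_hat)
```

No iteration is needed here, which matches the remark that the identity-link case requires no updating steps.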

For the GPLM, the Speckman estimator is combined with the iteratively weighted least squares (IWLS) method used in the estimation of the GLM. In IWLS, each iteration step of the GLM fit is obtained by a weighted least squares (WLS) regression on the adjusted dependent variable. The same procedure is used in the GPLM, replacing the WLS step with a weighted partial linear fit on the adjusted dependent variable given by

where

and D are defined as in Equations (11) and (12) respectively. The generalized Speckman algorithm for the GPLM can be summarized as:

First: Initial values:

The initial values used in this method are the same as in the previous profile likelihood algorithm.

Second: Updating step for

Third: Updating step for m

where

The smoother matrix is used with elements:

(14)

There is a difference between the smoother matrix in Equation (14) and that in Equation (13). In Equation (13) we use

instead of

that is used in Equation (14).
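The iteration described above can be sketched for a binary response with logit link. This is an assumption-laden illustration rather than the authors' XploRe code: it uses Fisher-scoring weights mu(1 - mu), a Gaussian kernel with scalar bandwidth h, a weighted Nadaraya-Watson smoother, and simple start values (beta = 0 and a constant m equal to the logit of the sample mean) instead of the start values listed for the profile likelihood algorithm. All names are hypothetical.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def generalized_speckman_logit(X, y, w, h, n_iter=30):
    """Generalized Speckman-type iteration for a logit GPLM (illustrative)."""
    n, p = X.shape
    K = np.exp(-0.5 * ((w[:, None] - w[None, :]) / h) ** 2)  # kernel weights
    beta = np.zeros(p)                                        # assumed start value
    m = np.full(n, np.log(y.mean() / (1.0 - y.mean())))       # assumed start value
    for _ in range(n_iter):
        eta = X @ beta + m
        mu = np.clip(sigmoid(eta), 1e-6, 1.0 - 1e-6)
        wt = mu * (1.0 - mu)                   # Fisher-scoring weights
        z = eta + (y - mu) / wt                # adjusted dependent variable
        S = K * wt[None, :]                    # weighted smoother matrix
        S = S / S.sum(axis=1, keepdims=True)
        X_t = X - S @ X                        # (I - S) X
        z_t = z - S @ z                        # (I - S) z
        A = X_t.T @ (wt[:, None] * X_t)
        b = X_t.T @ (wt * z_t)
        beta = np.linalg.solve(A, b)           # weighted partial linear fit
        m = S @ (z - X @ beta)                 # update of the smooth part
    return beta, m

# Simulated example (settings are illustrative assumptions).
rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 2))
w = rng.uniform(size=n)
beta_true = np.array([1.0, -1.0])
eta_true = X @ beta_true + np.sin(2 * np.pi * w)
y = rng.binomial(1, sigmoid(eta_true)).astype(float)
beta_hat, m_hat = generalized_speckman_logit(X, y, w, h=0.1)
print(beta_hat)
```

A fixed number of iterations is used for brevity; a convergence check on the change in beta or in the deviance would be the natural stopping rule in practice.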

2.3. Back-fitting Method

Hastie and Tibshirani (1990) suggested the back-fitting method as an iterative algorithm to fit an additive model. The idea of this method is to regress the additive components separately on the partial residuals.

The back-fitting method will be presented in the case of identity link G (PLM) and non-monotone G (GPLM) as follows:

Back-fitting algorithm for the GPLM

Back-fitting for the GPLM is an extension of that for the PLM. The iterations in this method coincide with those in the Speckman method. This method differs from the Speckman method only in the parametric part. The back-fitting algorithm for the GPLM can be summarized as follows:

First: Initial values:

This method often uses

Second: Updating step for

Third: Updating step for m

where the matrices D and S, the vector

are defined as in the Speckman method for the GPLM (See Muller, 2001).

In practice, some of the predictor variables are correlated. Therefore, Hastie and Tibshirani (1990) proposed a modified back-fitting method which first searches for a parametric solution and then fits only the remaining parts nonparametrically.

2.4. Some Statistical Properties of the GPLM Estimators.

(1) Statistical properties of the parametric component

Under some regularity conditions the estimator

has the following properties:

1. $\sqrt{n}$-consistent estimator for $\beta$.

2. Asymptotically normal.

3. Its limiting covariance has a consistent estimator.

4. Asymptotically efficient; it has asymptotically minimum variance (Severini and Staniswalis, 1994).

(2) Statistical properties of the non-parametric component m:

The non-parametric function m can be estimated (in the univariate case) with the usual univariate rate of convergence. Severini and Staniswalis (1994) showed that the estimator

is consistent in supremum norm. They showed that the parametric and non-parametric estimators have the following asymptotic properties:

where

are the true parameter values so that

3. Bayesian Estimation and Inference for the GPLM

Bayesian inference derives the posterior distribution as a consequence of two antecedents: a prior probability and a likelihood function derived from a probability model for the data to be observed. In Bayesian inference the posterior distribution is obtained according to Bayes' theorem as follows:

$$\pi(\theta \mid y) = \frac{L(y \mid \theta)\, \pi(\theta)}{\int L(y \mid \theta)\, \pi(\theta)\, d\theta}, \qquad (15)$$

where $\pi(\theta \mid y)$ is the posterior distribution, $\pi(\theta)$ is the prior distribution and $L(y \mid \theta)$ is the likelihood function.

3.1. A proposed Algorithm for Estimating the GPLM Parameters

1. Obtain the probability distribution of the response variable $Y$.

2. Obtain the likelihood function of the probability distribution of the response variable $Y$.

3. Choose a suitable prior distribution for $\beta$.

4. Use Eq. (15) to obtain the posterior distribution.

5. Obtain the Bayesian estimator under the squared error loss function.

6. Replace the initial value of $\beta$ by the Bayesian estimator.

7. Use the profile likelihood method, the generalized Speckman method and the back-fitting method with the new initial value of $\beta$ to estimate the GPLM parameters.

3.2. Bayesian approach for Estimating the GPLM Using Bayesian Estimator

3.2.1. is assumed known

Case 1

Consider the GPLM in Eq. (1) and suppose that

belongs to the multivariate normal distribution with pdf

Then the likelihood function of the variable

can be written as

(16)

where

Let

Then

(17)

Combining (16) and (17) and using (15), we obtain the following posterior distribution (using some algebraic steps)

which is multivariate normal distribution with

Then, the Bayesian estimator under the squared error loss function is

(18)
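For orientation, the conjugate normal update behind a result of this kind can be sketched numerically. The sketch below assumes a linear-Gaussian working model y ~ N(X beta, sigma2 I) with known sigma2 and a prior beta ~ N(mu0, Sigma0). These symbols and the function name are assumptions, since the paper's own expressions are not reproduced in this text. Under squared error loss the Bayes estimator is the posterior mean.

```python
import numpy as np

def normal_posterior(X, y, sigma2, mu0, Sigma0):
    """Posterior of beta for y ~ N(X beta, sigma2 I) with prior beta ~ N(mu0, Sigma0)."""
    prec0 = np.linalg.inv(Sigma0)                 # prior precision
    prec_n = prec0 + X.T @ X / sigma2             # posterior precision
    Sigma_n = np.linalg.inv(prec_n)               # posterior covariance
    mu_n = Sigma_n @ (prec0 @ mu0 + X.T @ y / sigma2)  # posterior mean
    return mu_n, Sigma_n

# Illustrative data: with a very diffuse prior the posterior mean is
# numerically close to the ordinary least squares estimate.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))
y = X @ np.array([0.5, -1.0, 2.0]) + rng.normal(size=100)
mu_n, Sigma_n = normal_posterior(X, y, sigma2=1.0,
                                 mu0=np.zeros(3), Sigma0=1e8 * np.eye(3))
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print(mu_n, beta_ols)
```

In the proposed algorithm, a posterior mean of this type would replace the usual start value of the parametric component before running the classical iterations.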

Case 2

Consider the GPLM in Eq. (1) and suppose that the pdf of

as follows

Then the likelihood function of the probability distribution of

Let

Then, we can rewrite the likelihood function as follows

(19)

Let

multivariate normal distribution

Then

(20)

where

Combining (19) and (20) and using (15), we obtain the following posterior distribution (using some algebraic steps)

where

which is multivariate normal distribution with

and

Then, the Bayesian estimator under the squared error loss function is

(21)

3.2.2. is assumed unknown

Case 1

Consider the GPLM in Eq. (1) and suppose that the marginal posterior distribution for

is proportional to

(22)

Hence

where

Subsequently

where

any positive number, then we can rewrite the last expression as follows

(23)

Let

Then

(24)

Substituting from (24) and (25) into (23) we get

Therefore

Then, the posterior distribution

and the marginal posterior distribution for

(25)

Case 2

Consider the GPLM in Eq. (1) and suppose that

Then the likelihood function of the pdf of response variable

(26)

where

Let

(27)

and

(28)

Then from (27) and (28) we get

(29)

Combining (26) and (29), using (15) and after some algebraic steps, we obtain the following joint posterior distribution

(30)

where

Then

which is normal inverse gamma distribution.

From (30), the marginal posterior distribution for

(31)

From (30), the marginal posterior distribution for

(32)

which is multivariate t-distribution with the following pdf

with expected value

where

Then, the Bayesian estimator under the squared error loss function is

(33)
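The normal inverse gamma update and the resulting multivariate t marginal can be sketched as follows. The parameterization is the standard textbook one and is an assumption here: beta given sigma2 is N(mu0, sigma2 V0), sigma2 is inverse gamma with shape a0 and scale b0, and the working model is linear-Gaussian. The paper's own hyperparameter symbols are not reproduced in this text, so all names are hypothetical.

```python
import numpy as np

def nig_posterior(X, y, mu0, V0, a0, b0):
    """Normal inverse gamma update for y ~ N(X beta, sigma2 I)."""
    n = len(y)
    V0_inv = np.linalg.inv(V0)
    Vn_inv = V0_inv + X.T @ X
    Vn = np.linalg.inv(Vn_inv)
    mu_n = Vn @ (V0_inv @ mu0 + X.T @ y)
    a_n = a0 + 0.5 * n
    b_n = b0 + 0.5 * (y @ y + mu0 @ V0_inv @ mu0 - mu_n @ Vn_inv @ mu_n)
    return mu_n, Vn, a_n, b_n

# Illustrative data with true noise variance 0.25.
rng = np.random.default_rng(3)
n = 200
X = rng.normal(size=(n, 2))
y = X @ np.array([1.0, -0.5]) + 0.5 * rng.normal(size=n)
mu_n, Vn, a_n, b_n = nig_posterior(X, y, mu0=np.zeros(2),
                                   V0=1e6 * np.eye(2), a0=2.0, b0=1.0)

# Marginally, beta follows a multivariate t with 2 * a_n degrees of freedom,
# location mu_n and scale matrix (b_n / a_n) * Vn; its mean is mu_n.
beta_bayes = mu_n
# Marginally, sigma2 is inverse gamma(a_n, b_n) with mean b_n / (a_n - 1).
sigma2_bayes = b_n / (a_n - 1.0)
print(beta_bayes, sigma2_bayes)
```

Both quantities are posterior means, which is what the squared error loss function selects as the Bayes estimator.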

3.2.3. is assumed known

Case 1

Consider the GPLM in Eq. (1) with likelihood function of the pdf of response variable

(34)

Let

Then

(35)

Combining (34) and (35), using (15), and after some algebra, we obtain the following posterior distribution

(36)

Case 2

Consider the GPLM in Eq. (1) with likelihood function of the pdf of response variable

(37)

where

Let

(38)

Combining (37) and (38), using (15), and after some algebra, we obtain the following posterior distribution

(39)

which is an inverse Wishart distribution where

with expected value

Then, the Bayesian estimator under the squared error loss function is

The previous results are summarized in Table (1).

Table (1). The Posterior Distribution Functions
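The inverse Wishart case can be illustrated with the standard conjugate update for a covariance matrix when the mean is known. The sketch assumes multivariate normal observations with known mean mu and a prior Sigma ~ IW(Psi0, nu0). The symbols and the function name are assumptions, not the paper's notation.

```python
import numpy as np

def inverse_wishart_posterior(Y, mu, Psi0, nu0):
    """Conjugate update for Sigma when rows of Y are N(mu, Sigma) with known mu."""
    n, p = Y.shape
    R = Y - mu                          # centred observations
    Psi_n = Psi0 + R.T @ R              # posterior scale matrix
    nu_n = nu0 + n                      # posterior degrees of freedom
    mean = Psi_n / (nu_n - p - 1)       # posterior mean, valid for nu_n > p + 1
    return Psi_n, nu_n, mean

Y = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
Psi_n, nu_n, Sigma_bayes = inverse_wishart_posterior(
    Y, mu=np.zeros(2), Psi0=np.eye(2), nu0=4)
print(Sigma_bayes)  # 0.6 on the diagonal, 0 off the diagonal
```

The posterior mean is the Bayes estimator of the covariance matrix under squared error loss in this setting.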

4. Simulation Study

The aim of this simulation study is twofold: first, to evaluate the performance of the proposed Bayesian estimators; second, to compare the proposed technique with classical approaches. The simulation study is based on 500 Monte Carlo replications. Different sample sizes are used, ranging from small through moderate to large. Specifically, the sample sizes are fixed at n=50, n=100, n=200, n=500, n=1000, and n=2000. Different bandwidth parameters are also used, namely h=2, h=1, h=0.5, and h=0.2.

The estimation results and statistical analysis are obtained using the statistical package XploRe 4.8 (XploRe, 2000).

The methods are compared according to the following criteria.

First, the Average Mean of Squared Errors for

where

The second is the Average Mean of Squared Errors for

where

The third is the deviance, where

Deviance = -2 log-likelihood.
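The comparison criteria can be sketched in code. Because the exact AMSE formulas are not reproduced in this text, the version below is an assumption: it averages, over Monte Carlo replications, the mean squared error of each replication. The deviance is written for a Bernoulli response as an example of -2 times the log-likelihood. Function names are hypothetical.

```python
import numpy as np

def amse(estimates, truth):
    """Average over replications of the per-replication mean squared error.

    estimates: array of shape (replications, dimension)
    truth:     array of shape (dimension,)
    """
    estimates = np.asarray(estimates, dtype=float)
    truth = np.asarray(truth, dtype=float)
    per_rep_mse = np.mean((estimates - truth) ** 2, axis=1)
    return per_rep_mse.mean()

def bernoulli_deviance(y, mu, eps=1e-12):
    """Deviance = -2 * log-likelihood for a Bernoulli response."""
    y = np.asarray(y, dtype=float)
    mu = np.clip(np.asarray(mu, dtype=float), eps, 1.0 - eps)
    loglik = np.sum(y * np.log(mu) + (1.0 - y) * np.log(1.0 - mu))
    return -2.0 * loglik

print(amse([[1.0, 2.0], [3.0, 4.0]], [1.0, 2.0]))   # 2.0
print(bernoulli_deviance([1, 0], [0.5, 0.5]))       # 4 * log(2)
```

The same `amse` function applies to the nonparametric component by passing the fitted and true function values at the observation points.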

The results are shown in the following Tables (2) – (7).

Table (2). Simulation Results for n = 50

Table (3). Simulation Results for n = 100

Table (4). Simulation Results for n = 200

Table (5). Simulation Results for n = 500

Table (6). Simulation Results for n = 1000

Table (7). Simulation Results for n = 2000

From the results we can conclude that

1. The Bayesian estimation for the GPLM outperforms the classical estimation for all sample sizes under the squared error loss function. It gives highly efficient estimators with the smallest values of both AMSE criteria and of the mean deviance.

2. The Bayesian estimation for the GPLM using the back-fitting method outperforms the profile likelihood method and the generalized Speckman method for n=50 and n=100.

3. The Bayesian estimation for GPLM using the profile likelihood method and the generalized Speckman method outperforms the back-fitting method for n=1000 and n=2000.

5. Discussions

In this article, we introduce a new Bayesian regression model, called the Bayesian generalized partial linear model, which extends the generalized partial linear model (GPLM). Bayesian estimation and inference for the GPLM parameters have been considered using multivariate conjugate prior distributions under the squared error loss function.

A simulation study is conducted to evaluate the performance of the proposed Bayesian estimators and to compare the proposed technique with classical approaches. The simulation study is based on 500 Monte Carlo replications. Sample sizes range from small through moderate to large; specifically, they are fixed at n=50, n=100, n=200, n=500, n=1000, and n=2000. Different bandwidth parameters are also used, namely h=2, h=1, h=0.5, and h=0.2.

From the results of the simulation study, we can conclude the following. First, the Bayesian estimation for the GPLM outperforms the classical estimation for all sample sizes under the squared error loss function. It gives highly efficient estimators with the smallest values of both AMSE criteria and of the mean deviance. Second, the Bayesian estimation for the GPLM using the back-fitting method outperforms the profile likelihood method and the generalized Speckman method for n=50 and n=100. Third, the Bayesian estimation for the GPLM using the profile likelihood method and the generalized Speckman method outperforms the back-fitting method for n=1000 and n=2000. Finally, the Bayesian estimation of parameters for the GPLM gives small values of both AMSE criteria compared to the classical estimation for all sample sizes.

References

[1]	Bornkamp, B., and Ickstadt, K. (2009) Bayesian nonparametric estimation of continuous monotone functions with applications to dose–response analysis, Biometrics, 65, 198–205.
[2]	Bouaziz, O., Geffray, S., and Lopez, O. (2015) Semiparametric inference for the recurrent events process by means of a single-index model, Journal of Theoretical and Applied Statistics, 49 (2), 361–385.
[3]	Burda, M., Härdle, W., Müller, M., and Werwatz, A. (1998) Semiparametric analysis of German east-west migration: facts and theory, Journal of Applied Econometrics, 13 (5), 525–542.
[4]	Guo, X., Niu, G., Yang, Y., and Xu, W. (2015) Empirical likelihood for single index model with missing covariates at random, 49 (3), 588–601.
[5]	Holmes, C.C., and Heard, N.A. (2003) Generalized monotonic regression using random change points, Statistics in Medicine, 22, 623–638.
[6]	Johnson, M.S. (2007) Modeling dichotomous item responses with free-knot splines, Journal of Computational Statistics & Data Analysis, 51, 4178–4192.
[7]	Lang, S., and Brezger, A. (2004) Bayesian P-splines, Journal of Computational and Graphical Statistics, 13, 183–212.
[8]	Meyer, M.C., Hackstadt, A.J., and Hoeting, J.A. (2011) Bayesian estimation and inference for generalized partial linear models using shape-restricted splines, Journal of Nonparametric Statistics, 23, 867–884.
[9]	Müller, M. (2001) Estimation and testing in generalized partial linear models: a comparative study, Statistics and Computing, 11, 299–309.
[10]	Murphy, S.A., and Van der Vaart, A.W. (2000) On profile likelihood, Journal of the American Statistical Association, 95, 449–485.
[11]	Neelon, B., and Dunson, D.B. (2004) Bayesian isotonic regression and trend analysis, Biometrics, 60, 398–406.
[12]	Peng, H. (2003) Efficient estimation in semi-parametric generalized linear models, Proceedings of JSM 2003.
[13]	Peng, H., and Wang, X. (2004) Moment estimation in semi-parametric generalized linear models, Nonparametric Statistics, 1–22.
[14]	Perron, F., and Mengersen, K. (2001) Bayesian nonparametric modeling using mixtures of triangular distributions, Biometrics, 57, 518–528.
[15]	Powell, J.L. (1994) Estimation of semi-parametric regression models, Handbook of Econometrics, 4, Ch. 41.
[16]	Brezger, A., and Steiner, W.J. (2008) Monotonic regression based on Bayesian P-splines: an application to estimating price response functions from store-level scanner data, Journal of Business & Economic Statistics, 26, 91–104.
[17]	Ramgopal, P., Laud, P.W., and Smith, A.F.M. (1993) Nonparametric Bayesian bioassay with prior constraints on the shape of the potency curve, Biometrika, 80, 489–498.
[18]	Ruppert, D., Wand, M.P., and Carroll, R.J. (2003) Semi-parametric regression, Cambridge University Press, London.
[19]	Severini, T.A., and Staniswalis, J.G. (1994) Quasi-likelihood estimation in semi-parametric models, Journal of the American Statistical Association, 89, 501–511.
[20]	Severini, T.A., and Wong, W.H. (1992) Profile likelihood and conditionally parametric models, The Annals of Statistics, 20, 1768–1802.
[21]	Shively, T.S., Sager, T.W., and Walker, S.G. (2009) A Bayesian approach to non-parametric monotone function estimation, Journal of the Royal Statistical Society B, 71, 159–175.
[22]	Sperlich, S., Härdle, W., and Aydinli, G. (2006) The art of semiparametrics, Physica-Verlag, Germany.
[23]	Wang, X. (2008) Bayesian free-knot monotone cubic spline regression, Journal of Computational and Graphical Statistics, 17, 373–387.
[24]	XploRe (2000), "XploRe-the interactive statistical computing environment" WWW: http://www.xplore-stat.de.
[25]	Zhang, J., Wang, X., Yu, Y., and Gai, Y. (2014) Estimation and variable selection in partial linear single index models with error-prone linear covariates, Journal of Theoretical and Applied Statistics, 48 (5), 1048–1070.
