International Journal of Probability and Statistics

p-ISSN: 2168-4871    e-ISSN: 2168-4863

2021;  10(2): 27-45

doi:10.5923/j.ijps.20211002.01

Received: May 8, 2021; Accepted: Jun. 2, 2021; Published: Jun. 15, 2021

 

Berry-Esseen Type Bound in Partially Linear Regression Model under Mixing Sequences

Sallieu Kabay Samura

Department of Mathematics and Statistics, Fourah Bay College, University of Sierra Leone, Sierra Leone

Correspondence to: Sallieu Kabay Samura, Department of Mathematics and Statistics, Fourah Bay College, University of Sierra Leone, Sierra Leone.

Email:

Copyright © 2021 The Author(s). Published by Scientific & Academic Publishing.

This work is licensed under the Creative Commons Attribution International License (CC BY).
http://creativecommons.org/licenses/by/4.0/

Abstract

It is well known that confidence intervals for $\beta$ and $g(\cdot)$ in the partially linear regression model rest on the limit distributions of their estimators. However, the accuracy of these confidence intervals depends on how fast the exact distributions of the estimators converge to their limits. As a result, Berry-Esseen type bounds can be used to assess this accuracy. The aim of this paper is to study Berry-Esseen type bounds for the estimators of $\beta$ and $g(\cdot)$ in the partially linear regression model whose errors are generated from stationary $(\alpha,\beta)$-mixing random variables. By choosing suitable weight functions, Berry-Esseen type bounds are established for both estimators. Simulation studies are conducted to demonstrate the performance of the proposed procedure.

Keywords: Partially linear model, Berry-Esseen bound, (α,β)-mixing sequence

Cite this paper: Sallieu Kabay Samura, Berry-Esseen Type Bound in Partially Linear Regression Model under Mixing Sequences, International Journal of Probability and Statistics, Vol. 10 No. 2, 2021, pp. 27-45. doi: 10.5923/j.ijps.20211002.01.

1. Introduction

The partially linear regression model combines a linear part and a nonparametric part: the relationship between the response and some explanatory variables is linear, whereas the remaining predictors enter the model in an unspecified form. Opsomer and Ruppert (1999) argued for the advantages of the partially linear regression model, including that there is less worry of overfitting, that it is more easily interpretable, and that the estimator of the parametric component is more efficient. Various estimation and variable selection methods for the partially linear regression model have also been developed; we refer to Horowitz (2009), Liu et al. (2011), Roozbeh et al. (2012), Amini and Roozbeh (2016), Roozbeh and Arashi (2016), Roozbeh (2018) and Amini and Roozbeh (2019), to mention a few.
Since its introduction by Engle et al. (1986), the partially linear regression model has been studied by many authors. For example, Heckman (1986), Rice (1986), Chen (1988) and Speckman (1988) studied consistency properties of the estimator of $\beta$ under different assumptions. Schick (1996) and Liang and Härdle (1997) extended the root-$n$ consistency and asymptotic results to the case of heteroscedasticity. Härdle et al. (2000) provided a comprehensive reference on the partially linear model. Chen et al. (1998) and Gao et al. (1994) established the strong consistency and asymptotic normality, respectively, for the least squares estimator and the weighted least squares estimator (WLSE, for short) of $\beta$ based on nonparametric estimates, and You et al. (2007) further studied the model and developed an inferential procedure which includes a test of heteroscedasticity, a two-step estimator of $\beta$, mean squared errors of the estimators, and a bootstrap goodness-of-fit test. If the nonparametric component is absent, the model boils down to the heteroscedastic linear model, whose asymptotic properties under the WLSE of $\beta$ were studied by Carroll (1982), Robinson (1987) and Carroll and Härdle (1989), respectively.
In this paper, we further study the limit behaviors of the estimators in the partially linear regression model under $(\alpha,\beta)$-mixing random variables, a dependence concept first introduced by Bradley (1985) as follows.
Let $\{X_n, n \ge 1\}$ be a sequence of random variables defined on a fixed probability space $(\Omega, \mathcal{F}, P)$. For a set $S$ of positive integers, denote $\mathcal{F}_S = \sigma(X_i, i \in S)$. Let $p$ and $q$ be positive integers. Given $\sigma$-algebras $\mathcal{A}$ and $\mathcal{B}$ in $\mathcal{F}$, a measure of their dependence is defined, and the $(\alpha,\beta)$-mixing coefficients $\alpha(n)$ are obtained by taking the supremum of this measure over $\sigma$-algebras generated by blocks of the sequence separated by a gap of length $n$.
Definition 1.1. A sequence of random variables $\{X_n, n \ge 1\}$ is said to be $(\alpha,\beta)$-mixing if $\alpha(n) \to 0$ as $n \to \infty$.
Since the concept of $(\alpha,\beta)$-mixing was introduced by Bradley (1985), many limit theorems have been established. Bradley (1985) discussed central limit theorems under absolute regularity; Shao (1993) established limit theorems for $(\alpha,\beta)$-mixing sequences; Cai (1991) obtained strong consistency and rates for recursive nonparametric conditional probability density estimators under $(\alpha,\beta)$-mixing conditions; Lu and Lin (1997) gave covariance bounds for $(\alpha,\beta)$-mixing sequences; Shen and Zhang (2011) studied some convergence theorems for $(\alpha,\beta)$-mixing random variables and obtained new strong laws of large numbers for weighted sums of $(\alpha,\beta)$-mixing random variables; Gao (2016) investigated $(\alpha,\beta)$-mixing sequences that are stochastically dominated and presented some strong stability results; Yu (2016) showed a Rosenthal-type inequality for $(\alpha,\beta)$-mixing sequences and investigated strong convergence theorems.
The aim of this paper is to further study Berry-Esseen type bounds for the estimators of $\beta$ and $g(\cdot)$ in the partially linear regression model (2.1) whose errors are generated from stationary $(\alpha,\beta)$-mixing random variables. By choosing suitable weight functions, Berry-Esseen type bounds are established for both estimators.
This work is organised as follows. In Section 2, we recall the partially linear regression model and construct the least squares estimators of the parametric and nonparametric components. The main results and a numerical analysis (simulation studies) are presented in Section 3. The proofs of the main results are provided in Section 4.
Throughout this paper, the symbols $C, C_1, C_2, \ldots$ denote positive constants whose values may differ from place to place. Let $I(A)$ be the indicator function of a set $A$.

2. Model and Estimation

Consider the following partially linear regression model:
$$y_i = x_i\beta + g(t_i) + \varepsilon_i, \quad i = 1, \ldots, n, \qquad (2.1)$$
where $\beta$ is an unknown parameter of interest, $(x_i, t_i)$ are nonrandom design points, $y_i$ are the response variables, $\varepsilon_i$ are random errors, and $g(\cdot)$ is an unknown function defined on the closed interval $[0,1]$.
If $\beta$ is the true parameter, then $y_i - x_i\beta = g(t_i) + \varepsilon_i$, so model (2.1) reduces to a nonparametric regression model. Since $E\varepsilon_i = 0$, we have $g(t_i) = E(y_i - x_i\beta)$. Using the least squares method, we obtain an estimator of $\beta$ by minimizing
$$SS(\beta) = \sum_{i=1}^{n}\Big[y_i - x_i\beta - \sum_{j=1}^{n}W_{nj}(t_i)(y_j - x_j\beta)\Big]^2.$$
The minimizer is found to be
$$\hat{\beta}_n = \frac{\sum_{i=1}^{n}\tilde{x}_i\tilde{y}_i}{\sum_{i=1}^{n}\tilde{x}_i^2}, \qquad (2.2)$$
where $\tilde{x}_i = x_i - \sum_{j=1}^{n}W_{nj}(t_i)x_j$, $\tilde{y}_i = y_i - \sum_{j=1}^{n}W_{nj}(t_i)y_j$, and $W_{nj}(\cdot)$ are weight functions. Then, based on $\hat{\beta}_n$, we define the estimator of the nonparametric function $g(\cdot)$ by
$$\hat{g}_n(t) = \sum_{j=1}^{n}W_{nj}(t)(y_j - x_j\hat{\beta}_n). \qquad (2.3)$$
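As an illustration of the estimators in (2.2) and (2.3), the following Python sketch computes the least squares estimator of the parametric component and the plug-in estimator of the nonparametric component. The function name `pls_estimators` and the array layout (a weight matrix `W` with `W[i, j]` playing the role of the weight attached to the j-th observation at the i-th design point) are our own illustrative choices, not notation fixed by the paper:

```python
import numpy as np

def pls_estimators(x, y, W):
    """Least squares estimators (2.2)-(2.3) for the partially linear model
    y_i = x_i * beta + g(t_i) + eps_i.

    x, y : length-n arrays of covariates and responses.
    W    : (n, n) weight matrix; row i holds the weights used at t_i.
    Returns (beta_hat, g_hat), where g_hat[i] estimates g(t_i).
    """
    x_tilde = x - W @ x                       # x_i minus its weighted smooth
    y_tilde = y - W @ y                       # y_i minus its weighted smooth
    beta_hat = (x_tilde @ y_tilde) / (x_tilde @ x_tilde)
    g_hat = W @ (y - x * beta_hat)            # plug-in estimate of g at the t_i
    return beta_hat, g_hat
```

With `W = 0` the estimator reduces to ordinary least squares through the origin, which is a convenient sanity check on the implementation.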
To derive the Berry-Esseen bounds for the estimators, we make the following assumptions:
A1. There exists a function $h(\cdot)$ on $[0,1]$ such that $x_i = h(t_i) + u_i$, where the sequence $\{u_i\}$ satisfies:
(i)
(ii)
A2. $g(\cdot)$ and $h(\cdot)$ are defined on $[0,1]$ and satisfy a Lipschitz condition of order 1.
A3. The probability weight functions $W_{nj}(\cdot)$ are defined on $[0,1]$ and satisfy
A4. for some
A5. There exist positive integers $p = p(n)$ and $q = q(n)$ such that
A6. The spectral density $f(\omega)$ of the error process satisfies $0 < c_1 \le f(\omega) \le c_2 < \infty$.
Remark 2.1. Conditions (A1)-(A3) have been used frequently by many authors, for example, Gao et al. (1996), Sun et al. (2002), You et al. (2005), Liang et al. (2006), You and Chen (2007) and so on. (A4) is adopted in Sun et al. (2002), You et al. (2005), Liang et al. (2006), Liang and Fan (2009) and so forth. Moreover, if the functions $g(\cdot)$ and $h(\cdot)$ satisfy a Lipschitz condition of order 1 on $[0,1]$, then (A3)(iii) implies that

3. Main Results and Numerical Analysis

3.1. Main Results

In this subsection, we present the Berry-Esseen type bounds for the estimators $\hat{\beta}_n$ and $\hat{g}_n(t)$. We first introduce some notation to be used in the theorems below.
Theorem 3.1. Let the errors form a mean zero $(\alpha,\beta)$-mixing sequence, and suppose that conditions (A1)-(A6) are satisfied. Then we have
(3.1)
Corollary 3.1 Suppose that (A1)-(A4) hold with and where and for some . Let , for some and for some , Then
Theorem 3.2. Suppose that the conditions in Theorem 3.1 hold. Let and If and converge to zero, then
(3.2)
Corollary 3.2 Set for some and Suppose that hold with for each and for some Let for some and for some , then

3.2. Numerical Simulation

In this subsection, we carry out a simulation study of the asymptotic normality of the estimators $\hat{\beta}_n$ and $\hat{g}_n(t)$, respectively. The observations are generated from the following model:
where the error process is of AR(1) type driven by MA(1) innovations built from stationary $(\alpha,\beta)$-mixing random variables. Here we choose nearest neighbor weights as the weight functions $W_{nj}(\cdot)$: for each $t_i$, order the design points by their distance $|t_j - t_i|$, breaking ties by placing the point with the smaller index first, and take the $k_n$ nearest neighbors. The nearest neighbor weight function is then defined by $W_{nj}(t_i) = 1/k_n$ if $t_j$ is among the $k_n$ nearest neighbors of $t_i$, and $W_{nj}(t_i) = 0$ otherwise.
We generate the observed data with sample sizes $n = 50$, $100$ and $150$, respectively. We use R software to compute $\hat{\beta}_n$ and $\hat{g}_n(t)$ and obtain the Q-Q plots of the estimators, based on 500 replications.
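The simulation design described above can be sketched in Python (the paper itself used R). The AR(1) and MA(1) parameters, the regression function $g$, and the neighborhood size $k_n$ below are illustrative assumptions, since the text does not pin them down:

```python
import numpy as np

rng = np.random.default_rng(1)

def nn_weights(t, k):
    """Nearest neighbor weights: W[i, j] = 1/k if t_j is among the
    k nearest design points to t_i (ties: smaller index first)."""
    n = len(t)
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(np.abs(t - t[i]), kind="stable")
        W[i, order[:k]] = 1.0 / k
    return W

def simulate(n, beta=2.0, rho=0.3, theta=0.5, k=None):
    """One replication of y_i = x_i*beta + g(t_i) + eps_i with AR(1)
    errors driven by MA(1) innovations; returns the estimate of beta.
    All numeric parameter values here are illustrative choices."""
    if k is None:
        k = max(1, round(n ** (2.0 / 3.0)))      # illustrative neighborhood size
    t = np.sort(rng.uniform(0.0, 1.0, n))        # design points in [0, 1]
    x = t + rng.normal(0.0, 0.5, n)              # linear covariate, correlated with t
    z = rng.normal(0.0, 0.1, n + 1)
    e = z[1:] + theta * z[:-1]                   # MA(1) innovations
    eps = np.empty(n)
    eps[0] = e[0]
    for i in range(1, n):
        eps[i] = rho * eps[i - 1] + e[i]         # AR(1) error process
    y = x * beta + np.sin(2.0 * np.pi * t) + eps # g(t) = sin(2*pi*t), illustrative
    W = nn_weights(t, k)
    x_tilde, y_tilde = x - W @ x, y - W @ y      # remove nonparametric trend
    return (x_tilde @ y_tilde) / (x_tilde @ x_tilde)
```

Repeating `simulate` over, say, 500 replications and passing the standardized estimates to a normal Q-Q plot (e.g. `scipy.stats.probplot`) reproduces the kind of diagnostic shown in Figures 1 and 2.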
Figure 1. Q-Q plot of $\hat{\beta}_n$ with $n$ = 50, 100 and 150, respectively
Figure 2. Q-Q plot of $\hat{g}_n(t)$ with $n$ = 50, 100 and 150, respectively

4. Proofs of the Main Results

We first introduce several lemmas which will be used to prove the main results of the paper.
Lemma 3.1. (cf. Yu, 2016) Let $\{X_n\}$ be a sequence of mean zero $(\alpha,\beta)$-mixing random variables with finite moments of suitable order, and assume the weights form an array of real numbers. Then there exists a positive constant $C$, depending only on the moment and mixing parameters, such that
Lemma 3.2. (Liang and Fan, 2009) Let be random variables. For positive numbers we have that
Lemma 3.3. (Lu and Lin (1997)) Let $\{X_n\}$ be a sequence of $(\alpha,\beta)$-mixing random variables satisfying suitable moment conditions. Then
Lemma 3.4. Let $\{X_n\}$ be a sequence of $(\alpha,\beta)$-mixing random variables. Suppose that $p$ and $q$ are two positive integers. Then for any
Proof. It is easily checked that
(4.1)
Noting that we have
(4.2)
It follows from Lemma 3.3 and that
(4.3)
and
(4.4)
Applying Lemma 3.3 and invoking the same inequality again, we find that
(4.5)
and
(4.6)
By (4.1)-(4.6), we have
Proceeding in this manner, we obtain
This completes the proof of the lemma.
Lemma 3.5. Let be an array of real numbers such that and where are some positive numbers. Suppose that for some and for some Then for any
and
where are positive numbers and for some and any
Proof. We only prove the first inequality, and the second one is completely analogous. According to the definition of we have that
By and Markov's inequality, we have
Note that
Applying Lemma 3.1 with Noting that and implies
we have
where the inequality in the last line above follows from changing the order of summation. The proof is completed.
Lemma 3.6. Under the assumptions of Theorem 3.1, the following statements hold.
Proof. (i) Applying Lemma 3.2, we have by (A1)(iii) that
(4.7)
Similarly, we have by Lemma 3.2 and inequality again that
Moreover, we have by (A1)(iii) again that
Hence (i) has been proved.
(ii) The inequalities (a)-(e) can be proved by applying Lemma 3.5. Now we will verify them one by one.
(a) Noting by (A1) (iii) that
and the result follows immediately from Lemma 3.5.
(b) Similarly, noting additionally that we have by Lemma 3.5 again that the result follows.
(c) Note that Therefore, we have
We have by Lemma 3.5 again that
(d) Utilizing the Abel inequality (see Mitrinović, 1970, Theorem 1, p. 32), we have by (A1) and (A3) that
and
Thus the result follows from Lemma 3.5 immediately.
(e) It follows that
Similar to the proofs of (c) and (d), we have
Choosing we have by Lemma 3.5 again that
(f) The inequality (f) can be derived immediately by (i) and Markov's inequality.
(g) It follows from the Abel Inequality and Remark 2.1 again that
The proof is completed.
Lemma 3.7. Under the assumptions of Theorem 3.1, we have
Proof. Similar to Lemma 3.6 in Liang and Fan (2009), we can complete the proof of the lemma. The details are omitted.
Assume that are independent random variables and has the same distribution as that of for each Let then
Lemma 3.8. Under the assumptions of Theorem 3.1, we have
Proof. We have by Berry-Esseen inequality (Petrov, 1995, p.154, Theorem 5.7) that
Applying Lemma 3.3 with and noting that we have by choosing that
which together with from Lemma 3.7 yields that
The proof is completed.
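For reference, the Berry-Esseen inequality for independent, mean zero, not necessarily identically distributed random variables invoked above can be stated as follows (the value of the absolute constant varies by source, and the notation here is our own choice):

```latex
\sup_{x}\Big|P\Big(\frac{1}{\sqrt{B_n}}\sum_{j=1}^{n}X_j \le x\Big)-\Phi(x)\Big|
  \le C\, B_n^{-3/2}\sum_{j=1}^{n}\mathbb{E}|X_j|^{3},
\qquad B_n=\sum_{j=1}^{n}\mathbb{E}X_j^{2},
```

where $\Phi$ denotes the standard normal distribution function and $C$ is an absolute constant.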
Lemma 3.9. Under the assumptions of Theorem 3.1, we have
Proof. Suppose that are characteristic functions of respectively.
We have by Esseen inequality (Petrov, 1995, p.146, Theorem 5.3) that for any
Applying Lemma 3.4 with and Lemma 3.2 with and similar to the proof of (3.2), we have by that
which implies that On the other hand, from Lemma 3.7 and we have
which derives that By choosing we can obtain that
This completes the proof of the lemma.
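For reference, Esseen's smoothing inequality, which underlies the characteristic function argument above, can be stated as follows (the constants $c_1, c_2$ are absolute, and the notation here is our own choice):

```latex
\sup_{x}|F(x)-G(x)|
  \le c_1 \int_{-T}^{T}\Big|\frac{f(t)-g(t)}{t}\Big|\,dt
   + c_2\,\frac{\sup_{x}|G'(x)|}{T},
```

valid for any $T > 0$, where $f$ and $g$ are the characteristic functions corresponding to $F$ and $G$, and $G$ has a bounded derivative.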
Proof of Theorem 3.1 We can observe that
It is easy to show that
and
By changing the order of summation, we obtain that
Let Denote
Then we have that
It follows from Lemmas 3.7-3.9 that
Consequently, from (3.1), Lemma 3.1 and Lemma 3.6, we have
The proof of the theorem is completed.
To prove Theorem 3.2, we need the following lemmas.
Lemma 3.10. Under the assumptions of Theorem 3.2, the following statements hold.
Furthermore, if for some positive numbers then
Proof. (i) Similar to the proof of (3.2), we have by that
Similarly,
Noting that analogous to the proof of (3.3) we have
(ii) Noting that
and
the first inequality follows immediately from Lemma 3.5.
Similarly, noting that
we can also get the second one from Lemma 3.5.
(iii) Observing that, as in Theorem 3.1, the proof of (iii) can be obtained by following the proof of Lemma 3.9 in Liang and Fan (2009). The details are omitted here.
Lemma 3.11. Under the assumptions of Theorem 3.2, we have
Proof. Similar to the proof of Lemma 3.10 in Liang and Fan (2009), we can prove the lemma.
The details are omitted.
Assume that are independent random variables and has the same distribution as that of for each Let .
Lemma 3.12. Under the assumptions of Theorem 3.2, we have
Proof. Note that Similar to the proof of Lemma 3.8, we have by Lemma 3.3, and changing the order of summation that
which together with from Lemma 3.11 yields that
The proof is completed.
Lemma 3.13. Under the assumptions of Theorem 3.2, we have
Proof. Suppose that are the characteristic functions of respectively. Similar to the proof of Lemma 3.9, we have
together with Lemma 3.11 and 3.12 yield that
This completes the proof of the lemma.
Proof of Theorem 3.2. We have that
Note that
and
Similar to the decomposition for we denote
Then we have that
we have by Lemmas 3.11-3.13 that
which together with (3.4), Lemma 3.1 and Lemma 3.10 yields that
The proof is completed.

References

[1]  Amini M, Roozbeh M., 2016. Least trimmed squares ridge estimation in partially linear regression models. J Stat Comput Simul. 86(14): 2766-2780.
[2]  Amini M, Roozbeh M., 2019. Improving the prediction performance of the LASSO by subtracting the additive structural noises. Comput Stat.34(1):415-432.
[3]  Bradley R.C, Bryc W., 1985. Multilinear forms and measures of dependence between random variables. Journal of Multivariate Analysis, 16, 335-367.
[4]  Cai Z.W., 1991. Strong consistency and rates for recursive nonparametric conditional probability density estimates under (α, β)-mixing conditions. Stochastic Processes and Their Applications, 38, 323-333.
[5]  Carroll R.J., 1982. Adapting for heteroscedasticity in linear models. Annals of Statistics, 10(4), 1224-1232.
[6]  Carroll R.J., Härdle W., 1989. Second order effects in semiparametric weighted least squares regression. Statistics, 2, 179-186.
[7]  Chen H., 1988. Convergence rates for parametric components in a partial linear model. Annals of Statistics, 16, 136-146.
[8]  Chen M.H., Ren, Z., Hu S.H., 1998. Strong consistency of a class of estimators in partial linear model. Acta Mathematica Sinica, 41(2), 429-439.
[9]  Engle R.F, Granger C.W.J., Rice J., Weiss G.H., 1986. Semiparametric estimates of the relation between weather and electricity sales. Journal of the American Statistical Association, 81(394), 310-320.
[10]  Fan J., Gijbels I., 1996. Local Polynomial Modelling and Its Applications. Chapman and Hall, London.
[11]  Gao P., 2016. Strong stability of (α, β)-mixing sequences. Applied Mathematics-A Journal of Chinese Universities, Series B, 31(4), 405-412.
[12]  Gao J.T., Chen X.R., Zhao L.C., 1994. Asymptotic normality of a class of estimators in partial linear models. Acta Mathematica Sinica, 37(2), 256-268.
[13]  Härdle W., Liang H., Gao J., 2000. Partially Linear Models. Springer Verlag.
[14]  Heckman N.E., 1986. Spline smoothing in a partly linear model. Journal of the Royal Statistical Society, Series B, Statistical Methodology, 48, 244-248.
[15]  Horowitz JL., 2009. Semiparametric and nonparametric methods in econometrics: Springer series in statistics. New York: Springer-Verlag.
[16]  Liang H., Härdle W., 1997. Asymptotic properties of parametric estimation in partially linear heteroscedastic models. Technical Report no 33. Humboldt-Universität zu Berlin.
[17]  Liang H.Y., Jing B.Y., 2009. Asymptotic normality in partially linear models based on dependent errors. Journal of Statistical Planning and Inference, 139, 1357-1371.
[18]  Lu C.R., Lin Z.Y., 1997. Limit theory for mixed dependent variables. Science Press of China, Beijing.
[19]  Liu X, Wang L, Liang H., 2011. Variable selection and estimation for semiparametric additive partial linear models. Stat Sin. 21:1225-1248.
[20]  Opsomer JD, Ruppert D., 1999. A root-n consistent backfitting estimator for semiparametric additive modeling. J Comput Graph Stat. 8:715-732.
[21]  Rice J., 1986. Convergence rates for partially linear spline models. Statistics and Probability Letters, 4, 203-208.
[22]  Robinson P.M., 1987. Asymptotically efficiency estimation in the presence of heteroscedasticity of unknown form. Econometrica, 55, 875-891.
[23]  Roozbeh M, Arashi M, Gasparini M., 2012. Seemingly unrelated ridge regression in semiparametric models. Commun Stat Theory Methods. 41(8):1364-1386
[24]  Roozbeh M, Arashi M., 2016. Shrinkage ridge regression in partial linear models. Commun Stat Theory Methods. 45(20): 6022-6044.
[25]  Roozbeh M., 2018. Optimal QR-based estimation in partially linear regression models with correlated errors using GCV criterion. Comput Stat Data Anal. 117:45-61.
[26]  Schick A., 1996. Root-n consistent estimation in partly linear regression models. Statistics and Probability Letters, 28, 353-358.
[27]  Speckman P., 1988. Kernel smoothing in partial linear models. Journal of the Royal Statistical Society, Series B, Statistical Methodology, 50, 413-436.
[28]  Shao Q.M., 1989. Limit theorems for the partial sums of dependent and independent random variables. University of Science and Technology of China, 1-309, Hefei.
[29]  Shen Y., Zhang Y.J., 2011. Strong limit theorems for (α, β)-mixing random variable sequences. Journal of University of Science and Technology of China, 41(9), 778-795.
[30]  You J., Chen G., 2007. Semiparametric generalized least squares estimation in partially linear regression models with correlated errors. Journal of Statistical Planning and Inference, 137, 117-132.
[31]  Yu C.Q., 2016. Convergence theorems of weighted sum for (α, β)-mixing sequences. Journal of Hubei University (Natural Science), 38(6), 477-487.