International Journal of Statistics and Applications
p-ISSN: 2168-5193 e-ISSN: 2168-5215
2013; 3(5): 141-154
doi:10.5923/j.statistics.20130305.01
Kasahun Takele
Haramaya University, Department of Statistics, Ethiopia
Correspondence to: Kasahun Takele, Haramaya University, Department of Statistics, Ethiopia.
Copyright © 2012 Scientific & Academic Publishing. All Rights Reserved.
Malnutrition among children under age five is a major public health problem in the developing world, particularly in Ethiopia. The aim of this study was to identify the determinants of child malnutrition using 2011 DHS data. The overall prevalence of underweight among children in Ethiopia was 36.4%. A Bayesian semi-parametric regression model was used to flexibly model the effects of selected socioeconomic, demographic, health and environmental covariates. Inference was made using a Bayesian approach with Markov chain Monte Carlo (MCMC) techniques. The covariates sex of child, preceding birth interval, birth order of child, place of residence, mother’s education level, toilet facility, number of household members, household economic status, cough, diarrhea and fever were found to be the most important determinants of children’s nutritional status in Ethiopia. The effects of child’s age, mother’s age at child birth, and mother’s body mass index were also explored non-parametrically as determinants of children’s nutritional status. To reduce childhood malnutrition, emphasis should be given to improving the knowledge and practice of parents on appropriate young child feeding and to frequent growth monitoring together with appropriate and timely interventions.
Keywords: Undernutrition, Underweight, Semi-Parametric Regression Model, MCMC
Cite this paper: Kasahun Takele, Semi-Parametric Analysis of Children Nutritional Status in Ethiopia, International Journal of Statistics and Applications, Vol. 3 No. 5, 2013, pp. 141-154. doi: 10.5923/j.statistics.20130305.01.
$$Z = \frac{AI - MAI}{\sigma}$$
where AI refers to the child's anthropometric indicator (weight at a certain age in our case), MAI refers to the median of the reference population and σ refers to the standard deviation of the reference population. The weight‐for‐age z‐score is an indicator of the nutritional status of a child. Here the main interest is in modelling the dependence of nutritional status on covariates including the age of the child, the body mass index of the child's mother, the district the child lives in, mother's education, mother's working status, sex of child, birth order and birth interval, household economic status, and health and environmental conditions.
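The z-score definition can be illustrated with a minimal sketch. The function names and the reference values below are hypothetical and chosen only for illustration; the cutoff of −2 is the usual WHO convention for underweight and is an assumption here, since this section does not state it.

```python
def weight_for_age_z(ai, mai, sigma):
    """Return the z-score (AI - MAI) / sigma for one child.

    ai    : the child's anthropometric indicator (here: weight in kg)
    mai   : median of the reference population at the same age and sex
    sigma : standard deviation of the reference population
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (ai - mai) / sigma


def is_underweight(z, cutoff=-2.0):
    """Flag a child as underweight when the z-score falls below the cutoff.

    The default cutoff of -2 is an assumption (WHO convention),
    not a value taken from this section.
    """
    return z < cutoff


# Hypothetical reference values, for illustration only.
z = weight_for_age_z(ai=9.0, mai=12.0, sigma=1.25)
print(z, is_underweight(z))
```

Keeping the cutoff as a parameter makes it easy to look at other thresholds, for example −3 for severe underweight.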
The additive model has the form
$$y_i = \beta_0 + f_1(x_{i1}) + \dots + f_p(x_{ip}) + \varepsilon_i \tag{1}$$
where $f_1, \dots, f_p$ are smooth functions. These functions are not given a parametric form but instead are estimated by nonparametric methods.

While Gaussian models can be used in many statistical applications, there are types of problems for which they are not appropriate. For example, the normal distribution may not be adequate for modeling discrete responses such as counts, or bounded responses such as proportions. Generalized additive models can therefore be applied to a much wider range of data analysis problems. Generalized additive models consist of a random component, an additive component, and a link function relating these two components. Generalized additive models[14] assume that the response Y, the random component, has a density in the exponential family. That is, conditional on covariates $x_i$, the responses $y_i$ are independent and the distribution of $y_i$ belongs to a simple exponential family, which is expressed as:
$$f(y_i \mid \theta_i, \phi) = \exp\left\{ \frac{y_i\theta_i - b(\theta_i)}{\phi} + c(y_i, \phi) \right\} \tag{2}$$
where $\theta_i$ is the natural parameter of the exponential family, $\phi$ is a scale or dispersion parameter common to all observations, and $b(\cdot)$ and $c(\cdot)$ are functions depending on the specific exponential family. Moreover, for the conditional expectation $\mu_i = E(y_i \mid x_i)$ and with link function $g(\cdot)$ we have
$$g(\mu_i) = \eta_i \tag{3}$$
The predictor $\eta_i$ in the generalized additive model can be expressed as
$$\eta_i = \beta_0 + f_1(x_{i1}) + \dots + f_p(x_{ip}) \tag{4}$$
where $f_1, \dots, f_p$ are smooth functions that define the additive component. Finally, the relationship between the mean $\mu$ of the response variable and $\eta$ is defined by a link function.

Generalized additive models extend generalized linear models, but the two serve different analytic purposes. Generalized linear models emphasize estimation and inference for the parameters of the model, while generalized additive models focus on exploring data non-parametrically.

The parametric regression model has the linear predictor
$$\eta_i = x_i'\beta + w_i'\gamma \tag{5}$$
where $x_i$ is a vector of continuous covariates, $\beta$ is a vector of regression coefficients for the continuous covariates, $w_i$ is a vector of categorical covariates, and $\gamma$ is a vector of regression coefficients for the categorical covariates. In the Bayesian parametric regression model, one routinely assumes diffuse priors $p(\beta) \propto \text{const}$ and $p(\gamma) \propto \text{const}$ for the parameter vectors β and γ. A possible alternative would be to work with multivariate Gaussian distributions γ ~ N(γ0, Σγ0) and β ~ N(β0, Σβ0). However, since in most cases a non‐informative prior is desired, it is sufficient to work with diffuse priors.

In this study, the continuous covariates child’s age (Cage), the mother’s age at birth (Mage), and the mother’s Body Mass Index (BMI, kg/m²) are assumed to have non-linear effects on child nutritional status. Hence, a more flexible approach is needed, one that relaxes the parametric linear assumption and lets the data determine the functional form of the continuous covariates. This can be done with a non-parametric regression model. Non-parametric regression analysis is regression without an assumption of linearity. The scope of non-parametric regression is very broad, ranging from "smoothing" the relationship between two variables in a scatter plot to multiple-regression analysis and generalized regression models (for example, logistic non-parametric regression for a binary response variable). To specify a non-parametric regression model, an appropriate function space that contains the unknown regression function needs to be chosen. This choice is usually motivated by smoothness properties which the regression function can be assumed to possess. The semi-parametric regression model is obtained by extending model (5) as follows:
$$\eta_i = f_1(x_{i1}) + \dots + f_p(x_{ip}) + w_i'\gamma \tag{6}$$
where $f_1, \dots, f_p$ are smooth functions of the continuous covariates. The unknown functions $f_j$ and parameters $\gamma$ as well as the variance parameter ($\tau^2$) are considered as random variables and have to be supplemented with appropriate prior assumptions. In the absence of any prior knowledge we assume independent diffuse priors $p(\gamma_j) \propto \text{const}$ for the parameters of fixed effects. Another common choice is highly dispersed Gaussian priors.

Several alternatives are available as smoothness priors for the unknown functions $f_j(x_j)$. Among others, random walk priors[20], Bayesian penalized splines[21] and Bayesian smoothing splines (Hastie and Tibshirani, 2000)[22] are the most commonly used. In this study, the Bayesian smoothing spline was used by taking cubic P‐splines with second order random walk priors[23, 16]. Suppose that $f = (f(1), \dots, f(n))'$ is the vector of corresponding function evaluations at observed values of X. Then, the prior for f is
$$f \mid \tau^2 \propto \exp\left( -\frac{1}{2\tau^2} f'Kf \right) \tag{8}$$
That is, $f \mid \tau^2$ follows a partially improper Gaussian prior $f \mid \tau^2 \sim N(0, \tau^2 K^{-})$, where $K^{-}$ is a generalized inverse of a band‐diagonal precision or penalty matrix K. It is possible to express the vector of function evaluations $f_j$ of a nonlinear effect as the matrix product of a design matrix $X_j$ and a vector of regression coefficients $\beta_j$, i.e. $f_j = X_j\beta_j$. Brezger and Lang (2006)[24] also suggest a general structure of the priors for $\beta_j$ as:
$$p(\beta_j \mid \tau_j^2) \propto \frac{1}{(\tau_j^2)^{\operatorname{rk}(K_j)/2}} \exp\left( -\frac{1}{2\tau_j^2} \beta_j' K_j \beta_j \right) \tag{9}$$
For full Bayesian inference, the unknown variance parameters $\tau_j^2$ are also considered as random and estimated simultaneously with the unknown regression parameters. Therefore, hyperpriors are assigned to the variances $\tau_j^2$ in a further stage of the hierarchy by highly dispersed (but proper) inverse Gamma priors $\tau_j^2 \sim IG(a_j, b_j)$:
$$p(\tau_j^2) \propto (\tau_j^2)^{-a_j - 1} \exp\left( -\frac{b_j}{\tau_j^2} \right) \tag{10}$$
Another choice for the fixed effects would be to work with a multivariate Gaussian distribution $\gamma \sim N(\gamma_0, \Sigma_{\gamma_0})$.
In this study, diffuse priors were used for the fixed effects parameter γ.

Bayesian P-spline

Any smoother depends heavily on the choice of smoothing parameters, and this holds for P-splines in a mixed framework of fixed effects and continuous covariates. A closely related approach for continuous covariates is based on the P‐splines approach introduced by[25]. This approach assumes that an unknown smooth function $f_j$ of a covariate $X_j$ can be approximated by a polynomial spline of degree l defined on a set of equally spaced knots $x_j^{\min} = \zeta_{j0} < \zeta_{j1} < \dots < \zeta_{jd} = x_j^{\max}$ within the domain of $X_j$. The domain from $x^{\min}$ to $x^{\max}$ is divided into d equal intervals by d + 1 knots. Each interval is covered by l + 1 B-splines of degree l. The total number of knots needed to construct the basis is d + 2l + 1, and the number of B-splines in the regression is d + l. It is well known that such a spline can be written as a linear combination of $M_j = d + l$ B-spline basis functions $B_m$, i.e.,
$$f_j(x_j) = \sum_{m=1}^{M_j} \beta_{jm} B_m(x_j) \tag{11}$$
where $\beta_j = (\beta_{j1}, \dots, \beta_{jM_j})'$ corresponds to the vector of unknown regression coefficients. The $n \times M_j$ design matrix $X_j$ consists of the basis functions evaluated at the observations $x_{ij}$.

The crucial choice is the number of knots: for a small number of knots, the resulting spline may not be flexible enough to capture the variability of the data; for a large number of knots, estimated curves tend to overfit the data and, as a result, the estimated functions are too rough. As a remedy,[25] suggest a moderately large number of equally spaced knots (usually between 20 and 40) to ensure enough flexibility, and a roughness penalty based on first or second order differences of adjacent P-spline coefficients to guarantee sufficient smoothness of the fitted curves. In our analysis, we use P‐splines of degree 3 with 20 intervals and second order random walk priors on the P‐spline regression coefficients, which is flexible enough to capture the variability of the data.

First and second order random walk priors

Let us consider the case of a continuous covariate X with equally spaced observations
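$x_i$, $i = 1, \dots, n$. Suppose that $x_{(1)} < \dots < x_{(t)} < \dots < x_{(n)}$ defines the ordered sequence of distinct covariate values. Here n denotes the number of different observations for x in the data set. A common approach in dynamic or state space models is to estimate one parameter $f(t)$ for each distinct $x_{(t)}$. Define $f(t) = f(x_{(t)})$ and let $f = (f(1), \dots, f(t), \dots, f(n))'$ denote the vector of function evaluations. Then a first order random walk prior for f is defined by:
$$f(t) = f(t-1) + u(t) \tag{12}$$
and a second order random walk prior by
$$f(t) = 2f(t-1) - f(t-2) + u(t) \tag{13}$$
with Gaussian errors
$$u(t) \sim N(0, \tau^2) \tag{14}$$
and diffuse priors for the initial values,
$$f(1) \propto \text{const} \ \text{(first order)}, \qquad f(1), f(2) \propto \text{const} \ \text{(second order)} \tag{15}$$
Under both specifications, $f \mid \tau^2$ follows a partially improper Gaussian prior $f \mid \tau^2 \sim N(0, \tau^2 K^{-})$, where $K^{-}$ is a generalized inverse of the penalty matrix K.

For the case of non-equally spaced observations, random walk priors must be modified to account for the non-equal distances $\delta_t = x_{(t)} - x_{(t-1)}$ between observations. Random walks of first order are now specified by $f(t) = f(t-1) + u(t)$, $u(t) \sim N(0, \delta_t\tau^2)$, i.e., by adjusting the error variances from $\tau^2$ to $\delta_t\tau^2$. Random walks of second order are defined by
$$f(t) = \left(1 + \frac{\delta_t}{\delta_{t-1}}\right) f(t-1) - \frac{\delta_t}{\delta_{t-1}} f(t-2) + u(t), \qquad u(t) \sim N(0, w_t\tau^2) \tag{16}$$
where $w_t$ is an appropriate weight.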
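The two ingredients of a Bayesian P-spline described above, a B-spline basis on equally spaced knots and a second order difference penalty on adjacent coefficients, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names are hypothetical, the age range of 0 to 59 months is assumed only for the example, and the basis is evaluated with the standard Cox-de Boor recursion.

```python
def equally_spaced_knots(x_min, x_max, n_intervals, degree):
    """Return n_intervals + 2*degree + 1 equally spaced knots.

    `degree` extra knots are placed on each side of [x_min, x_max].
    """
    h = (x_max - x_min) / n_intervals
    return [x_min + (i - degree) * h for i in range(n_intervals + 2 * degree + 1)]


def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions at x (Cox-de Boor recursion)."""
    # Degree 0: indicator of each half-open knot interval.
    basis = [1.0 if knots[m] <= x < knots[m + 1] else 0.0
             for m in range(len(knots) - 1)]
    for k in range(1, degree + 1):
        new = []
        for m in range(len(knots) - 1 - k):
            left = (x - knots[m]) / (knots[m + k] - knots[m]) * basis[m]
            right = ((knots[m + k + 1] - x)
                     / (knots[m + k + 1] - knots[m + 1]) * basis[m + 1])
            new.append(left + right)
        basis = new
    return basis


def second_order_penalty(m):
    """Return K = D'D, where D takes second differences of m coefficients."""
    d_rows = []
    for r in range(m - 2):
        row = [0.0] * m
        row[r], row[r + 1], row[r + 2] = 1.0, -2.0, 1.0
        d_rows.append(row)
    return [[sum(d[i] * d[j] for d in d_rows) for j in range(m)]
            for i in range(m)]


def quad_form(beta, K):
    """Return beta' K beta, the roughness penalty of the coefficients."""
    n = len(beta)
    return sum(beta[i] * K[i][j] * beta[j] for i in range(n) for j in range(n))


# Hypothetical covariate: child's age in months, assumed to lie in [0, 59].
knots = equally_spaced_knots(0.0, 59.0, n_intervals=20, degree=3)
ages = [0.5, 12.0, 30.25, 58.0]
X = [bspline_basis(a, knots, 3) for a in ages]  # rows of the design matrix
K = second_order_penalty(len(X[0]))

print(len(knots), len(X[0]))  # d + 2l + 1 = 27 knots, d + l = 23 basis functions
print([sum(row) for row in X])  # each row sums to 1 up to rounding
```

With this penalty, `quad_form` returns zero for constant and linear coefficient sequences. This is why the corresponding Gaussian prior is only partially improper: K has rank M − 2, so the prior carries no information about the level and the linear trend.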
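Bayesian inference is based on the posterior distribution, which is proportional to the product of the prior $p(\alpha)$ of all parameters and the full likelihood function $L(y \mid \alpha)$. For this case, let $\alpha$ be the vector of all unknown parameters; then the posterior distribution is given by:
$$p(\alpha \mid y) \propto L(y \mid \alpha)\, p(\alpha) \tag{17}$$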
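MCMC inference of this kind typically cycles through the full conditionals of the posterior. As a minimal sketch of one such step, assume a Gaussian smoothness prior on the spline coefficients and an inverse Gamma IG(a, b) hyperprior on the variance. By standard conjugacy the full conditional of τ² is then again inverse Gamma, with shape a + rk(K)/2 and scale b + β'Kβ/2. This section does not spell that result out, so it is stated here as an assumption; the function names and the hyperparameter values a = b = 0.001 are likewise hypothetical.

```python
import random


def second_order_penalty(m):
    """Return K = D'D, where D takes second differences of m coefficients."""
    d_rows = []
    for r in range(m - 2):
        row = [0.0] * m
        row[r], row[r + 1], row[r + 2] = 1.0, -2.0, 1.0
        d_rows.append(row)
    return [[sum(d[i] * d[j] for d in d_rows) for j in range(m)]
            for i in range(m)]


def quad_form(beta, K):
    """Return beta' K beta."""
    n = len(beta)
    return sum(beta[i] * K[i][j] * beta[j] for i in range(n) for j in range(n))


def tau2_full_conditional(beta, K, rank_K, a, b):
    """Return (shape, scale) of the inverse Gamma full conditional of tau^2.

    Assumes the prior p(beta | tau^2) is proportional to
    (tau^2)^(-rank_K / 2) * exp(-beta' K beta / (2 tau^2))
    and the hyperprior tau^2 ~ IG(a, b).
    """
    return a + 0.5 * rank_K, b + 0.5 * quad_form(beta, K)


def draw_tau2(beta, K, rank_K, a, b, rng):
    """Draw tau^2 from its full conditional (one Gibbs update)."""
    shape, scale = tau2_full_conditional(beta, K, rank_K, a, b)
    # If G ~ Gamma(shape, rate=scale), then 1 / G ~ IG(shape, scale).
    # random.gammavariate takes (shape, scale), so we pass 1 / rate.
    return 1.0 / rng.gammavariate(shape, 1.0 / scale)


rng = random.Random(2011)              # fixed seed for reproducibility
beta = [0.0, 1.0, 4.0, 9.0, 16.0]      # hypothetical current coefficients
K = second_order_penalty(len(beta))    # the second order penalty has rank m - 2
shape, scale = tau2_full_conditional(beta, K, rank_K=len(beta) - 2,
                                     a=0.001, b=0.001)
draws = [draw_tau2(beta, K, len(beta) - 2, 0.001, 0.001, rng) for _ in range(5)]
print(shape, scale)
print(draws)
```

In a full sampler this update would alternate with draws of the spline coefficients and the fixed effects. Only the variance step is shown because it has a closed form.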
+ f1(MBI) + f2(Mage) + f3(Cage) + Csexγ1 + Resγ2 + BORDγ3 + pint1γ4 + Pbint2γ5 + HHM1γ6 + HHM2γ7 + Medu1γ8 + Medu2γ9 + tfacilitγ10 + Vacγ11 + Coughγ12 + Drrhγ13 + feverγ14 + windexγ15

Figure 5. Non-linear Effects of Child’s Age in months on Nutritional Status of a Child

Figure 6. Non-linear Effects of Mother’s Age in years on Nutritional Status of a Child

Figure 7. Non-linear Effects of Mother’s Body Mass Index (kg/m²) on Nutritional Status of a Child
| [1] | World Bank (2007). Nutritional Failure in Ecuador: Causes, Consequences, and Solutions. The World Bank: Washington, DC. |
| [2] | UNICEF (2009). Tracking Progress on Child and Maternal Nutrition, a Survival and Development Priority. |
| [3] | Maleta, K. (2006). Epidemiology of Undernutrition in Malawi, Chapter 8 in The Epidemiology of Malawi, Edited by Eveline Geubbles and Cameron Bowie. |
| [4] | HKI (2001). Reduction in diarrheal diseases in children in rural Bangladesh by environmental and behavioral modifications. Transactions of the Royal Society of Tropical Medicine and Hygiene |
| [5] | WHO (1995). Physical Status: The Use and Interpretation of Anthropometry. WHO Technical Report Series No. 854. Geneva. |
| [6] | Kandala, NB; Fahrmeir L; Klasen S; Priebe J (2009). Geo-additive Models of Childhood Undernutrition in Three Sub-Saharan African Countries. Population, Space and Place. |
| [7] | Klasen, S. and Moradi, A. (2000). The Nutritional Status of Elites in India, Kenya and Zambia: An Appropriate Guide for Developing Reference Standards for Undernutrition? Sonderforschungsbereich 386: Discussion Paper no. 217. Deutsche Forschungsgemeinschaft. |
| [8] | Kibel, M.; Saloojee, H. & Westwood, T. (2007). Child Health for All. (4th Edition). Oxford University Press: Cape Town (South Africa). |
| [9] | FAO/WFP (2009). Special Report on Crop and Food Security Assessment Mission to Ethiopia: Integrating the Crop and Food Supply and the Emergency Food Security Assessments. Rome, Italy. |
| [10] | NNS (2009). Ethiopia National Nutrition Strategy Review and Analysis of Progress and Gaps: One Year On May 2009. |
| [11] | Woldemariam Girma and Timotiows Genebo (2002). Determinants of Nutritional Status of Women and Children in Ethiopia |
| [12] | Hastie, T. and Tibshirani, R. (1990). Generalized Additive Models. Chapman and Hall, London. |
| [13] | Khaled, K. (2007). Child Malnutrition in Egypt Using Geoadditive Gaussian and Latent Variable Models. |
| [14] | Mohammad, A. (2008). Gender Differentials in Mortality and Undernutrition in Pakistan: Peshawar (Pakistan). |
| [15] | Mila, A.L.; Yang, X.B. and Carriquiry, A.L. (2003). Bayesian Logistic Regression of Clinical Epidemiology for Uncertainty in Parameter Estimation. Basic Science for Clinical Medicine: Little, Brown and Company, Boston. |
| [16] | Fahrmeir, L. and Lang, S. (2001). Bayesian Inference for Generalized Additive Mixed Models Based on Markov Random Field Priors. Applied Statistics (JRSS C), Vol 50, P. 201‐220. |
| [17] | Fahrmeir, L. and Lang, S. (2004). Bayesian Semiparametric Regression Analysis of Multicategorical Time-Space Data. To appear in Ann. Inst. Statist. Math. |
| [18] | Hastie, T. and Tibshirani, R. (2004). Bayesian Back Fitting. (To Appear in Statistical Science). |
| [19] | Kandala, NB.; Fahrmeir, L and Klasen, S. (2010). Geo-additive Models of Childhood Undernutrition in Three Sub-Saharan African Countries. Sonderfor |
| [20] | Brezger, A. and Lang, S. (2006). Generalized Structured Additive Regression based on Bayesian P-Splines. Computational Statistics and Data Analysis, Vol. 50, P. 967-991. |
| [21] | Khaled, K. (2010). Child Malnutrition in Egypt Using Geoadditive Gaussian and Latent Variable Models. |
| [22] | Spiegelhalter, DJ (2002). Bayesian measures of model complexity and fit. |
| [23] | Belitz, C.; Brezger, A.; Kneib, T.; Lang, S. (2009). BayesX Software for Bayesian Inference in Structured Additive Regression. Department of Statistics, Ludwig Maximilians University Munich. Version 2.0.1. |
| [24] | CSA (2011). Ethiopian Demographic and Health Survey. Addis Ababa |
| [25] | Olivier, Francois (May, 2011). Deviance Information Criteria for Model Selection in Approximate Bayesian Computation. Institute Pasteur, Human Evolutionary Genetics, Paris, France. |
| [26] | Das, S.; Hossain, M.Z. & Islam, M.A. (2008). Predictors of Child Chronic Malnutrition in Bangladesh. Proc. Pakistan Acad. Sci. 45(3): P. 137-155. |
| [27] | Kandala, NB; Lang, S.; Klasen, S. and Fahrmeir, L. (2006). Semiparametric Analysis of the Socio-Demographic Determinants of Undernutrition in Two African Countries. Research in Official Statistics, EUROSTAT, Vol. 4 No. 1: P. 81-100. |
| [28] | Sasha, F. (2009). An Analysis of Under-Five Nutritional Status in Lesotho: The Role of Parity Order and Other Socio-Demographic Characteristics. |
| [29] | Smith, L. C. and Haddad L. (1999). Explaining Child Malnutrition in Developing countries: a Cross Country Analysis. International Food Policy Research Institute, FCND Discussion paper, USA. |
| [30] | Semba, R. D.; de Pee, S., Sun, K.; Sari, M.; Akhter, N., & Bloem, M.W. (2008). Effect of Parental Formal Education on Risk of Child Stunting in Indonesia and Bangladesh: A Cross Sectional Study. Lancet 371(9609):P.322-328. |
| [31] | Rahman, A. & Chowdhury, S. (2007). Determinants of Chronic Malnutrition Among Preschool Children in Bangladesh. Journal of Biosocial Science 39(2):P.161-173. |
| [32] | Sereebutra, P.; Solomons, N.; Aliyu, M.H., & Jolly, P.E. (2006). Socio-Demographic and Environmental Predictors of Childhood Stunting in Rural Guatemala. Nutrition Research. |
| [33] | WHO (2011). World Health Statistics, (The Millennium Development Goals Report), United Nations 2011. |
| [34] | Birhan Fetene (2010). Determinants of Nutrition and Health Status of Children in Ethiopia: A Multivariate Multilevel Linear Regression Analysis. Addis Ababa University |
| [35] | Som, S.; Pal, M.; Bhattacharya, B.; Bharati, S. & Bharati, P. (2006). Socio-Economic Differentials in Nutritional Status of Children in the States of West Bengal and Assam, India. Journal of Biosocial Science. Vol. 38: P. 625-642. |