American Journal of Mathematics and Statistics

p-ISSN: 2162-948X    e-ISSN: 2162-8475

2013;  3(4): 213-219

doi:10.5923/j.ajms.20130304.05

On the Estimation of Survival of HIV/AIDS Patients on Antiretroviral Therapy Using NPMLE Method: An Application to Interval Censored Data

Gurprit Grover1, Rabindranath Das2, Prafulla Kumar Swain1, Barnali Deka1

1Department of Statistics, University of Delhi, Delhi, India

2Department of Statistics, University of Burdwan, West Bengal, India

Correspondence to: Prafulla Kumar Swain, Department of Statistics, University of Delhi, Delhi, India.


Copyright © 2012 Scientific & Academic Publishing. All Rights Reserved.

Abstract

The main objective of this paper is to estimate the survival of HIV/AIDS patients undergoing Antiretroviral Therapy (ART) at an ART centre in Delhi. Non-parametric maximum likelihood estimation (NPMLE) via the E-M algorithm for interval censored data, and Kaplan-Meier (KM) survival plots for left, right and mid-point imputation, have been used to estimate the survival of these patients. It has been observed that the mid-point imputed survival plot closely and consistently follows the pattern obtained by the NPMLE (E-M) method. Treating these mid-point imputed values as right censored data, the Cox PH model and the Accelerated Failure Time Model (AFTM) have been applied to identify the effects of prognostic factors such as age, sex, mode of transmission, baseline CD4 cell count, hemoglobin, baseline weight and smoking habits on the survival of the patients. The Akaike Information Criterion (AIC) has been employed to compare the efficiency of these models, and the Cox-Snell residual is used to test the proportionality assumption.

Keywords: AIDS, ART, CD4 Cell Count, Non Parametric Maximum Likelihood Estimation, Cox-PH Model, Accelerated Failure Time Model

Cite this paper: Gurprit Grover, Rabindranath Das, Prafulla Kumar Swain, Barnali Deka, On the Estimation of Survival of HIV/AIDS Patients on Antiretroviral Therapy Using NPMLE Method: An Application to Interval Censored Data, American Journal of Mathematics and Statistics, Vol. 3 No. 4, 2013, pp. 213-219. doi: 10.5923/j.ajms.20130304.05.

1. Introduction

With the widespread use of Antiretroviral Therapy (ART), the prognosis of HIV infected patients has significantly improved and mortality due to AIDS related causes has been substantially reduced[1, 2, 3]. With the advent of the free ART programme in India in 2004, a dramatic decline in HIV incidence has been observed during the last decade[4]. It is therefore imperative to know the survival of HIV/AIDS patients on ART, so that appropriate strategies can be devised to prevent new infections.
However, in HIV/AIDS studies, time to event data are collected by assessing patients at periodic follow up visits. In such cases the event (e.g. onset of HIV infection, end of the incubation period of AIDS after HIV infection, or death) cannot be observed exactly; it is known only to have happened within some interval, so the observed events are interval censored. Every patient is supposed to visit the ART centre every four weeks, but the actual visit times vary from patient to patient, and the time between visits also varies. Patients may visit the ART centre at a time that is convenient to them rather than at the scheduled time. Therefore the failure event (i.e. death) is treated as interval censored survival data. The survival time Ti of a patient lies in the interval (Li , Ri) bounded by the last available visit date and the end of the study.
Several approaches have been proposed for the estimation of the survival function for interval censored data, including those of Turnbull[5], De Gruttola and Lagakos[6] and Sun[7, 8]. More recently, Kim and Xue[9], Grover and Banerjee[10] and Grover and Shakeri[11] have analysed doubly and interval censored survival data, including the survival of HIV-1 infected children.
In this paper we have estimated the survival function of HIV/AIDS patients on ART by using the Non Parametric Maximum Likelihood Estimation (NPMLE) method for interval censored data. The NPMLE is computed by using the E-M algorithm of Turnbull[5] with the polishing algorithm of Gentleman and Geyer[12]. We have also estimated the survival function by using imputation techniques. Because estimation procedures and statistical software for interval censored data are not widely available, we have adopted the conventional imputation approach to convert interval censored data to right censored data, for which standard techniques are available. Further, we have compared the estimates obtained by the NPMLE and imputation methods.
Kaplan-Meier plots for left, right and mid-point imputation have been used to estimate the survival of patients. It has been observed that the mid-point imputed survival plot closely and consistently follows the pattern obtained by the NPMLE (E-M) method. Treating these mid-point imputed values as right censored data, the Cox Proportional Hazard model and the Accelerated Failure Time Model (AFTM) have been used to study the effect of prognostic factors such as age, sex, mode of transmission, baseline CD4 cell count, hemoglobin, baseline weight and smoking habits on the survival of patients.
The semi parametric Cox PH model is the most common approach in survival analysis and has been used to evaluate covariate effects on the hazard function of failure time data. However, the results from a PH model are difficult to interpret in terms of survivorship. The parametric AFTM is regarded as an attractive alternative to the Cox PH model, since it provides concise and more intuitively interpretable results for survival data[13]. Since, to our knowledge, no one has previously used the AFTM in HIV population studies in India, we have modeled the HIV/AIDS population on ART by using the AFTM and compared the results with the Cox PH model.
The Akaike Information Criterion (AIC) has been employed to compare the efficiency of the models. Schoenfeld residual and Cox-Snell residual methods have also been used to test the proportionality assumption. The packages survival and interval in R, and STATA (version 11.1), have been used to perform the statistical analyses.

2. Methods Used

2.1. NPMLE for Interval Censored Data

Suppose that n HIV/AIDS patients are on ART. Let Ti be the survival time (i.e. the time to death) of the ith patient, with survival function S(t). Let (Li , Ri] be the interval in which Ti is observed, such that Li < Ti ≤ Ri. If the event does not occur by the end of the study, the patient is said to be right censored; in this case we assume that Ti lies in the interval (Li , ∞), where Li is the time from the beginning of the study until the last visit. The likelihood function for the set of observed intervals {(Li , Ri) , i = 1, 2, …, n} is then given by
$$L = \prod_{i=1}^{n} \left[ S(L_i) - S(R_i) \right] \tag{1}$$
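Turnbull's EM algorithm, an iterative procedure, has been used to estimate the survival function S(t) corresponding to interval censored data. From the data {(Li , Ri) , i = 1, 2, …, n}, a set of non-overlapping intervals {(t0, t1], (t1, t2], …, (t(m-1), tm]} is generated over which the survival curve S(t) is estimated. Let 0 = t0 < t1 < … < tm = ∞ be the ordered time points at which the NPMLE may change. Then for the ith patient, with Ti ϵ (Li , Ri), define an indicator variable
$$\alpha_{ij} = \begin{cases} 1, & \text{if } (t_{j-1}, t_j] \subseteq (L_i, R_i] \\ 0, & \text{otherwise} \end{cases}$$
and let P(tj) = P[t(j-1) < T ≤ tj] = S(t(j-1)) − S(tj). The likelihood is given by
$$L(P) = \prod_{i=1}^{n} \sum_{j=1}^{m} \alpha_{ij} P(t_j) \tag{2}$$
Let Pi(tj) = P[t(j-1) < Ti ≤ tj | Ti ϵ (Li , Ri)]. The NPMLE of the survival function is obtained by maximizing L(P) with respect to P, giving $\hat{P}$ say, by using the EM algorithm[14]. Starting with an initial estimate $P^{(0)}$:
E-step: for each i and j, compute
$$P_i(t_j) = \frac{\alpha_{ij} P(t_j)}{\sum_{k=1}^{m} \alpha_{ik} P(t_k)}$$
M-step: update, for each j,
$$P(t_j) = \frac{1}{n} \sum_{i=1}^{n} P_i(t_j)$$
Iterate the procedure until convergence.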
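The E-step and M-step can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation (they used the interval package in R): the function names `turnbull_em` and `survival_at_cuts` are hypothetical, the grid is simply the set of all observed endpoints rather than Turnbull's reduced set of intervals, and the polishing step of Gentleman and Geyer is omitted. Each patient's unit mass is split across the grid cells inside the patient's interval in proportion to the current masses (E-step), and the masses are then averaged over patients (M-step).

```python
def turnbull_em(intervals, tol=1e-8, max_iter=10000):
    """Self-consistency (EM) estimate of cell masses for interval censored data.

    intervals: list of (L, R) with L < R; R may be float('inf') (right censored).
    Returns (cuts, p), where p[j] is the mass on the cell (cuts[j], cuts[j+1]].
    """
    # Assumption: use every distinct endpoint as a grid point.
    cuts = sorted({x for L, R in intervals for x in (L, R)})
    m = len(cuts) - 1
    n = len(intervals)
    # alpha[i][j] = 1 if cell j lies inside patient i's interval, else 0.
    alpha = [[1.0 if (L <= cuts[j] and cuts[j + 1] <= R) else 0.0
              for j in range(m)] for L, R in intervals]
    p = [1.0 / m] * m  # initial estimate: equal mass on every cell
    for _ in range(max_iter):
        new_p = [0.0] * m
        for a in alpha:
            # E-step: conditional probability of each cell given the interval.
            denom = sum(a[j] * p[j] for j in range(m))
            for j in range(m):
                new_p[j] += a[j] * p[j] / denom
        # M-step: average the conditional probabilities over patients.
        new_p = [v / n for v in new_p]
        change = max(abs(u - v) for u, v in zip(new_p, p))
        p = new_p
        if change < tol:
            break
    return cuts, p


def survival_at_cuts(p):
    """Survival estimate at cuts[1], cuts[2], ...: one minus the cumulative mass."""
    surv, total = [], 0.0
    for mass in p:
        total += mass
        surv.append(max(0.0, 1.0 - total))
    return surv
```

For example, `turnbull_em([(0, 2), (1, 3)])` concentrates nearly all mass on the overlap (1, 2], which is where both observations are consistent with an event.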

2.2. Imputation Approach and K-M Survival Plot

For simplicity, imputation techniques have been used to handle the interval censored data. After imputation, the interval censored data are treated as complete or right censored data, and standard statistical methods can then be applied to the imputed data set (Hsu et al.[15]). Brookmeyer and Goedert[16] and Law and Brookmeyer[17] have used mid-point imputation for interval censored AIDS infection time data. Since the survival time Ti is known to lie in the interval (Li , Ri) for the ith patient, i = 1, 2, …, n, we use the midpoint mi of the observed interval as the hypothetical failure time, i.e. mi = (Li + Ri)/2. As alternatives to midpoint imputation, left end point imputation (taking Ti as Li) and right end point imputation (taking Ti as Ri) have also been used.
Kaplan-Meier survival estimates have been plotted for the imputed data, using the formula
$$\hat{S}(t) = \prod_{j:\, t_j \le t} \left( 1 - \frac{d_j}{n_j} \right) \tag{3}$$
where the tj's denote the distinct imputed exact failure times, and dj and nj are the number of deaths and the number at risk at tj, respectively.
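As an illustration of mid-point imputation followed by the Kaplan-Meier product-limit estimate, the following is a minimal sketch in plain Python. The function names `midpoint_impute` and `kaplan_meier` and the toy intervals in the usage note are hypothetical; an interval with an infinite right end is treated as right censored at its left end, which is an assumption of this sketch.

```python
def midpoint_impute(intervals):
    """Convert (L, R) intervals to (time, event) pairs.

    Finite intervals become an event at the midpoint (L + R) / 2.
    Intervals with R = inf become right censored at L (event = 0).
    """
    data = []
    for L, R in intervals:
        if R == float("inf"):
            data.append((L, 0))
        else:
            data.append(((L + R) / 2.0, 1))
    return data


def kaplan_meier(data):
    """Product-limit estimate; returns [(t_j, S(t_j))] at distinct event times."""
    event_times = sorted({t for t, e in data if e == 1})
    s = 1.0
    curve = []
    for tj in event_times:
        n_j = sum(1 for t, _ in data if t >= tj)             # number at risk
        d_j = sum(1 for t, e in data if t == tj and e == 1)  # deaths at t_j
        s *= 1.0 - d_j / n_j
        curve.append((tj, s))
    return curve
```

With the toy intervals `[(0, 2), (2, 4), (3, float("inf")), (4, 6)]`, the imputed data are events at times 1, 3 and 5 and one censoring at time 3, and the estimated survival steps down to 0.75, 0.5 and 0 at those event times.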

2.3. Cox-PH Model

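The Cox proportional hazard function[18] for the survival time T and the explanatory variables X is given by
$$h(t \mid X) = h_0(t) \exp(\beta' X) \tag{4}$$
where h0(t) is the baseline hazard and β is the vector of regression parameters. The explanatory variables act multiplicatively on the baseline hazard, which is left completely unspecified, and the regression parameters in the model can be estimated by maximizing the partial likelihood function
$$L(\beta) = \prod_{j=1}^{r} \frac{\exp(\beta' x_{(j)})}{\sum_{l \in R(t_j)} \exp(\beta' x_l)} \tag{5}$$
where the product runs over the r observed failure times tj, x(j) is the covariate vector of the patient who fails at tj, and R(tj) denotes the risk set of patients at time tj.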
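To make the role of the risk set concrete, the following minimal sketch evaluates the logarithm of the Cox partial likelihood for a given coefficient vector in plain Python. The function name `cox_log_partial_likelihood` and the toy data in the usage note are hypothetical; tied failure times are handled in the simple Breslow manner (each failure uses the full risk set at its time), which is an assumption of this sketch rather than something stated in the paper. In practice the coefficients are found by maximizing this quantity numerically, as done by standard software.

```python
import math


def cox_log_partial_likelihood(beta, times, events, X):
    """Log partial likelihood of a Cox PH model.

    beta:   list of regression coefficients
    times:  observed (or imputed) times
    events: 1 if death observed, 0 if right censored
    X:      list of covariate vectors, one per patient
    """
    # Linear predictor beta'x for each patient.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    ll = 0.0
    for j, (tj, ej) in enumerate(zip(times, events)):
        if ej != 1:
            continue  # censored patients contribute only through risk sets
        # Risk set R(t_j): everyone still under observation at t_j.
        risk_sum = sum(math.exp(eta[l]) for l, tl in enumerate(times) if tl >= tj)
        ll += eta[j] - math.log(risk_sum)
    return ll
```

For instance, with `times = [1, 2, 3]`, `events = [1, 1, 0]` and a single binary covariate `X = [[1], [0], [1]]`, the value at `beta = [0.0]` is −log 3 − log 2, since the risk sets contain three and two patients. The hazard ratio for a covariate is the exponential of its coefficient.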

2.4. Accelerated Failure Time Model (AFTM)

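The Accelerated Failure Time Model[19] assumes a linear relationship between the logarithm of survival time and the covariates, and is given by
$$\log T_i = \mu + \beta' x_i + \sigma \varepsilon_i \tag{6}$$
where β is a vector of regression coefficients, μ and σ are the intercept and scale parameters respectively, and the error term εi is assumed to follow some distribution (e.g. extreme value, normal or logistic). These choices lead to the Weibull, lognormal and log-logistic AFT models for Ti, respectively[20].
Now, the survivor function of Ti is given by
$$S_i(t) = P(T_i > t) = S_{\varepsilon}\!\left( \frac{\log t - \mu - \beta' x_i}{\sigma} \right) \tag{7}$$
where $S_{\varepsilon}$ is the survivor function of the error term.
The AFT models are fitted by the method of maximum likelihood. The likelihood of n observed survival times t1, t2, …, tn is given by
$$L = \prod_{i=1}^{n} \left[ f_i(t_i) \right]^{\delta_i} \left[ S_i(t_i) \right]^{1 - \delta_i}$$
where fi(ti) and Si(ti) are the density and survival functions for the ith patient at time ti, and δi is the event indicator, such that
$$\delta_i = \begin{cases} 1, & \text{if the event (death) is observed} \\ 0, & \text{if the observation is right censored} \end{cases}$$
Results obtained from AFT models can be summarized in exponentiated form as a time ratio, TR = exp(β), in contrast to the hazard ratio of the Cox model. Thus TR > 1 is associated with prolonged survival time and TR < 1 with decreased survival time.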
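As a worked illustration of the censored likelihood for one member of this family, the sketch below evaluates the log-likelihood of a Weibull AFT model in plain Python, assuming the error term follows the standard extreme value (minimum) distribution, so that its survivor function is exp(−e^z) and its density is exp(z − e^z). The function names `weibull_aft_loglik` and `time_ratio` are hypothetical, and the parameterization log T = μ + β'x + σε is the one assumed here; software packages may report parameters on a different scale.

```python
import math


def weibull_aft_loglik(mu, beta, sigma, times, events, X):
    """Right censored log-likelihood of a Weibull AFT model.

    Model (assumed): log T = mu + beta'x + sigma * eps, eps ~ standard
    extreme value, with S_eps(z) = exp(-exp(z)).
    """
    ll = 0.0
    for t, d, x in zip(times, events, X):
        z = (math.log(t) - mu - sum(b * xi for b, xi in zip(beta, x))) / sigma
        # Event:    log f(t) = z - exp(z) - log(sigma) - log(t)
        # Censored: log S(t) = -exp(z)
        ll += d * (z - math.log(sigma) - math.log(t)) - math.exp(z)
    return ll


def time_ratio(beta_k):
    """Time ratio for a one-unit increase in a covariate with coefficient beta_k."""
    return math.exp(beta_k)
```

With `mu = 0`, `beta = [0.0]` and `sigma = 1` the model reduces to a unit-rate exponential distribution, so the log-likelihood is simply minus the sum of the observed times whenever the covariates have no effect. A positive coefficient gives a time ratio above 1, i.e. longer survival.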

2.5. Model Comparison

In order to compare the parametric AFTM and the semi parametric Cox PH model, we have used the Akaike Information Criterion (AIC). The AIC, which is a measure of goodness of fit of an estimated statistical model, is given by
$$\mathrm{AIC} = -2 \log L + 2(p + k)$$
where log L is the maximized log-likelihood, p is the number of covariates in the model, and k = 1 for the exponential model and k = 2 for the Weibull and lognormal models. The model with the smaller AIC is regarded as the better model.
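The AIC formula can be written as a small helper in Python. The function name `aic` is hypothetical, and the log-likelihood values in the usage note are made-up numbers for illustration only, not results from this study.

```python
def aic(log_likelihood, p, k):
    """AIC = -2 * log-likelihood + 2 * (p + k).

    p: number of covariates in the model
    k: number of ancillary parameters (1 for exponential,
       2 for Weibull and lognormal, as defined in the text)
    """
    return -2.0 * log_likelihood + 2.0 * (p + k)
```

For example, two hypothetical models with seven covariates and log-likelihoods of −100.0 (Weibull, k = 2) and −103.0 (exponential, k = 1) would give AIC values of 218.0 and 222.0, so the Weibull model would be preferred.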

3. Data Sources

A retrospective follow-up study was conducted involving 1259 HIV/AIDS patients who were undergoing Antiretroviral Therapy at the ART centre of Dr. Ram Manohar Lohia Hospital, New Delhi, India, during the period April 2004 to November 2009, and who were followed up through the ART routine register records until December 2010. The inclusion criteria were that patients should be above 18 years of age and should have complete baseline information on CD4 cell count, date of visit, mode of transmission, weight, hemoglobin, etc.; 1259 patients were found eligible for the analysis. At the end of the study period, 198 patients had died and the remaining 1061 patients were known to be alive. Since the exact date of death is not known, the event (death) is known only to lie in an interval bounded by the last available CD4 count date and the end of the study. Thus the observed event leads to interval censored survival data.
Table 1 presents the descriptive statistics of the study. Out of 1259 patients, 408 (32.4%) were female. The mean age at diagnosis was 34.24 (±8.74) years. The most common mode of transmission was sexual (heterosexual or homosexual), at 64.0%; for 331 (26.3%) patients the mode of transmission was unknown. At the time of enrollment, 858 (68.1%) patients had a CD4 cell count of less than 200 cells/mm3. In all, 198 (15.7%) patients died and the remainder were alive at the end of the study.
Table 1. Descriptive Statistics
     
Figure 1 shows the NPMLE and the K-M survival plots for left, right and mid-point imputation. Comparing these curves, we can see that the K-M survival plot for mid-point imputation and the NPMLE for intervals are very similar. Treating the data as right censored at the mid-point of each interval, the Cox Proportional Hazard model and the AFTM have then been employed to study the effect of prognostic factors on survival time.
Figure 1. Estimated survival function based on intervals and imputation
Table 2 presents the results of the Cox PH model. The prognostic factors sex, CD4 cell count, past smoking, baseline hemoglobin and baseline weight were found to be statistically significant (P < 0.001). Patients who had a CD4 cell count of less than 200 cells/mm3 (HR=3.31, 95% CI: 2.19, 4.98), who were sexually infected (HR=1.12, 95% CI: 0.75, 1.65) or who were current smokers (HR=2.14, 95% CI: 1.37, 3.34) had an increased estimated hazard of death.
Table 2. Cox-PH Model for HIV/AIDS patients
     
Figure 2. Cumulative hazard plot of Cox-Snell residual for AFT models
Table 3. Parametric Accelerated Failure Time Model (AFTM) for HIV/AIDS patients
     
The parametric AFTM results are presented in Table 3. Besides the factors found significant in the Cox model, unknown mode of transmission was also found to be a significant predictor in the Weibull and lognormal AFT models. Patients who had a CD4 cell count below 200 cells/mm3 had shorter survival times than patients with a CD4 cell count above 200 cells/mm3 (TR<1), and female patients had longer survival than their male counterparts (TR>1). Each unit increase in hemoglobin was associated with an increase in survival time. Older patients had shorter survival than younger patients, and an increase in weight (in kg) was associated with longer survival.
According to the Akaike Information Criterion (AIC), the Weibull AFTM was found to be the best (smallest AIC) among the parametric and Cox models. The proportional hazards assumption held (confirmed by the Schoenfeld residual plot, not shown here), and the Cox-Snell residuals indicated a good fit for all the AFT models, as shown in Figure 2.

4. Discussion

In this study we have estimated the survival of HIV/AIDS patients under an interval censoring mechanism, since the event of interest (death) lies in an interval bounded by the last available CD4 count date and the end of the study. Because statistical methodology and software for interval censored data are not widely available, we have adopted an imputation approach, which converts the interval censored data to right censored data to which standard techniques can be applied. We have observed that the survival function obtained by NPMLE (E-M) for interval censored data closely resembles the survival function obtained by the Kaplan-Meier method with mid-point imputation. This is corroborated by the findings of previous studies[10, 17]. Our further aim was to determine the effects of the prognostic factors age, sex, mode of transmission (MOT), baseline CD4 cell count, baseline hemoglobin, baseline weight and smoking habits on survival.
The prognostic factors sex, CD4 cell count, past smoking, baseline hemoglobin and baseline weight were found to be statistically significant (P < 0.001) by both the semi parametric Cox PH model and the parametric AFT model. Most previous studies have suggested that age is a significant prognostic factor[21, 22, 23]: as age increases, the survival time of HIV/AIDS patients decreases, and old age is associated with a high risk of disease progression. In our analysis, however, age was not found to be a significant prognostic factor. Females were observed to have better survival than their male counterparts, in line with earlier reports that females had higher life expectancies than males[21, 24, 25, 26]. In contrast, Remafedi and Lauer[27] reported that the sex of the patient does not have any significant effect on survival time.
Consistent with the published literature, CD4 cell count was found to be an important prognostic marker for HIV/AIDS patients. Patients with a CD4 cell count below 200 cells/mm3 had a hazard of death 3.31 times that of patients with a CD4 cell count above 200 cells/mm3. Mortality is inversely related to CD4 count: the cumulative probability of AIDS and death increases substantially with decreasing CD4 cell count[23, 28, 29].
Another important result of our study is that patients with a sexual (heterosexual or homosexual) mode of transmission had worse survival than patients infected through blood or intravenous drug use. This departs from the earlier result of Gadpayle et al.[21], who found that patients infected through intravenous drug use had the worst survival. However, Remafedi and Lauer[27] have shown that there were no significant differences between deceased and other subjects in relation to mode of transmission. Unknown mode of transmission was found to be a significant factor in the Weibull and lognormal AFTM results.
Being a current smoker was associated with a greater hazard of immune deterioration. We found that baseline hemoglobin was a significant predictor of survival in HIV/AIDS patients. To our knowledge, hemoglobin has not previously been shown to predict mortality of patients on ART in India, and further studies are needed to confirm our findings. Baseline hemoglobin level can be used as a simple and practical tool for initial risk assessment in the absence of CD4 cell count and viral load, as was identified in earlier studies by Johannessen et al.[30] in Tanzania and Mocroft et al.[31] in Europe. Patient weight is positively associated with survival; this corroborates the findings of Bachani et al.[22] that patients improved clinically with regard to weight and hemoglobin.
There are many situations where the AFTM provides a better description of the data than the Cox PH model[32, 33, 34]. One aim of this study was to compare the performance of the semi-parametric Cox PH model and the parametric AFTM. Based on AIC, the Weibull AFT model was found to be the most efficient (smallest AIC) among the parametric and Cox PH models.
Our study has some limitations. Firstly, complete information on patients' treatment and follow up data on clinical parameters were not available. Secondly, we have analyzed data from only one ART centre, so generalization of our findings would need further confirmation.

References

[1]  Williams B, Lima V and Gouws E.(2011) Modelling the Impact of Antiretroviral Therapy on the Epidemic of HIV. Current HIV Research. 9(6):367-382.
[2]  Stringer JSA, Zulu I, Levy J et al.,(2006). Rapid scale-up of antiretroviral therapy at primary care sites in Zambia: feasibility and early outcomes. JAMA. 296(7): 782-93.
[3]  Mahy M, Stover J, Stanecki K, Stoneburner R and Tassie JM.(2010). Estimating the impact of antiretroviral therapy: regional and global estimates of life years gained among adults, Sex Transm Infect. 86(Suppl 2):67-71.
[4]  National AIDS Control Organization (NACO), Annual Report, 2011-12. Available at http://www.nacoonline.org/upload/Publication/Annual%20Report/NACO_AR_Eng%202011-12.pdf (Accessed on 8th July 2012).
[5]  Turnbull B.W (1976). The Empirical Distribution Function with Arbitrarily Grouped, Censored and Truncated Data, Journal of Royal Statistical Society, series B, 38, 290-295.
[6]  De Gruttola V and Lagakos S. (1989). Analysis of doubly censored survival data, with application to AIDS. Biometrics, 45, 1-11.
[7]  Sun J. (1996) A non-parametric test for interval censored failure time data with application to AIDS studies. Statist Med, 15; 1387-1395.
[8]  Sun J. (2006). The Statistical Analysis of Interval Censored Failure Time Data. Springer, New York.
[9]  Kim M.Y and Xue X, (2002). The analysis of multivariate interval censored survival data, Statistics in Medicine, 21: 3715-3726.
[10]  Grover G and Banerjee T (2011). Estimation of Survival Times of HIV-1 Infected Children for doubly censored data, Electron. J.App.Sta.Anal, vol.4, issue 2,155-163.
[11]  Grover G and Shakeri N. (2007). Non parametric estimation of survival function of HIV+ patients with doubly censored data. J. Communicable Dis. 39(1): 7-12.
[12]  Gentleman R and Geyer C.J.(1994). Maximum likelihood for interval censored data: Consistency and Computation. Biometrika. 81, 618-623.
[13]  Wei L.J (1992) The Accelerated failure time model: a useful alternative to cox regression model in survival analysis. Statistics in Medicine, 11, 1871-1879.
[14]  Dempster A.P , Laird, N.M and Rubin D.B (1977). Maximum Likelihood from incomplete Data via the EM Algorithm, Journal of the Royal Statistical Society, Series B, 39, 1-38.
[15]  Hsu C.H, Taylor J.M.G and Murray S (2004). Multiple Imputation For Interval Censored Data With Auxiliary Variables, Working paper;Berkeley Electronic Press, paper 26.
[16]  Brookmeyer, R and Goedert J.J (1989). Censoring in an epidemic with an application to hemophilia-associated AIDS, Biometrics, 45, 325-335.
[17]  Law, C and Brookmeyer, R. (1992). Effects of midpoint imputation on the analysis of doubly censored data, Statistics in Medicine, 11: 1569-1578.
[18]  Cox D.R (1972) Regression Models and Life Tables, Journal of Royal Statistical Society, series B, 34, 187-220.
[19]  Kalbfleisch J.D and Prentice R.L. (1980). The Statistical Analysis of Failure Time Data: Wiley, New York.
[20]  Collett D (2003). Modeling Survival Data in Medical Research, Chapman and Hall, London.
[21]  Gadpayle AK, Kumar N, Duggal A et al,. (2012) Survival trend and prognostic outcome of AIDS patients according to age, sex, stages and mode of transmission-A retrospective study at ART centre of a tertiary care hospital. JIACM, 13(4): 291-8.
[22]  Bachani D, Garg R, Rewari BB, et al.(2010). Two year treatment outcomes of patients enrolled in India's national first-line antiretroviral therapy programme, National Medical Journal of India. 23(1):7-12.
[23]  Ghate M, Deshpande S, Tripathy S, et al.(2011). Mortality in HIV infected individuals in Pune, India. Indian J Med Res. 133:414-420.
[24]  Antiretroviral Therapy Cohort Collaboration.(2008). Life expectancy of individuals on combination antiretroviral therapy in high income countries: a collaborative analysis of 14 cohort studies. Lancet, 372(9635): 293-9.
[25]  Farzadegan H, Hoover DR, Astemborski J, et al.(1998). Sex differences in HIV1 viral load and progression to AIDS. Lancet. 352:1510-1514.
[26]  Donnelly CA, Bartley LM, Ghani AC, et al.(2005). Gender differences in HIV1 RNA viral loads. HIV Medicine. 6(3):170-178.
[27]  Remafedi G, Lauer T.(1995). Survival trends in adolescents with human immunodeficiency virus infection. Arch Pediatr Adolesc Med. 149(10):1093-8.
[28]  Antiretroviral Therapy Cohort Collaboration.(2009). Timing of initiation of antiretroviral therapy in AIDS-free HIV-1 infected patients: a collaborative analysis of 18-HIV cohort studies. Lancet 6736(09) 60612-7.
[29]  May M, Sterne JA, Sabin C et al.,(2007). Prognosis of HIV-1 infected patients up to 5 years after initiation of HAART: collaborative analysis of prospective studies. AIDS 21; 1185-97.
[30]  Johannessen A, Naman E, Ngowi EN et al.,(2008). Predictors of mortality in HIV-infected patients starting antiretroviral therapy in a rural hospital in Tanzania, BMC Infectious Disease, 8: 52.
[31]  Mocroft A, Kirk O, Barton SE et al.,(1999). Anaemia is an independent predictive marker for clinical prognosis in HIV-infected patients from across Europe. EuroSIDA study group, AIDS, 13: 943-950.
[32]  Kay R and Kinnersley N (2002). On the use of accelerated failure time model as an alternative to the proportional hazard model in the treatment of time to event data: a case study in influenza, Drug information Journal, vol-36, pp; 571-579.
[33]  Orbe J, Ferreira E and Nunez-Anton V,(2002). Comparing proportional hazards and accelerated failure time models for survival analysis. Statist Med, 21; 3493-3510.
[34]  Nardi A and Schemper M (2003). Comparing Cox and Parametric models in clinical studies, Statist. Med, 22: 3597-3610.