© 1998 by Biometrika Trust
Locally efficient estimation of the survival distribution with right-censored data and covariates when collection of data is delayed
Division of Biostatistics, University of California, Berkeley, California 94720, U.S.A. laan@stat.berkeley.edu
Division of Biostatistics, University of California, Berkeley, California 94720, U.S.A. hubbard@stat.berkeley.edu
For many sources of survival data, there is a delay between the recording of vital status and its availability to the analyst, and the Kaplan-Meier estimator is typically inconsistent in these situations. In this paper we identify the optimal estimation problem. As a result of the curse of dimensionality, no globally efficient nonparametric estimator exists with good practical performance at moderate sample sizes. Following the approach of Robins & Rotnitzky (1992), given a correctly specified model for the hazard of censoring conditional on the delay process and the survival time T, we propose a closed-form one-step estimator of the distribution of T whose asymptotic variance attains the efficiency bound, if we can correctly specify a lower-dimensional working model for the conditional distribution of T given the ascertainment process. The estimator remains consistent and asymptotically normal even if this latter submodel is misspecified. In particular, if we choose as working model independence between T and the ascertainment process, then the estimator is efficient when this holds and remains consistent and asymptotically linear otherwise. Moreover, we incorporate in our data structure a covariate process that is observed during the follow-up time and is reported with the same delays. We propose closed-form locally efficient estimators of the type described above which use all the data and allow for dependent censoring.
Key Words: Asymptotically efficient; Asymptotically linear estimator; Cox proportional hazards model; Influence curve; Right-censored data