Biometrika Advance Access originally published online on January 26, 2009
Biometrika 2009 96(1):37-50; doi:10.1093/biomet/asn069
Partial and latent ignorability in missing-data problems
Ofer Harel
Department of Statistics, University of Connecticut, Storrs, Connecticut 06269, U.S.A. oharel{at}stat.uconn.edu
Joseph L. Schafer
The Methodology Center, The Pennsylvania State University, University Park, Pennsylvania 16802, U.S.A. jls{at}stat.psu.edu
Received for publication 1 March 2006. Revision received 1 June 2008.
When an assumption of missing at random is untenable, it becomes necessary to model missing-data indicators, which carry information about the parameters of the complete-data population. Within a given application, however, researchers may believe that some aspects of missingness are ignorable but others are not. We argue that there are two different ways to formalize the notion that only part of the missingness is ignorable. These approaches correspond to assumptions that we call partially missing at random and latently missing at random. We explain these concepts and apply them in a latent-class analysis of survey questions with item nonresponse.
Key Words: Missing not at random; Multiple imputation; Nonignorable missingness