Model diagnostic tests for selecting informative correlation structure in correlated data
Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, Illinois 61820, U.S.A. anniequ{at}illinois.edu
Department of Biostatistics, University of Texas, M.D. Anderson Cancer Center, Houston, Texas 77030, U.S.A. jjlee{at}mdanderson.org
Department of Statistics, The Pennsylvania State University, Pennsylvania 16802, U.S.A. bgl{at}psu.edu
Received for publication 1 August 2006. Revision received 1 March 2008.
In the generalized method of moments approach to longitudinal data analysis, unbiased estimating functions can be constructed to incorporate both the marginal mean and the correlation structure of the data. Increasing the number of parameters in the correlation structure corresponds to increasing the number of estimating functions, so building a correlation model is equivalent to selecting estimating functions. This paper proposes a chi-squared test for choosing informative unbiased estimating functions. We show that this methodology is useful for identifying which sources of correlation are important to incorporate when there are multiple possible sources. The method can also be applied to determine the optimal working correlation for the generalized estimating equation approach.
Key Words: Cancer prevention; Chi-squared test; Generalized estimating equation; Generalized method of moments; Goodness-of-fit test; Information matrix test; Model selection; Quadratic inference function; Working correlation