© 2001 by Biometrika Trust
An information matrix test for logistic regression models based on case-control data
Department of Mathematics, The University of Toledo, Toledo, Ohio 43606, U.S.A. bzhang@math.utoledo.edu
We propose an information-matrix-based goodness-of-fit statistic for testing the validity of the logistic regression model with case-control data. The statistic extends the information matrix test of White (1982), which detects misspecification of a one-sample parametric model, to the semiparametric profile likelihood setting under a two-sample semiparametric model that is equivalent to the assumed logistic regression model. The proposed test statistic requires a high-dimensional matrix inversion, but is otherwise easily computed and has an asymptotic chi-squared distribution. It is an alternative to the Kolmogorov-Smirnov-type statistic of Qin & Zhang (1997) and the chi-squared-type statistic of Zhang (1999): it requires neither a bootstrap method to evaluate its critical values nor the grouping of the combined sample data into a finite number of mutually exclusive categories, even when the underlying population distribution is continuous. We demonstrate that the proposed test statistic and its asymptotic distribution may be obtained by fitting the prospective logistic regression model to case-control data. We present simulation results and analyses of three real datasets.
Key Words: Biased sampling problem; Case-control data; Chi-squared; Consistency; Fisher information matrix; Moore-Penrose generalised inverse; Local alternative; Mixture sampling; Profile likelihood; Score derivative matrix; Squared score matrix
Received October 1999. Revised August 2000