Mean estimating equation approach to analysing cluster-correlated data with nonignorable cluster sizes
1 Business Survey Methods Division, Statistics Canada, Ottawa, Ontario, K1A 0T6, Canada. emmanuel.benhin@statcan.ca
2 School of Mathematics and Statistics, Carleton University, Ottawa, Ontario, K1S 5B6, Canada. jrao@math.carleton.ca
3 Department of Statistics, University of Auckland, Auckland, New Zealand. scott@stat.auckland.ac.nz
Most methods for analysing cluster-correlated biological data implicitly assume the ignorability of cluster sizes. When this assumption fails, the resulting inferences may be asymptotically invalid. Hoffman et al. (2001) proposed a simple but computationally intensive method, based on a large number of within-cluster resamples and associated separate estimating equations, that leads to asymptotically valid inferences whether the cluster sizes are ignorable or not. We study a simple method, based on a single inverse cluster size-weighted estimating equation, that avoids resampling and yet leads to asymptotically valid inferences. Simulation results are presented to assess the performance of the proposed method. We also propose Wald tests for ignorability of cluster sizes.
Key Words: Generalised estimating equation; Logistic regression; Repeated subsampling; Wald test
Received September 2002. Revised November 2004.
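The contrast drawn in the abstract can be illustrated with a minimal sketch for the simplest case: estimating a marginal mean from binary responses, with no covariates. This is an illustration under stated assumptions, not the authors' implementation. The simulated data, the cluster sizes, the success probabilities and all function names are hypothetical. In this intercept-only case the inverse cluster size-weighted estimating equation, sum_i (1/n_i) sum_j (y_ij - mu) = 0, is solved by the average of the cluster means, and the average over many within-cluster resamples (one observation drawn per cluster) should approach the same value. The unweighted estimate, which treats cluster sizes as ignorable, targets a different quantity when size is related to the response.

```python
import random


def simulate_clusters(num_clusters, rng):
    """Simulate binary responses whose success probability depends on cluster size.

    Sizes and probabilities are hypothetical, chosen only to make the
    cluster sizes nonignorable.
    """
    clusters = []
    for _ in range(num_clusters):
        size = rng.choice([1, 2, 8])
        p = 0.2 if size == 8 else 0.6  # large clusters have a lower success rate
        clusters.append([1 if rng.random() < p else 0 for _ in range(size)])
    return clusters


def unweighted_mean(clusters):
    """Solve sum_i sum_j (y_ij - mu) = 0: each observation has equal weight."""
    total = sum(sum(c) for c in clusters)
    count = sum(len(c) for c in clusters)
    return total / count


def inverse_size_weighted_mean(clusters):
    """Solve sum_i (1/n_i) sum_j (y_ij - mu) = 0: each cluster has equal weight."""
    return sum(sum(c) / len(c) for c in clusters) / len(clusters)


def within_cluster_resampling_mean(clusters, num_resamples, rng):
    """Average, over resamples, of the mean of one random draw per cluster."""
    estimates = []
    for _ in range(num_resamples):
        draws = [rng.choice(c) for c in clusters]
        estimates.append(sum(draws) / len(draws))
    return sum(estimates) / num_resamples


rng = random.Random(0)
clusters = simulate_clusters(2000, rng)

naive = unweighted_mean(clusters)
weighted = inverse_size_weighted_mean(clusters)
resampled = within_cluster_resampling_mean(clusters, 200, rng)

print("unweighted:", round(naive, 3))
print("inverse size-weighted:", round(weighted, 3))
print("within-cluster resampling:", round(resampled, 3))
```

Under this simulation design the cluster-level target is (0.6 + 0.6 + 0.2)/3, roughly 0.47, whereas the unweighted estimate is pulled towards the success rate of the large clusters. The single weighted equation needs one pass over the data, while the resampling estimate needs one pass per resample. The regression setting treated in the paper, including the logistic regression case, replaces the mean by a model for the mean response and is not covered by this sketch.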