Biometrika Advance Access originally published online on January 31, 2008
Biometrika 2008 95(1):187-204; doi:10.1093/biomet/asm098
Two-stage sampling from a prediction point of view when the cluster sizes are unknown
Division for Statistical Methods and Standards, Statistics Norway, P.O. Box 8131 Dep, N-0033 Oslo, Norway jab{at}ssb.no
Department of Mathematics and Statistics, University of Tromsø, N-9037 Tromsø, Norway Elinor.Ytterstad{at}matnat.uit.no
Received for publication 1 March 2005. Revision received 1 June 2007.
We consider the problem of estimating the population total in two-stage cluster sampling when cluster sizes are known only for the sampled clusters, making use of a population model arising from a variance component model. The problem can be considered as one of predicting the unobserved part Z of the total, and the concept of predictive likelihood is studied. Prediction intervals and a predictor for the population total are derived for the normal case, based on predictive likelihood. For a more general distribution-free model, by application of an analysis of variance approach instead of maximum likelihood for parameter estimation, the predictor obtained from the predictive likelihood is shown to be approximately uniformly optimal for large sample size and large number of clusters, in the sense of uniformly minimizing the mean-squared error in a partially linear class of model-unbiased predictors. Three prediction intervals for Z based on three similar predictive likelihoods are studied. For a small number n0 of sampled clusters, they differ significantly, but for large n0, the three intervals are practically identical. Model-based and design-based coverage properties of the prediction intervals are studied based on a comprehensive simulation study. The simulation study indicates that for large sample sizes, the coverage measures achieve approximately the nominal level 1 − α and are slightly less than 1 − α for moderately large sample sizes. For small sample sizes, the coverage measures are about 1 − 2α, being raised to 1 − α for a modified interval based on the t distribution.
Key Words: Optimal predictor; Population model; Prediction interval; Predictive likelihood; Simulation; Survey sampling