

Biometrika Advance Access originally published online on January 31, 2008
Biometrika 2008 95(1):187-204; doi:10.1093/biomet/asm098

© 2008 Biometrika Trust

Articles

Two-stage sampling from a prediction point of view when the cluster sizes are unknown

Jan F. Bjørnstad

Division for Statistical Methods and Standards, Statistics Norway, P.O. Box 8131 Dep, N-0033 Oslo, Norway jab{at}ssb.no

Elinor Ytterstad

Department of Mathematics and Statistics, University of Tromsø, N-9037 Tromsø, Norway Elinor.Ytterstad{at}matnat.uit.no

Received for publication 1 March 2005. Revision received 1 June 2007.

We consider the problem of estimating the population total in two-stage cluster sampling when cluster sizes are known only for the sampled clusters, making use of a population model arising from a variance component model. The problem can be considered as one of predicting the unobserved part Z of the total, and the concept of predictive likelihood is studied. Prediction intervals and a predictor for the population total are derived for the normal case, based on predictive likelihood. For a more general distribution-free model, by application of an analysis of variance approach instead of maximum likelihood for parameter estimation, the predictor obtained from the predictive likelihood is shown to be approximately uniformly optimal for large sample size and large number of clusters, in the sense of uniformly minimizing the mean-squared error in a partially linear class of model-unbiased predictors. Three prediction intervals for Z based on three similar predictive likelihoods are studied. For a small number n0 of sampled clusters, they differ significantly, but for large n0, the three intervals are practically identical. Model-based and design-based coverage properties of the prediction intervals are studied based on a comprehensive simulation study. The simulation study indicates that for large sample sizes, the coverage measures achieve approximately the nominal level 1 − α and are slightly less than 1 − α for moderately large sample sizes. For small sample sizes, the coverage measures are about 1 − 2α, being raised to 1 − α for a modified interval based on the Formula distribution.
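The setting described above can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions: values are drawn from a simple variance component model y_ij = mu + a_i + e_ij with normal components, cluster sizes are drawn from a hypothetical uniform range, and the predictor of Z is a naive plug-in rule (mean sampled cluster size times mean of cluster means for the unsampled clusters). It is not the predictive-likelihood predictor studied in the article, and all function names and parameter values are illustrative.

```python
import random
from statistics import fmean


def simulate_population(N, mu, sigma_a, sigma_e, rng):
    """Generate N clusters under y_ij = mu + a_i + e_ij (assumed normal components)."""
    clusters = []
    for _ in range(N):
        M_i = rng.randint(5, 50)          # hypothetical cluster-size range
        a_i = rng.gauss(0.0, sigma_a)     # cluster-level random effect
        clusters.append([mu + a_i + rng.gauss(0.0, sigma_e) for _ in range(M_i)])
    return clusters


def two_stage_sample(clusters, n0, m, rng):
    """Stage 1: draw n0 clusters. Stage 2: draw up to m units within each."""
    chosen = rng.sample(range(len(clusters)), n0)
    return {i: rng.sample(clusters[i], min(m, len(clusters[i]))) for i in chosen}


def naive_predict_Z(sample, sizes, N):
    """Plug-in predictor of the unobserved part Z of the total.

    `sizes` holds M_i for the sampled clusters only; the sizes of the
    unsampled clusters are treated as unknown.
    """
    n0 = len(sample)
    cluster_means = {i: fmean(y) for i, y in sample.items()}
    # Unobserved units inside sampled clusters: predict each by its cluster mean.
    within = sum((sizes[i] - len(sample[i])) * cluster_means[i] for i in sample)
    # Unsampled clusters: replace the unknown size by the mean sampled size.
    between = (N - n0) * fmean(sizes.values()) * fmean(cluster_means.values())
    return within + between


rng = random.Random(0)
pop = simulate_population(N=200, mu=10.0, sigma_a=1.0, sigma_e=2.0, rng=rng)
s = two_stage_sample(pop, n0=30, m=5, rng=rng)
sizes = {i: len(pop[i]) for i in s}        # sizes known only for sampled clusters

observed = sum(sum(y) for y in s.values())
total = sum(sum(c) for c in pop)
Z = total - observed                       # the quantity to be predicted
Z_hat = naive_predict_Z(s, sizes, N=len(pop))
print(f"true Z = {Z:.1f}, naive prediction = {Z_hat:.1f}")
```

The decomposition of the total into an observed sample sum plus an unobserved part Z follows the prediction point of view in the abstract; everything else in the sketch, including the choice of predictor, is a simplifying assumption.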

Key Words: Optimal predictor • Population model • Prediction interval • Predictive likelihood • Simulation • Survey sampling




