Biometrika Advance Access originally published online on January 31, 2008
Biometrika 2008 95(1):187-204; doi:10.1093/biomet/asm098
© 2008 Biometrika Trust

Two-stage sampling from a prediction point of view when the cluster sizes are unknown

Jan F. Bjørnstad

Division for Statistical Methods and Standards, Statistics Norway, P.O. Box 8131 Dep, N-0033 Oslo, Norway. jab@ssb.no

Elinor Ytterstad

Department of Mathematics and Statistics, University of Tromsø, N-9037 Tromsø, Norway. Elinor.Ytterstad@matnat.uit.no

Received for publication 1 March 2005. Revision received 1 June 2007.
Abstract

We consider the problem of estimating the population total in two-stage cluster sampling when cluster sizes are known only for the sampled clusters, making use of a population model arising from a variance component model. The problem can be considered as one of predicting the unobserved part Z of the total, and the concept of predictive likelihood is studied. Prediction intervals and a predictor for the population total are derived for the normal case, based on predictive likelihood. For a more general distribution-free model, by application of an analysis of variance approach instead of maximum likelihood for parameter estimation, the predictor obtained from the predictive likelihood is shown to be approximately uniformly optimal for large sample size and large number of clusters, in the sense of uniformly minimizing the mean-squared error in a partially linear class of model-unbiased predictors. Three prediction intervals for Z based on three similar predictive likelihoods are studied. For a small number n0 of sampled clusters, they differ significantly, but for large n0, the three intervals are practically identical. Model-based and design-based coverage properties of the prediction intervals are studied based on a comprehensive simulation study. The simulation study indicates that for large sample sizes, the coverage measures achieve approximately the nominal level 1 − α and are slightly less than 1 − α for moderately large sample sizes. For small sample sizes, the coverage measures are about 1 − 2α, being raised to 1 − α for a modified interval based on the [formula] distribution.

Key Words: Optimal predictor • Population model • Prediction interval • Predictive likelihood • Simulation • Survey sampling
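
To make the setting in the abstract concrete, the following is a minimal Python sketch of two-stage cluster sampling under a variance component model, y_ij = mu + a_i + e_ij, where cluster sizes become known only for the sampled clusters. It is not the predictive-likelihood predictor of the paper: the predictor shown is a simple expansion-type stand-in, and the function names, the cluster-size range and all parameter values are assumptions made purely for illustration.

```python
import random


def simulate_population(n_clusters, mu, sigma_a, sigma_e, rng):
    """Generate clusters from y_ij = mu + a_i + e_ij (normal components).

    The cluster-size range 5..30 is a hypothetical choice for this sketch.
    """
    population = []
    for _ in range(n_clusters):
        size = rng.randint(5, 30)
        a_i = rng.gauss(0.0, sigma_a)  # cluster-level random effect
        population.append([mu + a_i + rng.gauss(0.0, sigma_e) for _ in range(size)])
    return population


def two_stage_sample(population, n0, m, rng):
    """Stage 1: simple random sample of n0 clusters.
    Stage 2: simple random sample of up to m units in each sampled cluster.
    A cluster's size is recorded only when the cluster is sampled.
    """
    chosen = rng.sample(range(len(population)), n0)
    sample = []
    for i in chosen:
        cluster = population[i]
        units = rng.sample(cluster, min(m, len(cluster)))
        sample.append((len(cluster), units))
    return sample


def predict_total(sample, n_clusters):
    """Observed sum plus a simple prediction of the unobserved part Z.

    Illustrative stand-in only, not the paper's predictor.
    """
    observed = sum(sum(units) for _, units in sample)
    # Unobserved units in sampled clusters: predicted by the cluster sample mean.
    within = sum(
        (size - len(units)) * (sum(units) / len(units)) for size, units in sample
    )
    # Nonsampled clusters (sizes unknown): predicted by the mean estimated cluster total.
    est_totals = [size * sum(units) / len(units) for size, units in sample]
    outside = (n_clusters - len(sample)) * (sum(est_totals) / len(sample))
    return observed + within + outside


if __name__ == "__main__":
    rng = random.Random(1)
    pop = simulate_population(200, mu=10.0, sigma_a=2.0, sigma_e=1.0, rng=rng)
    smp = two_stage_sample(pop, n0=20, m=5, rng=rng)
    print("true total:     ", sum(sum(c) for c in pop))
    print("predicted total:", predict_total(smp, len(pop)))
```

Splitting the prediction into an observed part, a within-cluster part and a nonsampled-cluster part mirrors the abstract's view of the problem as predicting the unobserved part Z of the total. The last term is where the unknown cluster sizes enter.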

