© 1987 by Biometrika Trust
A multinomial Bayesian approach to the estimation of population and vocabulary size
Econometric Institute, Erasmus University Rotterdam, 3000 DR Rotterdam, The Netherlands
We approach estimation of the size of a population or a vocabulary through a Bayesian analysis of the multinomial distribution. We view the sample as being generated from such a distribution with an unknown number of cells and unknown cell probabilities, and develop a Bayesian procedure to estimate the number of cells and the coverage of the sample. The prior distribution of the number of cells is arbitrary. Given that number, the cell probabilities are assumed to follow a symmetric Dirichlet prior. A two-stage approach is developed for use when the flattening constant of the latter prior cannot be specified in advance. Our procedures are applied to samples of butterflies, insect species and alleles, to the works of Shakespeare and Joyce, and to Eldridge's sample of English words.
Key Words: Bayesian inference; Generalized multinomial distribution; Population size; Vocabulary size
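The model described in the abstract can be illustrated with a small numerical sketch. This is not the paper's procedure, only a minimal example under stated assumptions: the flattening constant `alpha` is treated as known, the prior on the number of cells is truncated at a hypothetical upper bound `n_max`, and the function names are invented for illustration. With k distinct cells observed in a sample of size n, integrating out the cell probabilities under a symmetric Dirichlet prior leaves a posterior for the number of cells N proportional to prior(N) · N!/(N−k)! · Γ(Nα)/Γ(Nα+n). Given N, the posterior expected coverage of the sample is (n + kα)/(n + Nα).

```python
import math


def log_weight(N, k, n, alpha):
    """Log of the N-dependent part of the Dirichlet-multinomial marginal
    likelihood: N!/(N-k)! * Gamma(N*alpha) / Gamma(N*alpha + n)."""
    return (math.lgamma(N + 1) - math.lgamma(N - k + 1)
            + math.lgamma(N * alpha) - math.lgamma(N * alpha + n))


def posterior_over_cells(counts, alpha, n_max, log_prior=None):
    """Posterior over the number of cells N on the grid k..n_max.

    counts    : observed frequencies of the k distinct cells seen
    alpha     : flattening constant of the symmetric Dirichlet prior
                (assumed known here)
    n_max     : hypothetical truncation point for the prior on N
    log_prior : optional function N -> log prior(N); flat if omitted
    """
    k, n = len(counts), sum(counts)
    support = list(range(k, n_max + 1))
    logs = [log_weight(N, k, n, alpha)
            + (log_prior(N) if log_prior is not None else 0.0)
            for N in support]
    m = max(logs)  # subtract the maximum before exponentiating, for stability
    weights = [math.exp(v - m) for v in logs]
    total = sum(weights)
    return {N: w / total for N, w in zip(support, weights)}


def expected_coverage(posterior, k, n, alpha):
    """Posterior mean of the total probability of the observed cells."""
    return sum(p * (n + k * alpha) / (n + N * alpha)
               for N, p in posterior.items())


# Toy data (invented): five distinct cells with these frequencies.
counts = [5, 3, 2, 1, 1]
post = posterior_over_cells(counts, alpha=1.0, n_max=50)
coverage = expected_coverage(post, len(counts), sum(counts), alpha=1.0)
```

The truncation at `n_max` and the flat default prior are conveniences of the sketch; the abstract allows an arbitrary prior on the number of cells, which can be passed in through `log_prior`.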
