Biometrika Advance Access originally published online on February 28, 2007
Biometrika 2007 94(1):49-60; doi:10.1093/biomet/asm001
Copyright © 2007 Biometrika Trust
Articles
Fuzzy p-values in latent variable problems
Department of Statistics, University of Washington, Seattle, Washington 98195-4322, U.S.A.
School of Statistics, University of Minnesota, 313 Ford Hall, 224 Church Street S.E., Minneapolis, Minnesota 55455, U.S.A.
thompson{at}stat.washington.edu
charlie{at}stat.umn.edu
Received for publication 1 June 2005.
Revision received 1 May 2006.
Abstract
We consider the problem of testing a statistical hypothesis where the scientifically meaningful test statistic is a function of latent variables. In particular, we consider detection of genetic linkage, where the latent variables are patterns of inheritance at specific genome locations. Introduced by Geyer & Meeden (2005), fuzzy p-values are random variables, described by their probability distributions, that are interpreted as p-values. For latent variable problems, we introduce the notion of a fuzzy p-value as having the conditional distribution of the latent p-value given the observed data, where the latent p-value is the random variable that would be the p-value if the latent variables were observed.
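In symbols, this definition might be written as follows. The notation is an assumption introduced here for illustration and does not come from the abstract: X denotes the latent variables, Y the observed data with observed value y, W the test statistic (with large values taken as evidence against the null hypothesis), and X* an independent draw of the latent variables under the null hypothesis. Ties in a discrete W, which would call for randomization, are ignored in this sketch.

```latex
% Latent p-value: the p-value that would be reported if x were observed.
\[
  \pi(x) \;=\; \Pr\nolimits_{0}\bigl\{\, W(X^{*}) \ge W(x) \,\bigr\}
\]
% Fuzzy p-value: a random variable P whose distribution is the
% null-hypothesis conditional distribution of \pi(X) given Y = y.
\[
  \Pr(P \le t) \;=\; \Pr\nolimits_{0}\bigl\{\, \pi(X) \le t \;\big|\; Y = y \,\bigr\},
  \qquad 0 \le t \le 1 .
\]
```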
The fuzzy p-value provides an exact test using two sets of simulations of the latent variables under the null hypothesis, one unconditional and the other conditional on the observed data. It provides not only an expression of the strength of the evidence against the null hypothesis but also an expression of the uncertainty in that expression owing to lack of knowledge of the latent variables. We illustrate these features with an example of simulated data mimicking a real example of the detection of genetic linkage.
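The two-simulation idea can be sketched on a toy problem. Everything in the sketch is an assumption made for illustration rather than the authors' implementation: the function names `fuzzy_p_value` and `simulate_toy` are hypothetical, the Gaussian model stands in for inheritance patterns, direct sampling stands in for Markov chain Monte Carlo, and the plain empirical tail proportion omits any randomization needed for an exact test.

```python
import random
from bisect import bisect_left


def fuzzy_p_value(null_stats, conditional_stats):
    """Empirical fuzzy p-value: one latent p-value per conditional draw.

    null_stats: test statistics from latent variables simulated
        unconditionally under the null hypothesis.
    conditional_stats: test statistics from latent variables simulated
        under the null hypothesis, conditional on the observed data.
    Large statistics are assumed to count as evidence against the null.
    """
    null_sorted = sorted(null_stats)
    m = len(null_sorted)
    p_values = []
    for w in conditional_stats:
        # Count unconditional draws whose statistic is >= w.
        n_ge = m - bisect_left(null_sorted, w)
        p_values.append(n_ge / m)
    return sorted(p_values)


def simulate_toy(y, noise_sd=1.0, n_draws=2000, seed=0):
    """Toy model (an assumption): X_i ~ N(0, 1) under the null and
    Y_i = X_i + N(0, noise_sd^2) noise; the statistic is the mean of X."""
    rng = random.Random(seed)
    n = len(y)
    var = noise_sd ** 2
    # Set 1: unconditional simulation of the latent statistic.
    null_stats = [
        sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n for _ in range(n_draws)
    ]
    # Set 2: conditional simulation, using the conjugate result
    # X_i | Y_i = y_i ~ N(y_i / (1 + var), var / (1 + var)).
    shrink = 1.0 / (1.0 + var)
    post_sd = (var / (1.0 + var)) ** 0.5
    cond_stats = [
        sum(rng.gauss(shrink * yi, post_sd) for yi in y) / n
        for _ in range(n_draws)
    ]
    return null_stats, cond_stats


if __name__ == "__main__":
    null_stats, cond_stats = simulate_toy([0.8, 1.4, -0.2, 1.1, 0.9])
    p = fuzzy_p_value(null_stats, cond_stats)
    # Summarize the distribution by its quartiles rather than one number.
    print([p[len(p) // 4], p[len(p) // 2], p[3 * len(p) // 4]])
```

The returned list is a sample from the fuzzy p-value rather than a single number: its location indicates the strength of the evidence against the null hypothesis, and its spread indicates how much that assessment is blurred by not observing the latent variables.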
Key Words: allele sharing; genetic linkage; genetic mapping; identity by descent; Markov chain Monte Carlo; randomized test