© 1981 by Biometrika Trust
Estimation of the odds ratio in the two-armed bandit problem
Department of Statistics, Temple University, Philadelphia
School of Operations Research and Industrial Engineering, Cornell University, Ithaca, New York
Asymptotically optimal sequential procedures are proposed for fixed-width interval estimation and for point estimation of the log odds ratio of two Bernoulli populations. The costs of observations may differ between the two populations and may depend on the success probabilities. The interval estimation procedure is studied by simulation for two sampling cost structures of particular interest: when the goal is to minimize the expected total sample size, and when the goal is to minimize the expected total number of failures before termination. Approximate expressions for the savings in sampling cost from adaptive rather than pairwise sampling show that such savings can be substantial in some cases. The final section considers the multiarmed bandit problem.
Key Words: Adaptive sampling; Binomial population; Odds ratio; Sequential estimation; Two-armed bandit.
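As a rough illustration of why unequal allocation can reduce sampling cost, the sketch below uses the standard large-sample variance of the estimated log odds ratio, 1/(n1 p1 q1) + 1/(n2 p2 q2) with q = 1 - p, and the allocation ratio n1/n2 that minimizes the linear cost c1 n1 + c2 n2 at a fixed value of that variance. It is not the paper's sequential procedure: the function names, the cost parameters `c1` and `c2`, and the reading of the failure-minimizing goal as a per-observation cost c_i = q_i are assumptions made here for illustration, and the success probabilities are treated as known, whereas an adaptive procedure would have to estimate them as sampling proceeds.

```python
import math


def log_odds_ratio(p1, p2):
    """Log odds ratio of two Bernoulli success probabilities."""
    return math.log(p1 / (1 - p1)) - math.log(p2 / (1 - p2))


def approx_variance(p1, p2, n1, n2):
    """Large-sample variance of the estimated log odds ratio."""
    return 1.0 / (n1 * p1 * (1 - p1)) + 1.0 / (n2 * p2 * (1 - p2))


def allocation_ratio(p1, p2, c1=1.0, c2=1.0):
    """Ratio n1/n2 minimizing c1*n1 + c2*n2 at a fixed approx_variance.

    Derived by a Lagrange-multiplier argument; c1 and c2 are
    hypothetical per-observation costs (assumption, not from the paper).
    """
    return math.sqrt((c2 * p2 * (1 - p2)) / (c1 * p1 * (1 - p1)))


def failure_cost_ratio(p1, p2):
    """Same ratio when each observation costs its failure probability
    (assumed reading of the 'expected number of failures' criterion)."""
    return allocation_ratio(p1, p2, c1=1 - p1, c2=1 - p2)
```

For example, `allocation_ratio(0.5, 0.9)` gives the ratio under equal unit costs, and `failure_cost_ratio(0.5, 0.9)` gives the ratio under the assumed failure-based costs; pairwise sampling corresponds to fixing the ratio at 1.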