Biometrika Advance Access originally published online on November 25, 2007
Biometrika 2008 95(1):93-106; doi:10.1093/biomet/asm079
Flexible generalized t-link models for binary response data
Department of Statistics, University of Connecticut, 215 Glenbrook Road, U-4120, Storrs, Connecticut 06269, U.S.A. sdkim{at}stat.uconn.edu mhchen{at}stat.uconn.edu dey{at}stat.uconn.edu
Received for publication 1 September 2006. Revision received 1 May 2007.
A critical issue in modelling binary response data is the choice of link. We introduce a new link based on the generalized t-distribution. The generalized t-link has two parameters: one purely controls the heaviness of the tails of the link and the other controls its scale. The generalized t-links offer two major advantages. First, a symmetric generalized t-link with an unknown shape parameter is much more identifiable than a Student t-link with unknown degrees of freedom and a known scale parameter. Secondly, skewed generalized t-links with both shape and scale parameters unknown provide much more flexible and improved skewed link regression models than the existing skewed links. Various theoretical properties and attractive features of the proposed links are examined in detail. An efficient Markov chain Monte Carlo algorithm is developed for sampling from the posterior distribution. The deviance information criterion is used to guide the choice of link. The proposed methodology is motivated and illustrated by prostate cancer data.
Key Words: Latent variable; Logistic regression; Markov chain Monte Carlo; Mixed-effects model; Probit link; Posterior distribution; Robit link.
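As a rough illustration of the two-parameter link described in the abstract, the sketch below evaluates a symmetric generalized t-link numerically. It is not taken from the paper: it assumes one common parameterization, in which a generalized t variable equals sqrt(scale / shape) times a Student t variable with `shape` degrees of freedom, and the names `t_pdf`, `t_cdf` and `gt_link_prob` are hypothetical. The CDF is computed by Simpson's rule so that only the Python standard library is needed.

```python
import math


def t_pdf(x, df):
    """Density of the Student t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """Student t CDF via Simpson's rule on [0, |x|], using symmetry about 0."""
    a = abs(x)
    if a == 0.0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def gt_link_prob(eta, shape, scale):
    """P(y = 1) for linear predictor eta under a symmetric generalized t-link.

    ASSUMPTION (not stated in the abstract): the generalized t variable is
    sqrt(scale / shape) times a Student t variable with `shape` degrees of
    freedom, so its CDF is the Student t CDF at eta * sqrt(shape / scale).
    """
    return t_cdf(eta * math.sqrt(shape / scale), shape)


if __name__ == "__main__":
    # Smaller shape values give heavier tails, so the response probability
    # approaches 0 or 1 more slowly as |eta| grows.
    for shape in (1, 7, 30):
        print(shape, gt_link_prob(-4.0, shape, scale=1.0))
```

Under this assumed parameterization, setting `scale` equal to `shape` recovers the Student t-link (the robit link), and a large common value approximates the probit link.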