Biometrika Advance Access originally published online on November 25, 2007
Biometrika 2008 95(1):75-92; doi:10.1093/biomet/asm078

© 2007 Biometrika Trust

Predicting future responses based on possibly mis-specified working models

Tianxi Cai

Department of Biostatistics, Harvard University, Boston, Massachusetts 02115, U.S.A. tcai{at}hsph.harvard.edu

Lu Tian

Department of Preventive Medicine, Northwestern University, Chicago, Illinois 60611, U.S.A. lutian{at}northwestern.edu

Scott D. Solomon

Cardiovascular Division, Brigham & Women's Hospital, Boston, Massachusetts 02115, U.S.A. ssolomon{at}rics.bwh.harvard.edu

L.J. Wei

Department of Biostatistics, Harvard University, Boston, Massachusetts 02115, U.S.A. wei{at}sdac.harvard.edu

Received for publication 1 July 2006. Revision received 1 May 2007.
Abstract

Under a general regression setting, we propose an optimal unconditional prediction procedure for future responses. The resulting prediction intervals or regions have a desirable average coverage level over a set of covariate vectors of interest. When the working model is not correctly specified, the traditional conditional prediction method is generally invalid. On the other hand, one can empirically calibrate the above unconditional procedure and also obtain its crossvalidated counterpart. Various large and small sample properties of these unconditional methods are examined analytically and numerically. We find that the K-fold crossvalidated procedure performs exceptionally well even for cases with rather small sample sizes. The new proposals are illustrated with two real examples, one with a continuous response and the other with a binary outcome.

Key Words: Heterogeneous regression • K-fold crossvalidation • Mis-specified regression model • Optimal prediction region • Prediction error rate

