Biometrika Advance Access originally published online on May 14, 2007
Biometrika 2007 94(2):297-311; doi:10.1093/biomet/asm036
Copyright © 2007 Biometrika Trust

Articles

Model evaluation based on the sampling distribution of estimated absolute prediction error

Lu Tian

Department of Preventive Medicine, Northwestern University Medical School, 680 N. Lake Shore Drive, Chicago, Illinois 60611, U.S.A.

Tianxi Cai

Department of Biostatistics, Harvard University, 655 Huntington Avenue, Boston, Massachusetts 02115, U.S.A.

Els Goetghebeur

Department of Applied Mathematics and Computer Science, Ghent University, Krijgslaan 281-S9, 9000 Ghent, Belgium

L. J. Wei

Department of Biostatistics, Harvard University, 655 Huntington Avenue, Boston, Massachusetts 02115, U.S.A.

lutian{at}northwestern.edu

tcai{at}hsph.harvard.edu

els.goetghebeur{at}ugent.be

wei{at}hsph.harvard.edu

Received for publication 1 November 2005. Revision received 1 October 2006.
Abstract

The construction of a reliable, practically useful prediction rule for future responses is heavily dependent on the ‘adequacy’ of the fitted regression model. In this article, we consider the absolute prediction error, the expected value of the absolute difference between the future and predicted responses, as the model evaluation criterion. This prediction error is easier to interpret than the average squared error and is equivalent to the misclassification error for a binary outcome. We show that the prediction error can be consistently estimated via the resubstitution and crossvalidation methods even when the fitted model is not correctly specified. Furthermore, we show that the resulting estimators are asymptotically normal. When the prediction rule is ‘nonsmooth’, the variance of the above normal distribution can be estimated well with a perturbation-resampling method. With two real examples and an extensive simulation study, we demonstrate that the interval estimates obtained from the above normal approximation for the prediction errors provide much more information about model adequacy than their point-estimate counterparts.

Key Words: 0.632 bootstrap • Bootstrap • K-fold crossvalidation • Model and variable selection • Perturbation-resampling • Prediction
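
As an illustration of the two point estimators named in the abstract, the following is a minimal sketch of the resubstitution and K-fold crossvalidation estimates of the absolute prediction error. It assumes a single covariate and a linear working model fitted by ordinary least squares; these choices, and the function names `fit_line`, `resubstitution_error` and `kfold_cv_error`, are illustrative assumptions and are not taken from the article. The perturbation-resampling variance estimate and the resulting interval estimates are not reproduced here.

```python
import random
from statistics import fmean


def fit_line(x, y):
    """Least-squares intercept and slope for one covariate (assumed working model)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def resubstitution_error(x, y):
    """Mean absolute error when the model is fitted and evaluated on the same data."""
    a, b = fit_line(x, y)
    return fmean(abs(yi - (a + b * xi)) for xi, yi in zip(x, y))


def kfold_cv_error(x, y, k=5, seed=0):
    """Mean absolute error where each observation is predicted by a model
    fitted without the fold that contains it."""
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)          # random assignment to folds
    folds = [idx[i::k] for i in range(k)]     # k roughly equal folds
    total = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        total += sum(abs(y[i] - (a + b * x[i])) for i in fold)
    return total / n
```

For a 0/1 outcome with a 0/1 prediction rule, the absolute difference is 1 exactly when the prediction is wrong, so the same averages reduce to misclassification rates, which matches the equivalence stated in the abstract.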


This article has been cited by other articles:


T. Cai, L. Tian, S. D. Solomon, and L.J. Wei
Predicting future responses based on possibly mis-specified working models
Biometrika, March 1, 2008; 95(1): 75 - 92.