Biometrika Advance Access published online on May 14, 2007
Biometrika, doi:10.1093/biomet/asm036
Copyright © 2007 Biometrika Trust
Model evaluation based on the sampling distribution of estimated absolute prediction error
Department of Preventive Medicine, Northwestern University Medical School, 680 N. Lake Shore Drive, Chicago, Illinois 60611, U.S.A.
Department of Biostatistics, Harvard University, 655 Huntington Avenue, Boston, Massachusetts 02115, U.S.A.
Department of Applied Mathematics and Computer Science, Ghent University, Krijgslaan 281-S9, 9000 Ghent, Belgium
Department of Biostatistics, Harvard University, 655 Huntington Avenue, Boston, Massachusetts 02115, U.S.A.
lutian{at}northwestern.edu
tcai{at}hsph.harvard.edu
els.goetghebeur{at}ugent.be
wei{at}hsph.harvard.edu
Received for publication 1 November 2005.
Revision received 1 October 2006.
Abstract
The construction of a reliable, practically useful prediction rule for future responses is heavily dependent on the adequacy of the fitted regression model. In this article, we consider the absolute prediction error, the expected value of the absolute difference between the future and predicted responses, as the model evaluation criterion. This prediction error is easier to interpret than the average squared error and is equivalent to the misclassification error for a binary outcome. We show that the prediction error can be consistently estimated via the resubstitution and crossvalidation methods even when the fitted model is not correctly specified. Furthermore, we show that the resulting estimators are asymptotically normal. When the prediction rule is nonsmooth, the variance of the above normal distribution can be estimated well with a perturbation-resampling method. With two real examples and an extensive simulation study, we demonstrate that the interval estimates obtained from the above normal approximation for the prediction errors provide much more information about model adequacy than their point-estimate counterparts.
Key Words: 0.632 bootstrap; Bootstrap; K-fold crossvalidation; Model and variable selection; Perturbation-resampling; Prediction
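To make the two estimators named in the abstract concrete, the following is a minimal sketch of the resubstitution and K-fold crossvalidation estimates of absolute prediction error. It is an illustration under stated assumptions, not the authors' implementation: the working model is taken to be an ordinary least-squares linear fit, the data are simulated, and the function names (`fit_ols`, `predict`, `resubstitution_error`, `kfold_error`) are hypothetical. The perturbation-resampling variance estimate and the interval estimates are not shown.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients for a linear working model with intercept.

    Assumption for illustration: the working model is fitted by ordinary
    least squares. The abstract does not fix a fitting procedure.
    """
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


def predict(beta, X):
    """Predicted responses from the fitted linear working model."""
    design = np.column_stack([np.ones(len(X)), X])
    return design @ beta


def resubstitution_error(X, y):
    """Apparent error: fit on all data, average |y - prediction| on the same data."""
    beta = fit_ols(X, y)
    return float(np.mean(np.abs(y - predict(beta, X))))


def kfold_error(X, y, k=5, seed=0):
    """K-fold crossvalidation: each observation is predicted by a model
    fitted without the fold that contains it."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    abs_errors = np.empty(len(y))
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        beta = fit_ols(X[train_mask], y[train_mask])
        abs_errors[test_idx] = np.abs(y[test_idx] - predict(beta, X[test_idx]))
    return float(np.mean(abs_errors))


# Simulated data (hypothetical): two covariates, standard normal noise.
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 2))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(size=n)

resub = resubstitution_error(X, y)
cv = kfold_error(X, y, k=5)
print(f"resubstitution estimate: {resub:.3f}")
print(f"5-fold crossvalidation estimate: {cv:.3f}")

# For a binary outcome with 0/1 predictions, the absolute prediction error
# coincides with the misclassification rate.
y_bin = np.array([0, 1, 1, 0, 1])
y_hat = np.array([0, 1, 0, 0, 0])
abs_err_bin = float(np.mean(np.abs(y_bin - y_hat)))
misclass = float(np.mean(y_bin != y_hat))
```

The resubstitution estimate reuses the fitting data and so tends to be optimistic, while the crossvalidation estimate evaluates each observation with a fit that excluded it. Both are point estimates; the abstract's main claim concerns interval estimates built around them.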