Model checking in regression via dimension reduction
Department of Statistics and Applied Probability, National University of Singapore, 117546, Singapore staxyc{at}stat.nus.edu.sg
Received for publication 1 December 2006. Revision received 1 March 2008.
Lack-of-fit checking for parametric and semiparametric models is essential in reducing misspecification. The efficiency of most existing model-checking methods drops rapidly as the dimension of the covariates increases. We propose to check a model by projecting the fitted residuals along a direction that adapts to the systematic departure of the residuals from the desired pattern. Consistency of the method is proved for parametric and semiparametric regression models. A bootstrap implementation is also discussed. Simulation comparisons with several existing methods are made, suggesting that the proposed methods are more efficient than the existing methods when the dimension increases. Air pollution data from Chicago are used to illustrate the procedure.
Key Words: Bootstrap; Cross-validation; Goodness-of-fit; Kernel smoothing; Semiparametric model; Single-index model
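The idea in the abstract, projecting fitted residuals along a data-adaptive direction and calibrating the check by bootstrap, can be illustrated with a minimal sketch. This is not the paper's estimator: it assumes a linear null model, picks the direction by a crude search over candidate unit vectors, uses a leave-one-out Gaussian kernel smooth with a rule-of-thumb bandwidth, and calibrates with a wild bootstrap using random sign flips. All function names and tuning values are hypothetical.

```python
import numpy as np

def ols_residuals(X, y):
    """Fit a linear model with intercept; return fitted values and residuals."""
    Z = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    fitted = Z @ beta
    return fitted, y - fitted

def projected_stat(resid, X, theta):
    """Mean of e_i times a leave-one-out kernel smooth of e along theta'x."""
    z = X @ theta
    h = 1.06 * z.std() * len(z) ** (-0.2)        # rule-of-thumb bandwidth (assumption)
    K = np.exp(-0.5 * ((z[:, None] - z[None, :]) / h) ** 2)
    np.fill_diagonal(K, 0.0)                     # leave observation i out of its own smooth
    smooth = (K @ resid) / np.maximum(K.sum(axis=1), 1e-12)
    return float(np.mean(resid * smooth))

def best_direction(resid, X, directions):
    """Return the candidate direction with the largest statistic, and that value."""
    stats = [projected_stat(resid, X, th) for th in directions]
    k = int(np.argmax(stats))
    return directions[k], stats[k]

def lack_of_fit_pvalue(X, y, n_dir=20, n_boot=99, seed=0):
    """Wild-bootstrap p-value for the linear null, plus the selected direction."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    D = rng.standard_normal((n_dir, p))
    # Candidate set: coordinate axes plus random unit vectors (a crude stand-in
    # for properly estimating the direction).
    D = np.vstack([np.eye(p), D / np.linalg.norm(D, axis=1, keepdims=True)])
    fitted, resid = ols_residuals(X, y)
    theta, t_obs = best_direction(resid, X, D)
    exceed = 0
    for _ in range(n_boot):
        v = rng.choice([-1.0, 1.0], size=len(y))           # random sign flips
        _, r_star = ols_residuals(X, fitted + resid * v)   # refit under the null
        _, t_star = best_direction(r_star, X, D)           # repeat the direction search
        exceed += int(t_star >= t_obs)
    return (1 + exceed) / (n_boot + 1), theta
```

Repeating the direction search inside each bootstrap replicate matters: the observed statistic is a maximum over candidates, so the reference distribution has to be a maximum as well. A small p-value suggests that the residuals still carry structure along the selected direction.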