A generalized Dantzig selector with shrinkage tuning
Marshall School of Business, University of Southern California, Los Angeles, California 90089, U.S.A. gareth{at}usc.edu radchenk{at}marshall.usc.edu
Received for publication 1 July 2007. Revision received 1 August 2008.
The Dantzig selector performs variable selection and model fitting in linear regression. It uses an L1 penalty to shrink the regression coefficients towards zero, in a similar fashion to the lasso. While both the lasso and Dantzig selector potentially do a good job of selecting the correct variables, they tend to overshrink the final coefficients. This results in an unfortunate trade-off. One can either select a high shrinkage tuning parameter that produces an accurate model but poor coefficient estimates or a low shrinkage parameter that produces more accurate coefficients but includes many irrelevant variables. We extend the Dantzig selector to fit generalized linear models while eliminating overshrinkage of the coefficient estimates, and develop a computationally efficient algorithm, similar in nature to least angle regression, to compute the entire path of coefficient estimates. A simulation study illustrates the advantages of our approach relative to others. We apply the methodology to two datasets.
Key Words: Dantzig selector; DASSO; Double Dantzig; Generalized linear model; Interpolated Dantzig; Lasso; Ridge Dantzig; Variable selection
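For readers unfamiliar with the baseline method, the following is a minimal sketch of the standard linear-regression Dantzig selector of Candès and Tao, which minimizes the L1 norm of the coefficients subject to a bound on the largest absolute correlation between the predictors and the residuals. It is not the generalized method or the path algorithm described in the abstract. The function name `dantzig_selector` and the parameter name `lam` are hypothetical choices for illustration, and the problem is solved as a generic linear program with SciPy rather than with any algorithm from the article.

```python
import numpy as np
from scipy.optimize import linprog


def dantzig_selector(X, y, lam):
    """Illustrative Dantzig selector for linear regression (hypothetical helper).

    Solves  min ||beta||_1  subject to  ||X^T (y - X beta)||_inf <= lam
    by writing beta = u - v with u, v >= 0 and solving the resulting LP.
    """
    n, p = X.shape
    G = X.T @ X          # Gram matrix, p x p
    c = X.T @ y          # correlations of predictors with the response

    # Objective: sum(u) + sum(v), which equals ||beta||_1 at the optimum.
    cost = np.ones(2 * p)

    # Two-sided constraint  -lam <= c - G (u - v) <= lam, written as A_ub x <= b_ub:
    #    G (u - v) <= c + lam
    #   -G (u - v) <= lam - c
    A_ub = np.vstack([np.hstack([G, -G]), np.hstack([-G, G])])
    b_ub = np.concatenate([c + lam, lam - c])

    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)

    # Recover the signed coefficients from their positive and negative parts.
    return res.x[:p] - res.x[p:]
```

Larger values of `lam` shrink the coefficients more aggressively; once `lam` is at least the largest absolute entry of `X.T @ y`, the all-zero vector is feasible and is returned. This is the shrinkage trade-off the abstract refers to: a single tuning parameter controls both which variables enter the model and how strongly their coefficients are shrunk.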