Least Angle Regression and LASSO for Large Datasets
Author(s) -
Chris Fraley,
Tim Hesterberg
Publication year - 2009
Publication title -
Statistical Analysis and Data Mining: The ASA Data Science Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.381
H-Index - 33
eISSN - 1932-1872
pISSN - 1932-1864
DOI - 10.1002/sam.10021
Subject(s) - interpretability , lasso (statistics) , regression , residual , feature selection , computer science , regression analysis , computation , stability (learning theory) , partial least squares regression , elastic net regularization , mathematics , linear regression , statistics , data mining , algorithm , machine learning
Least angle regression and LASSO (ℓ1-penalized regression) offer a number of advantages in variable selection applications over procedures such as stepwise or ridge regression, including prediction accuracy, stability, and interpretability. We discuss formulations of these algorithms that extend to datasets in which the number of observations may be so large that the matrix of predictors cannot be accessed as a single unit in computations. Our methods require only a single pass through the data for orthogonal transformation, effectively reducing the dimension of the computations required to obtain the regression coefficients and residual sum of squares to the number of predictors rather than the number of observations. Copyright © 2009 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 1: 000‐000, 2009
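The abstract's key idea is that, after one pass over the data, the regression coefficients and residual sum of squares can be recovered from p-dimensional summaries alone, so no later step needs to revisit the n observations. As a minimal illustration of a single-pass reduction (the paper itself uses an orthogonal, QR-type transformation for better numerical stability; the cross-product accumulation below is a simpler stand-in, and all function and variable names here are hypothetical):

```python
import numpy as np

def streaming_crossprods(chunks, p):
    """One pass over (X, y) chunks, accumulating X'X, X'y, y'y and n.

    Each chunk is a pair (X_block, y_block); the full predictor matrix
    is never held in memory at once.
    """
    XtX = np.zeros((p, p))
    Xty = np.zeros(p)
    yty = 0.0
    n = 0
    for X_block, y_block in chunks:
        XtX += X_block.T @ X_block
        Xty += X_block.T @ y_block
        yty += y_block @ y_block
        n += len(y_block)
    return XtX, Xty, yty, n

def ols_from_crossprods(XtX, Xty, yty):
    """Coefficients and RSS from the p-dimensional summaries alone."""
    beta = np.linalg.solve(XtX, Xty)
    # For the least-squares solution, beta' X'X beta = beta' X'y,
    # so RSS = y'y - beta' X'y.
    rss = yty - beta @ Xty
    return beta, rss

# Example: stream 1000 observations in blocks of 100.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X @ np.arange(1.0, 6.0) + 0.1 * rng.normal(size=1000)
chunks = [(X[i:i + 100], y[i:i + 100]) for i in range(0, 1000, 100)]

XtX, Xty, yty, n = streaming_crossprods(chunks, p=5)
beta, rss = ols_from_crossprods(XtX, Xty, yty)
```

The same p-by-p summaries support penalized fits as well, e.g. solving (X'X + λI)β = X'y for ridge, or running coordinate-descent LASSO on the Gram matrix, again without a second pass over the observations.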