z-logo
Premium
Structured Ordinary Least Squares: A Sufficient Dimension Reduction approach for regressions with partitioned predictors and heterogeneous units
Author(s) -
Liu Yang,
Chiaromonte Francesca,
Li Bing
Publication year - 2017
Publication title -
biometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.298
H-Index - 130
eISSN - 1541-0420
pISSN - 0006-341X
DOI - 10.1111/biom.12579
Subject(s) - ordinary least squares , merge (version control) , computer science , dimension (graph theory) , reduction (mathematics) , dimensionality reduction , computation , sufficient dimension reduction , r package , least squares function approximation , data mining , algorithm , mathematics , machine learning , statistics , computational science , parallel computing , geometry , estimator , pure mathematics
Summary In many scientific and engineering fields, advanced experimental and computing technologies are producing data that are not just high dimensional, but also internally structured. For instance, statistical units may have heterogeneous origins from distinct studies or subpopulations, and features may be naturally partitioned based on experimental platforms generating them, or on information available about their roles in a given phenomenon. In a regression analysis, exploiting this known structure in the predictor dimension reduction stage that precedes modeling can be an effective way to integrate diverse data. To pursue this, we propose a novel Sufficient Dimension Reduction (SDR) approach that we call structured Ordinary Least Squares (sOLS). This combines ideas from existing SDR literature to merge reductions performed within groups of samples and/or predictors. In particular, it leads to a version of OLS for grouped predictors that requires far less computation than recently proposed groupwise SDR procedures, and provides an informal yet effective variable selection tool in these settings. We demonstrate the performance of sOLS by simulation and present a first application to genomic data. The R package “sSDR,” publicly available on CRAN, includes all procedures necessary to implement the sOLS approach.

This content is not available in your region!

Continue researching here.

Having issues? You can contact us here