On distribution‐weighted partial least squares with diverging number of highly correlated predictors
Author(s) - Zhu LiPing, Zhu LiXing
Publication year - 2009
Publication title - Journal of the Royal Statistical Society: Series B (Statistical Methodology)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 6.523
H-Index - 137
eISSN - 1467-9868
pISSN - 1369-7412
DOI - 10.1111/j.1467-9868.2008.00697.x
Subject(s) - mathematics, least squares function approximation, estimator, generalized least squares, asymptotic distribution, covariance, partial least squares regression, total least squares, statistics, dimension (graph theory), non linear least squares, regression, combinatorics
Summary. Because highly correlated data arise in many scientific fields, we investigate parameter estimation in a semiparametric regression model with a diverging number of highly correlated predictors. For this, we first develop a distribution‐weighted least squares estimator that can recover directions in the central subspace. We then use the distribution‐weighted least squares estimator as a seed vector and project it onto a Krylov space by partial least squares to avoid computing the inverse of the covariance of the predictors. Thus, distribution‐weighted partial least squares can handle cases with high dimensional and highly correlated predictors. Furthermore, we suggest an iterative algorithm for obtaining a better initial value before implementing partial least squares. For theoretical investigation, we obtain strong consistency and asymptotic normality when the dimension $p$ of the predictors grows at the rates $O\{n^{1/2}/\log(n)\}$ and $o(n^{1/3})$ respectively, where $n$ is the sample size. When there are no other constraints on the covariance of the predictors, the rates $n^{1/2}$ and $n^{1/3}$ are optimal. We also propose a Bayesian information criterion type of criterion to estimate the dimension of the Krylov space in the partial least squares procedure. Illustrative examples with a real data set and comprehensive simulations demonstrate that the method is robust to non‐ellipticity and works well even in ‘small $n$–large $p$’ problems.
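
The projection step described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the seed vector is the sample covariance between the predictors and the empirical distribution function of the response (one reading of "distribution‐weighted" that the summary does not spell out), and that the Krylov dimension `q` is supplied by the user rather than chosen by the proposed criterion. The function name `dwpls_sketch` and all variable names are hypothetical.

```python
import numpy as np

def dwpls_sketch(X, y, q):
    """Project a seed vector onto a q-dimensional Krylov space (illustrative only).

    Assumption: the seed is s = Cov(X, F_n(y)), with F_n the empirical
    distribution function of y. Only a q x q system is solved, so the
    p x p covariance matrix is never inverted.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)                    # centre the predictors
    S = Xc.T @ Xc / n                          # sample covariance, p x p

    # Empirical distribution function of y evaluated at each y_i (rank / n).
    Fy = (np.argsort(np.argsort(y)) + 1) / n
    s = Xc.T @ (Fy - Fy.mean()) / n            # assumed seed vector

    # Krylov matrix [s, S s, ..., S^(q-1) s].
    cols = [s]
    for _ in range(q - 1):
        cols.append(S @ cols[-1])
    K = np.column_stack(cols)

    # Orthonormalise for numerical stability; the spanned space is unchanged.
    Q, _ = np.linalg.qr(K)

    # Estimate restricted to the Krylov space: Q (Q' S Q)^(-1) Q' s.
    coef = np.linalg.solve(Q.T @ S @ Q, Q.T @ s)
    return Q @ coef
```

When `q` equals the number of predictors and the Krylov matrix has full rank, this sketch reduces to solving the full system `S b = s`; smaller values of `q` restrict the estimate to a lower dimensional subspace, which is what makes the approach usable when the sample covariance is ill-conditioned.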