Effect of Intracluster Correlation on the R‐Square Statistic
Author(s) - Weerakkody Govinda J., Givaruangsawat Sumalee
Publication year - 1999
Publication title - Biometrical Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.108
H-Index - 63
eISSN - 1521-4036
pISSN - 0323-3847
DOI - 10.1002/(sici)1521-4036(199910)41:6<697::aid-bimj697>3.0.co;2-t
Subject(s) - statistics, estimator, statistic, mathematics, correlation, cluster sampling, sampling (signal processing), press statistic, mean squared error, variance (accounting), ancillary statistic, regression analysis, f test, computer science, demography, geometry, filter (signal processing), accounting, sociology, business, computer vision, population
The R² statistic is widely used in regression analysis to estimate the squared multiple correlation (SMC), ρ², between a response variable Y and p predictor variables X₁, …, Xₚ. Numerous articles in the statistical literature address the properties of R² as an estimator of ρ² when the observations are uncorrelated. However, relatively little is known about the behavior of R² when the observations are correlated, as in data that result from complex sampling schemes. In this paper, we study the behavior of R² in the presence of two‐stage sampling data. Approximate expressions for the variance and the bias of R² under two‐stage cluster sampling with positive intracluster correlation (ρ*) are obtained. These formulas and a simulation study show that R² is a poor estimator of ρ² except when ρ* is small. We therefore consider several alternative estimators of ρ² and evaluate their theoretical properties and finite‐sample performance using a simulation study.
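
The kind of simulation the abstract mentions can be illustrated with a minimal sketch. The abstract does not specify the paper's simulation design, so everything here is an assumption: a single predictor (p = 1), unit variances, normal data, and a shared cluster-level component that gives both the predictor and the error an intracluster correlation of ρ*. The function names `sample_r2` and `monte_carlo` are hypothetical and are not taken from the paper.

```python
import math
import random


def sample_r2(n_clusters, cluster_size, rho2, rho_star, rng):
    """Return the sample R^2 from one simulated two-stage cluster sample.

    Hypothetical model (not taken from the paper): one predictor X with unit
    variance, Y = beta * X + error, and both X and the error share a
    cluster-level component so that their intracluster correlation is rho_star.
    """
    beta = math.sqrt(rho2)              # Var(X) = Var(Y) = 1, so corr(X, Y)^2 = rho2
    err_sd = math.sqrt(1.0 - rho2)
    between = math.sqrt(rho_star)       # weight on the shared cluster effect
    within = math.sqrt(1.0 - rho_star)  # weight on the unit-level effect

    xs, ys = [], []
    for _ in range(n_clusters):         # stage 1: draw clusters
        cluster_x = rng.gauss(0.0, 1.0)
        cluster_e = rng.gauss(0.0, 1.0)
        for _ in range(cluster_size):   # stage 2: draw units within a cluster
            x = between * cluster_x + within * rng.gauss(0.0, 1.0)
            e = between * cluster_e + within * rng.gauss(0.0, 1.0)
            xs.append(x)
            ys.append(beta * x + err_sd * e)

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # With a single predictor, R^2 is the squared sample correlation.
    return sxy * sxy / (sxx * syy)


def monte_carlo(reps, n_clusters, cluster_size, rho2, rho_star, rng):
    """Return (mean, variance) of R^2 over `reps` simulated samples."""
    values = [sample_r2(n_clusters, cluster_size, rho2, rho_star, rng)
              for _ in range(reps)]
    mean = sum(values) / reps
    var = sum((v - mean) ** 2 for v in values) / (reps - 1)
    return mean, var


if __name__ == "__main__":
    rng = random.Random(1999)  # fixed seed so runs are repeatable
    for rho_star in (0.0, 0.2, 0.5, 0.8):
        mean, var = monte_carlo(500, 20, 10, 0.5, rho_star, rng)
        print(f"rho*={rho_star:.1f}  mean R^2={mean:.3f}  "
              f"bias={mean - 0.5:+.3f}  var={var:.4f}")
```

Running the script prints the Monte Carlo mean, bias, and variance of R² for several values of ρ* with ρ² fixed at 0.5. Under this assumed model, comparing the rows gives a rough sense of how clustering affects R² as an estimator of ρ²; it is not a reproduction of the paper's results or of its alternative estimators.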
