For testing the significance of regression coefficients, go ahead and log‐transform count data


Author(s) - Ives Anthony R.

Publication year - 2015

Publication title - Methods in Ecology and Evolution

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.425

H-Index - 105

ISSN - 2041-210X

DOI - 10.1111/2041-210x.12386

Subject(s) - generalized linear model, statistics, mathematics, count data, negative binomial distribution, Poisson distribution, estimator, linear regression, quasi-likelihood, sample size determination, regression analysis, linear model

Summary

The rise in the use of statistical models for non‐Gaussian data, such as generalized linear models (GLMs) and generalized linear mixed models (GLMMs), is pushing aside the traditional approach of transforming data and applying least‐squares linear models (LMs). Nonetheless, many least‐squares statistical tests depend on the variance of the sum of residuals, which by the Central Limit Theorem converges to a Gaussian distribution for large sample sizes. Therefore, least‐squares LMs will likely have good performance in assessing the statistical significance of regression coefficients.

Using simulations of count data, I compared GLM approaches for testing whether regression coefficients differ from zero with the traditional approach of applying LMs to transformed data. Simulations assumed that variation among sample populations was either (i) negative binomial or (ii) log‐normal Poisson (i.e. log‐normal variation among populations that were then sampled by a Poisson distribution). I used the simulated data to conduct tests of the hypotheses that regression coefficients differed from zero; I did not investigate statistical properties of the coefficient estimators, such as bias and precision.

For negative binomial simulations whose assumptions closely matched the GLMs, the GLMs were nonetheless prone to type I errors (false positives), especially when there was more than one predictor (independent) variable. After correcting for type I errors, however, the GLMs provided slightly better statistical power than LMs. For log‐normal‐Poisson simulations, both a GLMM and the LMs performed well, but under some simulated conditions the GLMs had high type I error rates, a deadly sin for statistical tests.

These results show that, while GLMs have slight advantages in power when they are properly specified, they can lead to badly wrong conclusions about the significance of regression coefficients if they are mis‐specified. In contrast, transforming data and applying least‐squares linear analyses provide robust statistical tests for significance over a wide range of conditions. Thus, the traditional approach of transforming data and applying LMs is still useful.
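As a rough illustration of the kind of type I error check the summary describes, the sketch below simulates negative binomial counts that are unrelated to a predictor, fits a least‐squares line to the transformed counts, and records how often the slope is declared significant at the 5% level. This is not the author's code: the log(y + 1) transform, the sample size of 100, the negative binomial mean and shape, and all function names are assumptions chosen for illustration, and only the LM side of the comparison is shown.

```python
import math
import random

N = 100            # assumed sample size per simulated data set
T_CRIT_98 = 1.984  # two-sided 5% critical value of Student's t with N - 2 = 98 df


def rpois(lam, rng):
    """Draw one Poisson variate with Knuth's method (adequate for modest lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def rnegbin(mean, shape, rng):
    """Negative binomial count as a gamma mixture of Poisson distributions."""
    lam = rng.gammavariate(shape, mean / shape)  # gamma with the requested mean
    return rpois(lam, rng)


def slope_t_stat(x, y):
    """t statistic for the slope of a simple least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope / se


def type_one_error_rate(n_sims=1000, mean=5.0, shape=1.0, seed=1):
    """Fraction of null data sets in which the LM slope test rejects at 5%."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        x = [rng.gauss(0.0, 1.0) for _ in range(N)]
        # Counts are generated independently of x, so the true slope is zero.
        y = [math.log(rnegbin(mean, shape, rng) + 1) for _ in range(N)]
        if abs(slope_t_stat(x, y)) > T_CRIT_98:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    print("Estimated type I error rate:", type_one_error_rate())
```

A rejection fraction close to the nominal 0.05 is what a well-behaved test should show under the null. Extending the sketch to the GLM side of the comparison would need a fitting routine for the chosen GLM, which is omitted here.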
