Generating data sets for teaching the importance of regression analysis
Author(s) - Murray Lori L., Wilson John G.
Publication year - 2021
Publication title - Decision Sciences Journal of Innovative Education
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.52
H-Index - 19
eISSN - 1540-4609
pISSN - 1540-4595
DOI - 10.1111/dsji.12233
Subject(s) - computer science, descriptive statistics, regression analysis, statistical inference, linear regression, statistics, multivariate statistics, statistical analysis, simple linear regression, data mining, machine learning, mathematics
Summary statistics and data visualizations are often used to explore data and draw preliminary conclusions. Although valuable, these tools do not always reveal the underlying patterns and trends in the data and can sometimes be misleading. We describe an approach for teaching the need for more advanced statistical analysis using multiple linear regression. Our approach is based on using a method we developed for generating alternative multivariate data sets where all the variables (both independent and dependent) have the same summary statistics. However, we can deliberately change the statistical significance of one (or more) of the independent variables in the regression to illustrate why it is important to go beyond simple descriptive measures and examine inferential statistics on the inherent relationships in the data. Implementation of this methodology is provided in the R statistical programming language and an add‐in for Excel spreadsheets.
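
The point of the abstract can be illustrated with a minimal sketch. This is not the authors' generation method, whose R and Excel implementations are not reproduced here. It assumes a much simpler stand-in: shuffling the values of one independent variable, which leaves that variable's own summary statistics unchanged while breaking its relationship with the dependent variable. The function name `ols_t_stats`, the simulated coefficients, and the sample size are all hypothetical choices made for illustration.

```python
import numpy as np


def ols_t_stats(X, y):
    """Fit y on X by ordinary least squares (intercept added here).

    Returns the coefficients and their t-statistics, intercept first.
    Hypothetical helper for illustration only.
    """
    n = len(y)
    A = np.column_stack([np.ones(n), X])          # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients
    resid = y - A @ beta
    dof = n - A.shape[1]                          # residual degrees of freedom
    sigma2 = resid @ resid / dof                  # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)         # coefficient covariance matrix
    return beta, beta / np.sqrt(np.diag(cov))


# Simulated data (assumed for illustration, not taken from the paper).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 1.0 * x2 + rng.normal(size=n)

# Data set A: x2 paired with y as generated.
X_a = np.column_stack([x1, x2])

# Data set B: the same x2 values in shuffled order. Every univariate
# summary of x2 (mean, standard deviation, quantiles) is unchanged,
# but x2 no longer carries information about y.
x2_shuffled = rng.permutation(x2)
X_b = np.column_stack([x1, x2_shuffled])

_, t_a = ols_t_stats(X_a, y)
_, t_b = ols_t_stats(X_b, y)

print("x2 mean / sd in A:", x2.mean(), x2.std(ddof=1))
print("x2 mean / sd in B:", x2_shuffled.mean(), x2_shuffled.std(ddof=1))
print("t-statistic for x2 in A:", t_a[2])
print("t-statistic for x2 in B:", t_b[2])
```

Both data sets report the same descriptive statistics for every variable, yet the regression treats `x2` very differently in each. Unlike the approach described in the abstract, shuffling does not let the significance be set deliberately; it only shows that identical summaries can hide different inferential conclusions.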