Estimating person parameters via item response model and simple sum score in small samples with few polytomous items: A simulation study
Author(s) - Schwall Philipp, Meesters Christian, Hardt Jochen
Publication year - 2019
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.8280
Subject(s) - item response theory, skewness, polytomous rasch model, skew, statistics, homogeneous, econometrics, scale (ratio), mathematics, computer science, psychometrics, telecommunications, physics, combinatorics, quantum mechanics
Background: Item response theory (IRT) is becoming increasingly popular for item analysis. Theoretical considerations and simulation studies suggest that parameter estimates become precise only when many items are used in large samples. Method: A simulation study focusing on a single scale was performed on data with (a) n = 40, 60, 80, 120, 200, 300, 500, and 900 cases and (b) 4, 8, 16, or 32 items. The items were (c) symmetrically distributed vs. skewed (skewness 0, 1, and 2). Item loadings were (d) homogeneous vs. heterogeneous and (e) low vs. high. Half of the items (f) had a correlated error or not. The number of response categories (g) was four vs. five. For each item, 10% of the values were missing. The ability estimates from the IRT model and the simple sum score served as criteria for evaluating the results. Results: The ability estimate from the IRT model outperformed the sum score when there were many items, the items were skewed, and the item loadings were heterogeneous and high. The sum score outperformed the ability estimate when there were few items, the items were not skewed, and the item loadings were homogeneous and low. However, convergence rates were sometimes low in small samples. Correlated errors negatively affected both the ability estimate and the sum score. Conclusion: With skewed item distributions and heterogeneous item loadings, an IRT model is recommended. However, few items require many cases and, conversely, few cases require many items. With few items and few cases, the sum score performs better.
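To make one cell of the simulation design concrete, the following Python sketch generates polytomous items for a single scale and checks how well a simple sum score recovers the simulated ability. The abstract does not state the data-generating model, so everything here is an assumption made for illustration: the function names, the normal latent trait, the loading of 0.7, the equally spaced thresholds (which give roughly symmetric items), values missing completely at random, and a sum score prorated over the observed items. It covers only the sum-score side of the comparison and does not fit an IRT model.

```python
import numpy as np


def simulate_sum_score(n_cases=40, n_items=4, loading=0.7, n_categories=4,
                       missing_rate=0.10, seed=0):
    """Simulate one design cell (all modelling choices are assumptions)."""
    rng = np.random.default_rng(seed)

    # Latent ability of each person (standard normal by assumption).
    theta = rng.standard_normal(n_cases)

    # Continuous latent response per item with unit variance:
    # loading * ability + unique error.
    noise = rng.standard_normal((n_cases, n_items))
    latent = loading * theta[:, None] + np.sqrt(1.0 - loading ** 2) * noise

    # Cut the latent response into ordered categories 0..n_categories-1.
    # Equally spaced thresholds around zero yield roughly symmetric items.
    thresholds = np.linspace(-1.0, 1.0, n_categories - 1)
    items = np.digitize(latent, thresholds).astype(float)

    # Delete values completely at random (10% per item in expectation).
    mask = rng.random((n_cases, n_items)) < missing_rate
    items[mask] = np.nan

    # Prorated sum score: mean of the observed items times the item count.
    # Persons with no observed item get NaN.
    n_observed = (~mask).sum(axis=1)
    totals = np.nansum(items, axis=1)
    score = np.where(n_observed > 0,
                     totals / np.maximum(n_observed, 1) * n_items,
                     np.nan)
    return theta, items, score


def recovery_correlation(theta, score):
    """Pearson correlation between true ability and the sum score."""
    valid = ~np.isnan(score)
    return np.corrcoef(theta[valid], score[valid])[0, 1]


if __name__ == "__main__":
    # Smallest and largest cells of the design named in the abstract.
    for n_cases, n_items in [(40, 4), (900, 32)]:
        theta, items, score = simulate_sum_score(n_cases, n_items)
        r = recovery_correlation(theta, score)
        print(f"n={n_cases}, items={n_items}: r(theta, sum score)={r:.3f}")
```

Looping this function over the sample sizes, item counts, and category counts listed in the Method section would reproduce the outer structure of such a study. Skewed items, heterogeneous loadings, and correlated errors would each need further assumptions about thresholds and error structure that the abstract does not specify.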