
Optimizing Count Responses in Surveys: A Machine-learning Approach
Author(s) -
Qiang Fu,
Xin Guo,
Kenneth C. Land
Publication year - 2018
Publication title -
Sociological Methods & Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.468
H-Index - 76
eISSN - 1552-8294
pISSN - 0049-1241
DOI - 10.1177/0049124117747302
Subject(s) - censoring (clinical trials), count data, Poisson distribution, computer science, Bayesian probability, multinomial distribution, statistics, machine learning, artificial intelligence, mathematics
Count responses with grouping and right censoring have long been used in surveys to study a variety of behaviors, statuses, and attitudes. Yet decisions about how to group or right-censor count responses still rely on arbitrary choices made by researchers. We develop a new method for evaluating these grouping and right-censoring decisions from a (semisupervised) machine-learning perspective. This article uses Poisson-multinomial mixture models to conceptualize the data-generating process of count responses with grouping and right censoring and demonstrates the link between grouping-scheme choices and asymptotic distributions of the Poisson mixture. To search for the optimal grouping scheme, that is, the one maximizing objective functions of the Fisher information (matrix), an innovative three-step M algorithm is then proposed to process infinitely many grouping schemes based on Bayesian A-, D-, and E-optimalities. A new R package is developed to implement this algorithm and evaluate grouping schemes of count responses. Results show that an optimal grouping scheme not only leads to a more efficient sampling design but also outperforms a nonoptimal one even if the latter has more groups.
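
To make the link between a grouping scheme and the Fisher information concrete, the following Python sketch may help. It is not the article's three-step algorithm or its R package. It assumes a single Poisson rate rather than a Poisson mixture, so the information is a scalar and only a D-type criterion (the prior-averaged log information) is shown. The function names, the discrete prior, and the exhaustive search over a small set of cut points are all illustrative assumptions.

```python
import math
from itertools import combinations


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam); taken to be 0 for k < 0."""
    if k < 0:
        return 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def fisher_info(lam, cuts, tail_terms=200):
    """Per-respondent Fisher information about lam under a grouping scheme.

    cuts holds the increasing lower bounds of the groups, starting at 0.
    Group j covers cuts[j] .. cuts[j + 1] - 1, and the last group is
    right-censored ("cuts[-1] or more"). The censored cell probability is
    approximated by summing tail_terms terms, which assumes lam is small
    relative to tail_terms.
    """
    info = 0.0
    for j, lo in enumerate(cuts):
        last = j == len(cuts) - 1
        hi = lo + tail_terms if last else cuts[j + 1]
        p = sum(poisson_pmf(k, lam) for k in range(lo, hi))
        # d/d(lam) P(X = k) = pmf(k - 1) - pmf(k), so the cell sum telescopes.
        dp = poisson_pmf(lo - 1, lam)
        if not last:
            dp -= poisson_pmf(hi - 1, lam)
        if p > 0.0:
            info += dp * dp / p  # multinomial information: sum of dp^2 / p
    return info


def expected_log_info(cuts, prior):
    """D-type Bayesian criterion: log information averaged over a prior.

    prior is a hypothetical discrete prior given as {lam: weight}.
    """
    return sum(w * math.log(fisher_info(lam, cuts)) for lam, w in prior.items())


def best_scheme(n_groups, max_cut, prior):
    """Exhaustive search over schemes with n_groups >= 2 and cuts <= max_cut."""
    candidates = (
        [0, *inner]
        for inner in combinations(range(1, max_cut + 1), n_groups - 1)
    )
    return max(candidates, key=lambda cuts: expected_log_info(cuts, prior))


# Hypothetical prior over the Poisson rate, chosen only for illustration.
prior = {1.0: 0.25, 3.0: 0.5, 6.0: 0.25}
best3 = best_scheme(3, 12, prior)
print("best 3-group scheme (lower bounds):", best3)
print("criterion, best 3 groups:", expected_log_info(best3, prior))
print("criterion, naive 4 groups [0, 1, 2, 3]:", expected_log_info([0, 1, 2, 3], prior))
```

Two properties of this toy setting are worth noting. Ungrouped counts carry information 1/lam about the rate, and any grouping can only lose information relative to that bound. Refining a scheme never lowers the information, so a comparison between schemes is only interesting when the number of groups is constrained, which is the situation the abstract describes.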