Poststratification fusion learning in longitudinal data analysis
Author(s) - Tang Lu, Song Peter X.K.
Publication year - 2021
Publication title - Biometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.298
H-Index - 130
eISSN - 1541-0420
pISSN - 0006-341X
DOI - 10.1111/biom.13333
Subject(s) - computer science , stratification (seeds) , causal inference , attrition , inference , machine learning , statistical inference , longitudinal data , sample size determination , econometrics , missing data , artificial intelligence , statistical power , data mining , statistics , mathematics , medicine , seed dormancy , botany , germination , dentistry , dormancy , biology
Stratification is a widely used approach in biomedical studies for handling sample heterogeneity arising from, for example, clinical units, patient subgroups, or missing data. A key rationale behind this approach is to overcome potential sampling biases in statistical inference. Two issues with such a stratification-based strategy are (i) whether individual strata are sufficiently distinct to warrant stratification, and (ii) the sample size attrition resulting from stratification, which may lead to a loss of statistical power. To address these issues, we propose a penalized generalized estimating equations approach that reduces the complexity of parametric model structures caused by excessive stratification. Specifically, we develop a data-driven fusion learning approach for longitudinal data that improves estimation efficiency by integrating information across similar strata, yet still allows the separation needed for stratum-specific conclusions. The proposed method is evaluated in simulation studies and applied to a motivating example from a psychiatric study to demonstrate its usefulness in real-world settings.
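The fusion idea in the abstract can be illustrated with a toy sketch. It is not the authors' penalized generalized estimating equations estimator: it only shows an L1 penalty on pairwise differences between stratum-specific coefficients, plus a simple threshold rule that merges strata with similar estimates. The function names (`fusion_penalty`, `fuse_groups`), the tolerance `tol`, and the coefficient values are all hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def fusion_penalty(betas, lam):
    """L1 fusion penalty: lam times the sum of |beta_k - beta_k'| over all stratum pairs."""
    return lam * sum(abs(a - b) for a, b in combinations(betas, 2))


def fuse_groups(betas, tol):
    """Group strata whose sorted estimates lie within tol of their sorted neighbour.

    This thresholding rule is an assumption made for illustration; a penalized
    estimator would obtain the grouping by optimizing a penalized objective.
    """
    order = sorted(range(len(betas)), key=lambda k: betas[k])
    groups = [[order[0]]]
    for prev, cur in zip(order, order[1:]):
        if betas[cur] - betas[prev] <= tol:
            groups[-1].append(cur)  # close to the previous stratum: same group
        else:
            groups.append([cur])    # clear gap: start a new group
    return groups


# Hypothetical stratum-specific slope estimates (illustrative numbers only).
betas = [0.50, 0.52, 1.20, 0.49]
groups = fuse_groups(betas, tol=0.05)

# Replace each estimate by its group mean, so that similar strata share one value.
fused = list(betas)
for g in groups:
    mean = sum(betas[k] for k in g) / len(g)
    for k in g:
        fused[k] = mean
```

In this toy setting, strata 0, 1, and 3 end up sharing one coefficient while stratum 2 stays separate, which mirrors the trade-off described above: information is pooled across similar strata, and a clearly different stratum keeps its own estimate.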