Synthesizing external aggregated information in the penalized Cox regression under population heterogeneity
Author(s) -
Sheng Ying,
Sun Yifei,
Huang ChiungYu,
Kim MiOk
Publication year - 2021
Publication title -
Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.9101
Subject(s) - estimator , computer science , flexibility (engineering) , estimating equations , empirical likelihood , population , econometrics , aggregate (composite) , variance (accounting) , oracle , statistics , mathematics , materials science , demography , accounting , sociology , business , composite material , software engineering
Synthesizing external aggregated information has been proven useful in improving estimation efficiency when conducting statistical analysis using a limited amount of data. In this paper, we develop a unified framework for combining information from high‐dimensional individual‐level data and potentially low‐dimensional external aggregate data under the Cox model. We summarize various forms of external aggregated information by population estimating equations and propose a penalized empirical likelihood approach to borrow information from these estimating equations. The proposed methods possess the flexibility to handle the case where individual‐level data and external aggregate data are from heterogeneous populations. Specifically, a penalized empirical likelihood ratio test is developed to check for the potential heterogeneity, and a semiparametric density ratio model is postulated to account for the heterogeneity. Moreover, we study the impact of uncertainty in the auxiliary information on the efficiency gain and propose a modified variance estimator to adjust for the uncertainty. The proposed estimators enjoy the oracle property and are asymptotically more efficient than the penalized partial likelihood estimator that does not exploit the external aggregated information. Simulation studies show improvement in both estimation efficiency and variable selection over the competitors. The proposed approaches are applied to the analysis of a pediatric kidney transplant study for illustration.
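As a rough illustration of how empirical likelihood borrows information from an estimating equation, the sketch below treats a toy scalar case. It is not the estimator proposed in the paper: there is no Cox model, no penalty, and no density ratio adjustment. The data `x`, the external mean `mu0`, and the helper `el_weights` are hypothetical names introduced only for this example. The external aggregate is assumed to be a known population mean, encoded as the estimating function g_i = x_i - mu0, and the weights p_i maximize sum(log p_i) subject to sum(p_i) = 1 and sum(p_i * g_i) = 0.

```python
def el_weights(g, tol=1e-12, max_iter=200):
    """Empirical likelihood weights for a scalar estimating function.

    Maximizes sum(log p_i) subject to sum(p_i) = 1 and sum(p_i * g_i) = 0.
    The solution is p_i = 1 / (n * (1 + lam * g_i)), where the Lagrange
    multiplier lam solves sum(g_i / (1 + lam * g_i)) = 0.
    """
    n = len(g)
    g_min, g_max = min(g), max(g)
    if not (g_min < 0 < g_max):
        raise ValueError("0 must lie strictly inside the range of g")

    # Feasible multipliers keep every 1 + lam * g_i strictly positive.
    lo = -1.0 / g_max
    hi = -1.0 / g_min
    eps = 1e-10 * (hi - lo)
    lo, hi = lo + eps, hi - eps

    def score(lam):
        return sum(gi / (1.0 + lam * gi) for gi in g)

    # score is strictly decreasing in lam on (lo, hi), so bisection applies.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break

    lam = 0.5 * (lo + hi)
    return [1.0 / (n * (1.0 + lam * gi)) for gi in g]


# Hypothetical individual-level sample and external aggregate (assumed values).
x = [2.1, 3.4, 1.8, 4.0, 2.9, 3.6]
mu0 = 3.2  # externally reported population mean (assumption for this sketch)

g = [xi - mu0 for xi in x]
p = el_weights(g)
weighted_mean = sum(pi * xi for pi, xi in zip(p, x))
```

The weights tilt the sample so that its weighted mean matches the external value. In a full analysis these weights would enter the likelihood for the parameters of interest, which is where an efficiency gain can arise.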
