
PS1421 HONEUR (HAEMATOLOGY OUTCOMES NETWORK IN EUROPE) – DISTRIBUTED STATISTICS IN A FEDERATED MODEL TO SUPPORT REAL WORLD DATA RESEARCH IN HEMATOLOGY
Author(s) -
Passey A.,
Perualilia N.J.,
Bardenheuer K.,
Verbeke T.,
Nassiri V.,
Van Speybroeck M.
Publication year - 2019
Publication title -
HemaSphere
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.677
H-Index - 11
ISSN - 2572-9241
DOI - 10.1097/01.hs9.0000563960.92802.d6
Subject(s) - medicine , cohort , multiple myeloma , lenalidomide , tolerability , clinical endpoint , adverse effect , incidence (geometry) , bortezomib , clinical trial , oncology , physics , optics
Background: The Haematology Outcomes Network in EURope (HONEUR) is an interdisciplinary initiative that aims to improve patient outcomes by analyzing real world data across hematological centers in Europe. Its overarching goal is to create a secure network that fosters a collaborative research community and gives access to big data tools for analyzing the data. The central paradigm of the HONEUR network is a federated model in which the data stay at the respective sites and the analysis is executed at the local data sources. To allow for uniform data analysis, the Observational Medical Outcomes Partnership (OMOP) common data model was selected and extended to accommodate hematology-specific data elements. While the federated model addresses the ethico-legal challenges of data pooling, it poses specific challenges for statistical analyses that would ordinarily be run on pooled data.

Aims: To enable the use of privacy-protecting distributed statistics across a federated network of databases.

Methods: To validate the feasibility and accuracy of distributed statistics in the HONEUR network, data from the EMMOS registry (NCT01241396) were used. This registry is a prospective, non-interventional study designed to capture real world data on treatments and outcomes for multiple myeloma at different stages of the disease. Data were collected between Oct 2010 and Nov 2014 on more than 2,400 patients across 266 sites in 22 countries. After mapping the data to the OMOP common data model version 5.3, the three countries contributing the most patients were selected: Germany, Italy and Russia, with 363, 488 and 213 subjects, respectively. The pooled dataset therefore comprised 1064 patients. Overall survival was modeled with age, sex and Durie-Salmon stage as covariates.
Two types of analysis were performed: (1) a traditional Cox regression model stratified by country, fitted on the pooled dataset, and (2) the same Cox regression model distributed by country. The second analysis estimates the model parameters without requiring individual sites to share patient-level data. The distributed model was implemented using a message broker to minimize the impact on the individual sites and is based on the R distcomp package from Narasimhan et al., which calculates the overall likelihood function across study sites and estimates the model parameters.

Results: Using real world haematology patient data, we obtained identical results from the analysis of the pooled dataset and the analysis of the distributed datasets, as shown in Tables 1 and 2, respectively. The hazard ratios, 95% confidence intervals and p-values are identical between the two models for all levels of stage compared with the reference, adjusted for age and sex, indicating that the distributed model generated precisely the same proportional hazard estimates and variance estimates for the variables in the model.

Summary/Conclusion: We compared a standard pooled approach to a comparative outcomes study, using clinically relevant variables, with a distributed analysis using the same parameters in which the analyst has no access to patient-level data. The distributed methodology generated a virtually identical underlying fitted model, with precisely the same hazard ratio estimates, confidence intervals and p-values from hypothesis testing. This indicates that the approach has great potential for more complex comparative effectiveness modelling, with accurate outputs that entirely protect patient privacy in a federated data model.
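
Illustrative sketch (not part of the original abstract): the agreement between the pooled and distributed fits follows from the structure of the country-stratified Cox partial likelihood. Risk sets never cross strata, so the score and information are sums of per-site terms that each site can compute locally. The Python code below is a minimal sketch of that idea under simplifying assumptions: a single binary covariate, Breslow-style risk sets, and a plain Newton-Raphson loop. It does not use or reproduce the distcomp API, it omits the message broker and any security layer, and every name in it (`site_summary`, `distributed_summary`, `pooled_summary`, `newton`) is hypothetical. The records are invented toy values, not EMMOS data.

```python
import math

# Toy records per site: (time, event, x); event = 1 for death, 0 for censored.
# All values are invented for illustration; they are NOT EMMOS data.
SITES = {
    "site_a": [(5, 1, 1), (8, 1, 0), (12, 0, 1), (14, 1, 0), (20, 1, 1), (25, 0, 0)],
    "site_b": [(3, 1, 0), (7, 1, 1), (9, 1, 1), (15, 0, 0), (18, 1, 0), (22, 1, 1)],
    "site_c": [(4, 1, 1), (6, 0, 0), (11, 1, 0), (13, 1, 1), (17, 1, 0)],
}


def risk_set_terms(x_event, risk_xs, beta):
    """Score and information contribution of one event, given its risk set."""
    weights = [math.exp(beta * x) for x in risk_xs]
    s0 = sum(weights)
    mean = sum(x * w for x, w in zip(risk_xs, weights)) / s0
    second = sum(x * x * w for x, w in zip(risk_xs, weights)) / s0
    return x_event - mean, second - mean ** 2


def site_summary(records, beta):
    """Runs locally at one site; only the two aggregate numbers leave the site."""
    score = info = 0.0
    for t_i, event_i, x_i in records:
        if event_i:
            risk_xs = [x for t, _, x in records if t >= t_i]
            u, v = risk_set_terms(x_i, risk_xs, beta)
            score += u
            info += v
    return score, info


def distributed_summary(beta):
    """Coordinator: add up the per-site aggregates for the current beta."""
    parts = [site_summary(records, beta) for records in SITES.values()]
    return sum(p[0] for p in parts), sum(p[1] for p in parts)


# Pooled view: one table with a site column, as a central analyst would see it.
POOLED = [(site, t, e, x) for site, recs in SITES.items() for t, e, x in recs]


def pooled_summary(beta):
    """Stratified fit on pooled rows: risk sets are restricted to the same site."""
    score = info = 0.0
    for site_i, t_i, event_i, x_i in POOLED:
        if event_i:
            risk_xs = [x for s, t, _, x in POOLED if s == site_i and t >= t_i]
            u, v = risk_set_terms(x_i, risk_xs, beta)
            score += u
            info += v
    return score, info


def newton(summary_at, tol=1e-10, max_iter=50):
    """Newton-Raphson on the log hazard ratio; returns (beta, standard error)."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = summary_at(beta)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    _, info = summary_at(beta)  # information at the final estimate
    return beta, 1.0 / math.sqrt(info)


beta_dist, se_dist = newton(distributed_summary)
beta_pool, se_pool = newton(pooled_summary)
print("distributed: HR = %.6f, SE(log HR) = %.6f" % (math.exp(beta_dist), se_dist))
print("pooled:      HR = %.6f, SE(log HR) = %.6f" % (math.exp(beta_pool), se_pool))
```

In this sketch each site returns only two numbers per iteration, and the coordinator never sees a patient record. The two fits agree up to floating-point rounding because they maximize the same function. A realistic implementation would handle several covariates (a score vector and information matrix instead of scalars) and would need to consider what the shared aggregates might reveal for very small sites.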