Open Access
How to Improve the Reuse of Clinical Data: openEHR and OMOP CDM
Author(s) -
Bei Li,
Rich Tsui
Publication year - 2020
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1624/3/032041
Subject(s) - reuse , computer science , data exchange , software , process (computing) , data science , software engineering , data modeling , big data , database , systems engineering , data mining , engineering , operating system , programming language , waste management
All medical big data reuse projects face the challenging problem of collecting and transforming heterogeneous data from different sources in a distributed research network. Both openEHR and OMOP CDM are open-source tools for medical data. In this paper, we compare and discuss the principles, feasibility, implementation, and characteristics of these two main methods for the secondary use of clinical data. We analyzed two data conversion frameworks used in medical data secondary-use projects conducted in China and the United States, summarized our experience in designing the data ETL process, and compared, with reference to the literature, the principles, implementation, and characteristics of an openEHR-based data acquisition system and an approach to reusing medical data based on a Common Data Model (CDM). openEHR, from the Scandinavian countries, is a promising two-level modeling approach for extracting data from various medical databases. It separates the work of medical experts from that of software engineers, so changes in medical knowledge can be embedded in new archetypes without affecting the EHR system. However, some shortcomings overshadow its advantages, such as poor compatibility with medical data other than EHR data, difficulty in defining archetypes, a steep learning curve, and the lack of mature development tools and guidelines. We adopted a minimalist data transformation model in the openEHR-based Xiangya medical big data acquisition system to solve the large-scale data exchange problem faced by the distributed clinical data center. Many experimental projects have demonstrated the feasibility and utility of OMOP CDM for multiple, disparate health databases.
OMOP CDM is widely used as the model framework for patient-level prediction and safety surveillance because of several advantages: the transformation of source data into a standard vocabulary, which addresses semantic interoperability; technology neutrality, with no reliance on any particular computing technology; an open community with open resources and free tools; and the ability to generate aggregated analysis results directly from de-identified data. Some issues should be considered in its use. Not all source data encodings can be converted to the standard vocabulary, so some semantics are lost, and concept matching requires considerable time and effort. The model and vocabulary were originally designed for pharmaceutical safety research and clinical observational data, while the development of vocabularies in other fields is limited. In conclusion, both openEHR and CDMs are designed for exporting and reusing data from distributed clinical databases. The former is suitable for collecting data from distributed EHR systems and building medical big data warehouses, while the latter is a better model for sharing data among decentralized medical databases.
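
The vocabulary-mapping step and its limitation can be illustrated with a minimal Python sketch. It is not the ETL process used in the paper. The lookup table `SOURCE_TO_STANDARD`, the local vocabulary name `LOCAL_DX`, the codes, and the concept IDs are hypothetical placeholders; only the field names `condition_concept_id` and `condition_source_value`, and the convention of assigning concept ID 0 to unmapped codes, follow OMOP CDM.

```python
from dataclasses import dataclass

# Hypothetical lookup from (source vocabulary, source code) to a standard
# concept ID. All codes and IDs are placeholders, not real vocabulary entries.
SOURCE_TO_STANDARD = {
    ("LOCAL_DX", "D001"): 9000001,
    ("LOCAL_DX", "D002"): 9000002,
}

# OMOP convention: concept ID 0 marks a source code with no standard match.
UNMAPPED_CONCEPT_ID = 0


@dataclass
class ConditionRow:
    person_id: int
    condition_concept_id: int
    condition_source_value: str  # original code kept so semantics stay traceable


def to_condition_row(person_id: int, vocabulary: str, code: str) -> ConditionRow:
    """Map one source diagnosis to a row, falling back to concept ID 0."""
    concept_id = SOURCE_TO_STANDARD.get((vocabulary, code), UNMAPPED_CONCEPT_ID)
    return ConditionRow(person_id, concept_id, code)


def mapping_coverage(rows: list) -> float:
    """Fraction of rows that received a standard concept."""
    if not rows:
        return 0.0
    mapped = sum(1 for r in rows if r.condition_concept_id != UNMAPPED_CONCEPT_ID)
    return mapped / len(rows)


# Three hypothetical source records; the last code has no standard equivalent.
source_records = [
    (1, "LOCAL_DX", "D001"),
    (2, "LOCAL_DX", "D002"),
    (3, "LOCAL_DX", "D999"),
]
rows = [to_condition_row(*record) for record in source_records]
print(rows[2])
print(f"coverage: {mapping_coverage(rows):.2f}")
```

Keeping the source value alongside the standard concept is what allows unmapped records to be reviewed later, and the coverage figure gives a rough measure of how much semantic content the conversion preserves.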