Open Access
Head and Tail Entity Fusion Model in Medical Knowledge Graph Construction: Case Study in Pituitary Adenoma (Preprint)
Author(s) - An Fang, Pei Lou, Jiahui Hu, Wanqing Zhao, Ming Feng, Huiling Ren, Xianlai Chen
Publication year - 2021
Publication title - JMIR Medical Informatics
Language(s) - English
Resource type - Journals
ISSN - 2291-9694
DOI - 10.2196/28218
Subject(s) - computer science, pituitary adenoma, artificial intelligence, information retrieval, machine learning, natural language processing, adenoma, data mining, medicine, pathology
Background: Pituitary adenoma is one of the most common central nervous system tumors. Its diagnosis and treatment remain very difficult: misdiagnosis and recurrence often occur, and there is a serious shortage of experienced neurosurgeons. A knowledge graph can help interns quickly understand the medical knowledge related to pituitary tumors.

Objective: The aim of this study was to develop a data fusion method suitable for medical data, using data on pituitary adenomas integrated from different sources. The overall goal was to construct a knowledge graph for pituitary adenoma (KGPA) to be used for knowledge discovery.

Methods: A complete framework suitable for the construction of a medical knowledge graph was developed and used to build the KGPA. The schema of the KGPA was manually constructed. Information on pituitary adenoma was automatically extracted from Chinese electronic medical records (CEMRs) and medical websites through a conditional random field model and newly designed web wrappers. An entity fusion method based on the head-and-tail entity fusion model was proposed to fuse the data from heterogeneous sources.

Results: Data were extracted from 300 CEMRs of pituitary adenoma and 4 health portals. Entity fusion was carried out using the proposed data fusion model. The F1 scores of the head and tail entity fusions were 97.32% and 98.57%, respectively. Triples from the constructed KGPA were selected for evaluation, demonstrating 95.4% accuracy.

Conclusions: This paper introduces an approach to fuse triples extracted from heterogeneous data sources, which can be used to build a knowledge graph. The evaluation results showed that the data in the KGPA are of high quality. The constructed KGPA can help physicians in clinical practice.
