Drug Disease Relation Extraction from Biomedical Literature Using NLP and Machine Learning
Author(s) - Wahiba Ben Abdessalem Karâa, Eman H. Alkhammash, Aida Bchir
Publication year - 2021
Publication title - Mobile Information Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.346
H-Index - 34
eISSN - 1875-905X
pISSN - 1574-017X
DOI - 10.1155/2021/9958410
Subject(s) - unified medical language system , computer science , relationship extraction , natural language processing , artificial intelligence , support vector machine , classifier (uml) , information extraction , ontology , biomedical text mining , information retrieval , relation (database) , machine learning , field (mathematics) , text mining , data mining , philosophy , mathematics , epistemology , pure mathematics
Extracting the relations between medical concepts is very valuable in the medical domain. Scientists need to extract relevant information and semantic relations between medical concepts, including protein and protein, gene and protein, drug and drug, and drug and disease. These relations can be extracted from biomedical literature available in various databases. This study examines the extraction of semantic relations that can occur between diseases and drugs. The findings will help specialists make good decisions when administering a medication to a patient and will allow them to stay up to date in their field. The objective of this work is to identify different features related to drugs and diseases from medical texts by applying Natural Language Processing (NLP) techniques and the UMLS ontology. A Support Vector Machine classifier uses these features to extract valuable semantic relationships among text entities. The contribution of this research is the combination of a suggested NLP technique, which takes advantage of the UMLS ontology and enables the extraction of correct and adequate features (frequency features, lexical features, morphological features, syntactic features, and semantic features), with Support Vector Machines using a polynomial kernel function. These features are used to pinpoint the relations between drug and disease. The proposed approach was evaluated using a standard corpus extracted from MEDLINE. The results considerably improve performance and outperform similar works, especially the F-score for the most important relation, "cure," which is equal to 98.19%. The accuracy is better than that of all the existing works for all the relations.
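As a rough illustration of the polynomial kernel function mentioned in the abstract, the sketch below computes K(x, y) = (gamma * <x, y> + coef0)^degree for two feature vectors. The abstract does not give the kernel degree, the other hyperparameters, or the exact feature encoding, so the defaults and the toy vectors here are assumptions chosen only for demonstration, not the authors' settings.

```python
from typing import Sequence


def polynomial_kernel(
    x: Sequence[float],
    y: Sequence[float],
    degree: int = 2,      # assumed value; not stated in the abstract
    gamma: float = 1.0,   # assumed value; not stated in the abstract
    coef0: float = 1.0,   # assumed value; not stated in the abstract
) -> float:
    """Return (gamma * <x, y> + coef0) ** degree for two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


# Hypothetical numeric feature vectors for two candidate drug-disease sentences
# (for example, counts derived from frequency or lexical features).
sentence_a = [1.0, 0.0, 2.0]
sentence_b = [0.0, 1.0, 1.0]

similarity = polynomial_kernel(sentence_a, sentence_b)
print(similarity)
```

An SVM with this kernel compares sentences through such similarity values rather than through the raw feature vectors, which lets it separate relation classes that are not linearly separable in the original feature space.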