Natural Language Processing Algorithms for Normalizing Expressions of Synonymous Symptoms in Traditional Chinese Medicine
Author(s) - Lu Zhou, Shuangqiao Liu, Caiyan Li, Yuemeng Sun, Yizhuo Zhang, Yuda Li, Huimin Yuan, Yan Sun, Fengqin Xu, Yuhang Li
Publication year - 2021
Publication title - Evidence-Based Complementary and Alternative Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.552
H-Index - 90
eISSN - 1741-4288
pISSN - 1741-427X
DOI - 10.1155/2021/6676607
Subject(s) - computer science, artificial intelligence, normalization (sociology), language model, natural language processing, sigmoid function, recall, artificial neural network, linguistics, philosophy, sociology, anthropology
Background: The modernization of traditional Chinese medicine (TCM) demands systematic data mining of medical records. However, this process is hindered by the fact that many TCM symptoms have the same meaning but different literal expressions (i.e., TCM synonymous symptoms). This problem can be addressed by using natural language processing algorithms to construct a high-quality TCM symptom normalization model that maps TCM synonymous symptoms to unified literal expressions.

Methods: Four types of TCM symptom normalization models, all based on natural language processing, were constructed to find a high-quality one: (1) a text sequence generation model based on a bidirectional long short-term memory (Bi-LSTM) neural network with an encoder-decoder structure; (2) a text classification model based on a Bi-LSTM neural network and a sigmoid function; (3) a text sequence generation model based on bidirectional encoder representations from transformers (BERT) with the sequence-to-sequence training method of the unified language model (BERT-UniLM); (4) a text classification model based on BERT and a sigmoid function (BERT-Classification). The performance of the models was compared using four metrics: accuracy, recall, precision, and F1-score.

Results: The BERT-Classification model outperformed the models based on Bi-LSTM and the BERT-UniLM model on all four metrics.

Conclusions: The BERT-Classification model has superior performance in normalizing expressions of TCM synonymous symptoms.
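The sigmoid-based classification step and the four evaluation metrics named in the Methods can be illustrated with a small sketch. The abstract does not state how the metrics were averaged or which decision threshold was used, so everything below is an assumption made for illustration: a threshold of 0.5 on the sigmoid output, micro-averaged precision, recall, and F1-score, and exact-match accuracy per record. The label names, logits, and function names are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    """Logistic function mapping a raw score (logit) to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, labels, threshold=0.5):
    """Return the set of labels whose sigmoid score reaches the threshold.

    The 0.5 threshold is an assumption; the abstract does not specify one.
    """
    return {lab for lab, z in zip(labels, logits) if sigmoid(z) >= threshold}


def evaluate(predicted, gold):
    """Micro-averaged precision, recall, F1 and exact-match accuracy.

    `predicted` and `gold` are parallel lists of label sets, one per record.
    """
    tp = fp = fn = exact = 0
    for p, g in zip(predicted, gold):
        tp += len(p & g)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but not in the gold set
        fn += len(g - p)   # gold labels the model missed
        exact += (p == g)  # record counts as accurate only if sets match
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = exact / len(gold) if gold else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical normalized symptom labels and model outputs for two records.
LABELS = ["insomnia", "poor appetite", "fatigue"]
logits_per_record = [[2.0, -1.0, 0.5], [-3.0, 1.5, -0.2]]
gold = [{"insomnia"}, {"poor appetite"}]

predicted = [predict_labels(z, LABELS) for z in logits_per_record]
metrics = evaluate(predicted, gold)
print(predicted)
print(metrics)
```

In this toy input the first record receives one spurious label, so precision falls below recall and only one of the two records counts toward exact-match accuracy. A different averaging scheme (for example, macro-averaging over labels) would give different numbers.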