
Research on Named Entity Recognition Method Based on Improved LSTM-CRF Model
Author(s) -
Yong Gan,
Dongwei Jia,
Yifan Wang
Publication year - 2021
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/2025/1/012004
Subject(s) - computer science , word2vec , artificial intelligence , natural language processing , polysemy , transformer , vocabulary , character , language model , artificial neural network , natural language , linguistics , embedding
Because computers cannot directly understand raw text corpora in NLP tasks, the first step is to represent natural language numerically, and word-vector techniques provide an effective way to do so. Word2vec is now a popular choice for word embeddings because it takes context into account and produces low-dimensional vectors. However, owing to the particularities of Chinese, Word2vec cannot accurately capture the polysemy of words. This paper adopts a lightweight and effective method that merges vocabulary information into character representations, which avoids designing complex sequence-modeling architectures: for any neural network model, vocabulary information can be introduced simply by adjusting the character input layer. The model also uses a modified LSTM to bridge the gap between the standard LSTM and the Transformer model. The richer interaction between input and context enlarges the modeling space and yields significant improvements on all four public datasets.
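To make the lexicon-merging idea concrete, here is a minimal sketch in the spirit of SoftLexicon-style feature augmentation. It is an illustrative reconstruction, not the paper's actual code: the function name, the toy lexicon, and its 2-dimensional embedding vectors are all hypothetical. For each character, the words in the lexicon that Begin at, pass through the Middle of, End at, or Single-character-match that position are collected, each set is mean-pooled, and the four pooled vectors are concatenated; a model would append this feature to the character embedding before the (modified) LSTM layer.

```python
# Hedged sketch of merging lexicon (word-level) information into character
# inputs, in the style of SoftLexicon B/M/E/S features. All names and the
# toy embeddings below are illustrative assumptions, not the paper's code.
from typing import Dict, List


def lexicon_features(sentence: str,
                     lexicon: Dict[str, List[float]],
                     dim: int) -> List[List[float]]:
    """Per-character concatenation of mean-pooled B/M/E/S lexicon vectors."""
    n = len(sentence)
    # one bucket of matched word vectors per character per position tag
    buckets = [{t: [] for t in "BMES"} for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n + 1):
            vec = lexicon.get(sentence[i:j])
            if vec is None:
                continue
            if j - i == 1:                 # single-character word
                buckets[i]["S"].append(vec)
            else:
                buckets[i]["B"].append(vec)      # word begins here
                buckets[j - 1]["E"].append(vec)  # word ends here
                for k in range(i + 1, j - 1):    # word covers the middle
                    buckets[k]["M"].append(vec)

    def mean(vecs: List[List[float]]) -> List[float]:
        if not vecs:
            return [0.0] * dim
        return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]

    # concatenate the four pooled vectors; a model would concatenate this
    # onto the character embedding at the input layer
    return [sum((mean(b[t]) for t in "BMES"), []) for b in buckets]


# toy 2-dim lexicon (made-up vectors, for illustration only)
lex = {"南京": [1.0, 0.0], "南京市": [0.0, 1.0], "长江": [1.0, 1.0]}
feats = lexicon_features("南京市长江大桥", lex, dim=2)
```

Because the feature is computed purely at the input layer, any sequence model (LSTM, its modified variant, or a Transformer) can consume it without architectural changes, which is exactly the "lightweight" property the abstract emphasizes.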