
Improved Chinese Sentence Semantic Similarity Calculation Method Based on Multi-Feature Fusion
Author(s) - Liqi Liu, Qinglin Wang, Yuan Li
Publication year - 2021
Publication title - Journal of Advanced Computational Intelligence and Intelligent Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.172
H-Index - 20
eISSN - 1343-0130
pISSN - 1883-8014
DOI - 10.20965/jaciii.2021.p0442
Subject(s) - sentence , computer science , artificial intelligence , natural language processing , similarity (geometry) , word (group theory) , feature (linguistics) , semantics (computer science) , recurrent neural network , word order , representation (politics) , semantic similarity , word embedding , artificial neural network , embedding , linguistics , philosophy , politics , political science , law , image (mathematics) , programming language
In this paper, an improved long short-term memory (LSTM)-based deep neural network structure is proposed for learning the semantic similarity of variable-length Chinese sentences. Siamese LSTM, a deep neural network model that is largely insensitive to sequence structure, has a limited ability to capture natural language semantics because it struggles to account for semantic differences arising from differences in syntactic structure or word order within a sentence. Therefore, the proposed model integrates the syntactic component features of the words in a sentence into the word vector representation layer to express the sentence's syntactic structure and the interdependence between words. Moreover, a relative position embedding layer is introduced into the model: the relative positions of the words in the sentence are mapped into a high-dimensional space to capture their local position information. The model uses a parallel structure to map the two sentences into the same high-dimensional space, yielding a fixed-length vector representation for each sentence. After aggregation, the sentence similarity is computed in the output layer. Experiments on Chinese sentences show that the model achieves good results in semantic similarity calculation.
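The pipeline the abstract describes (fuse word, syntactic-component, and relative-position embeddings per token; encode both sentences with a shared LSTM; score similarity between the resulting fixed-length vectors) can be sketched as follows. This is a minimal illustration, not the authors' implementation: all dimensions, vocabulary sizes, and the exp(-L1) similarity (the score commonly used with Siamese LSTMs) are assumptions.

```python
import numpy as np

# Hypothetical sketch of the multi-feature-fusion Siamese LSTM:
# each token = concat(word embedding, syntactic-tag embedding,
# relative-position embedding); a shared LSTM encodes both sentences;
# similarity = exp(-L1 distance) between the final hidden states.

rng = np.random.default_rng(0)

V_WORD, V_TAG, MAX_LEN = 50, 10, 20          # vocab sizes (assumed)
D_WORD, D_TAG, D_POS, D_HID = 8, 4, 4, 16    # embedding/hidden sizes

E_word = rng.normal(0, 0.1, (V_WORD, D_WORD))    # word vectors
E_tag = rng.normal(0, 0.1, (V_TAG, D_TAG))       # syntactic components
E_pos = rng.normal(0, 0.1, (MAX_LEN, D_POS))     # relative positions

D_IN = D_WORD + D_TAG + D_POS
# One stacked weight matrix for the four LSTM gates (i, f, g, o).
W = rng.normal(0, 0.1, (4 * D_HID, D_IN + D_HID))
bias = np.zeros(4 * D_HID)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def encode(word_ids, tag_ids):
    """Run the shared LSTM over fused token features; return final h."""
    h, c = np.zeros(D_HID), np.zeros(D_HID)
    for t, (w_id, g_id) in enumerate(zip(word_ids, tag_ids)):
        # Multi-feature fusion at the representation layer.
        x = np.concatenate([E_word[w_id], E_tag[g_id], E_pos[t]])
        z = W @ np.concatenate([x, h]) + bias
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def similarity(s1, s2):
    """Siamese score in (0, 1]; identical inputs give exactly 1."""
    h1, h2 = encode(*s1), encode(*s2)
    return float(np.exp(-np.abs(h1 - h2).sum()))

a = ([1, 2, 3], [0, 1, 2])       # (word ids, syntactic tag ids)
b = ([1, 2, 4], [0, 1, 2])
print(similarity(a, a))          # identical sentences -> 1.0
print(similarity(a, b))
```

Because the same embedding tables and LSTM weights encode both sentences, the two branches form the parallel (weight-sharing) structure the abstract refers to; only the final aggregation and output-layer scoring differ between training objectives.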