Research on Semantic Similarity of Short Text Based on Bert and Time Warping Distance



Open Access


Author(s) - Shijie Qiu, Yan Niu, Jun Li, Xing Li

Publication year - 2021

Publication title - Journal of Web Engineering / Journal of Web Engineering On Line

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.151

H-Index - 13

eISSN - 1544-5976

pISSN - 1540-9589

DOI - 10.13052/jwe1540-9589.20814

Subject(s) - semantic similarity, computer science, similarity (geometry), artificial intelligence, feature (linguistics), feature vector, dynamic time warping, natural language processing, vector space model, point (geometry), ambiguity, sequence (biology), pattern recognition (psychology), information retrieval, mathematics, image (mathematics), linguistics, philosophy, geometry, biology, genetics, programming language

Research on the semantic similarity of short texts plays an important role in machine translation, emotion analysis, information retrieval, and other AI business applications. However, existing short-text similarity methods have difficulty analyzing the features of ambiguous words effectively, and their handling of word-order problems also needs further optimization. This paper proposes a short-text semantic similarity calculation method that combines BERT with a time warping distance algorithm in order to address vocabulary ambiguity. The model first uses a pre-trained BERT model to extract the semantic features of the short text as a whole, obtaining a 768-dimensional feature vector. It then transforms the extracted feature vector into a sequence of points in space and uses the CTW algorithm to calculate the time warping distance between the curves formed by connecting the point sequences. Finally, it applies a weight function, designed on the principle that a smaller time warping distance indicates a higher degree of similarity, to calculate the similarity between the short texts. The experimental results show that the model can mine the feature information of ambiguous words and effectively calculate the similarity of short texts containing lexical ambiguity. Compared with other models, it distinguishes the semantic features of ambiguous words more accurately.
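
The pipeline described in the abstract (feature vector, point sequence, warping distance, similarity weight) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: classic dynamic time warping (DTW) stands in for the CTW algorithm named in the abstract, each vector component is mapped to a 2-D point `(index, value)` because the abstract does not specify the mapping, and the weight function `1 / (1 + d)` is a hypothetical choice that merely satisfies "smaller distance, higher similarity". Short hand-written vectors stand in for the 768-dimensional BERT features.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def to_point_sequence(vec: Sequence[float]) -> List[Point]:
    """Assumed mapping: component i of the vector becomes the 2-D point (i, value)."""
    return [(float(i), float(v)) for i, v in enumerate(vec)]


def dtw_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Classic DTW distance with Euclidean local cost (a stand-in for CTW)."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    m = len(b)
    # prev[j] holds the best cumulative cost of aligning the rows seen so far
    # with the first j points of b; prev[0] = 0 is the empty-alignment base case.
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for p in a:
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = math.dist(p, b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def similarity(vec_a: Sequence[float], vec_b: Sequence[float]) -> float:
    """Hypothetical weight function: maps distance d >= 0 into (0, 1]."""
    d = dtw_distance(to_point_sequence(vec_a), to_point_sequence(vec_b))
    return 1.0 / (1.0 + d)


if __name__ == "__main__":
    # Toy 4-dimensional vectors in place of real 768-dimensional BERT features.
    u = [0.1, 0.5, -0.2, 0.3]
    v = [0.1, 0.5, -0.2, 0.3]
    w = [0.9, -0.4, 0.6, -0.7]
    print(similarity(u, v))
    print(similarity(u, w))
```

With this weight function, identical vectors have distance 0 and similarity 1, and the similarity decreases monotonically as the warping distance grows. In a fuller version, the toy vectors would be replaced by sentence-level features from a pre-trained BERT model.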
