Open Access
A Short Text Classification Algorithm Based on Semantic Extension
Author(s) -
Yajian Zhou,
Dingpeng Deng,
Junhui Chi
Publication year - 2021
Publication title -
Chinese Journal of Electronics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.267
H-Index - 25
eISSN - 2075-5597
pISSN - 1022-4653
DOI - 10.1049/cje.2020.11.014
Subject(s) - word2vec , extension (predicate logic) , computer science , similarity (geometry) , sentence , scheme (mathematics) , artificial intelligence , natural language processing , algorithm , semantic similarity , pattern recognition (psychology) , data mining , mathematics , mathematical analysis , embedding , image (mathematics) , programming language
A semantic-extension-based algorithm for short text classification is proposed. It combines the Word2Vec and LDA models to improve classification performance, which is frequently degraded by semantic dependencies and feature scarcity. For every keyword within a short text, weighted synonyms and related words are generated by the Word2Vec and LDA models, respectively, and then inserted to extend the short text to a reasonable length. We not only establish a criterion, based on similarity estimation, to determine whether a sentence should be extended, but also design a scheme to choose the number of extended words. The extended text is then classified. Experimental results show that the precision of the proposed algorithm is approximately 5% higher than that of the TF-IDF model and approximately 10% higher than that of the VSM method.
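The extension step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The tables `EMBEDDINGS` and `TOPIC_WORDS` are hypothetical toy stand-ins for a trained Word2Vec model and a trained LDA model, and the names `synonyms`, `related_words`, `extend`, `target_len`, and `per_keyword` are assumptions introduced for illustration. The abstract does not give the similarity-based extension criterion or the scheme for choosing the number of added words, so a plain length threshold is used as a placeholder for both.

```python
import math

# Hypothetical toy word vectors; a trained Word2Vec model would supply these.
EMBEDDINGS = {
    "phone": (0.9, 0.1, 0.0),
    "mobile": (0.85, 0.15, 0.05),
    "handset": (0.8, 0.2, 0.1),
    "battery": (0.3, 0.9, 0.1),
    "charger": (0.25, 0.85, 0.2),
    "football": (0.0, 0.1, 0.95),
}

# Hypothetical toy topic-word weights; a trained LDA model would supply these.
TOPIC_WORDS = {
    "electronics": {"phone": 0.30, "battery": 0.25, "charger": 0.20, "screen": 0.15},
    "sports": {"football": 0.40, "goal": 0.30, "match": 0.20},
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def synonyms(word, top_n):
    """Return up to top_n (word, similarity) pairs nearest to `word` in embedding space."""
    if word not in EMBEDDINGS:
        return []
    scored = [
        (other, cosine(EMBEDDINGS[word], vec))
        for other, vec in EMBEDDINGS.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def related_words(word, top_n):
    """Return up to top_n (word, weight) pairs from the topic where `word` weighs most."""
    topics = [(weights[word], name) for name, weights in TOPIC_WORDS.items() if word in weights]
    if not topics:
        return []
    _, best_topic = max(topics)
    ranked = sorted(
        ((w, p) for w, p in TOPIC_WORDS[best_topic].items() if w != word),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:top_n]


def extend(tokens, target_len=8, per_keyword=2):
    """Append synonyms and related words until the text reaches target_len tokens.

    Placeholder criterion: texts already at or above target_len are left unchanged.
    The paper's similarity-based criterion is not reproduced here.
    """
    if len(tokens) >= target_len:
        return list(tokens)
    extended = list(tokens)
    for word in tokens:
        candidates = synonyms(word, per_keyword) + related_words(word, per_keyword)
        for candidate, _weight in candidates:
            if candidate not in extended and len(extended) < target_len:
                extended.append(candidate)
    return extended


if __name__ == "__main__":
    print(extend(["phone", "battery"], target_len=6))
```

In a fuller pipeline, the extended token list would then be passed to whichever classifier is in use. The weights returned by `synonyms` and `related_words` are ignored in this sketch, whereas the abstract indicates that the generated words are weighted.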