A Text Category Detection and Information Extraction Algorithm with Deep Learning
Author(s) - Xiaohan Wu, Ziyu Wu, Yuqi Feng
Publication year - 2021
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1982/1/012047
Subject(s) - softmax function , computer science , artificial intelligence , sentence , natural language processing , data set , set (abstract data type) , artificial neural network , word (group theory) , key (lock) , pattern recognition (psychology) , machine learning , mathematics , geometry , computer security , programming language
To address the tendency of neural-network text classification models to over-fit and to overlook the key words in sentences during training, a Bi-GRU Chinese text classification model based on a hierarchical attention mechanism is proposed. The model takes a layered approach: it uses a bidirectional gated recurrent unit (Bi-GRU) network to learn text representations at the word level and the sentence level, uses a hierarchical self-attention model to capture how much individual words and sentences influence the classification, shares weights between the embedding layer and the softmax layer by tying them, and uses the AMSBound optimization method to obtain the optimal weight matrix quickly and effectively while reducing the number of parameters in the model. The model is tested on two commonly used Chinese data sets, THUCNews and the long-text classification data set FudanSet. The experimental results show that the model outperforms the Text-CNN, Attention-BiLSTM and Bi-GRU_CNN models in accuracy, recall and F-score, with improvements of 5.9%, 5.8% and 4.6%, respectively.
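The two-level attention idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the scoring function, so a plain dot product against a context vector is assumed here, the Bi-GRU encoder is omitted (the inputs stand in for its hidden states), and the names `attention_pool`, `document_vector`, `word_context` and `sent_context` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Weight each hidden state by its similarity to a context vector.

    hidden_states: list of equal-length vectors (stand-ins for encoder outputs).
    context: query vector (learned in a real model; fixed here, by assumption).
    Returns the weighted sum and the attention weights.
    """
    # Assumed scoring function: dot product between state and context.
    scores = [sum(h_d * c_d for h_d, c_d in zip(h, context)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return pooled, weights


def document_vector(doc, word_context, sent_context):
    """Apply attention twice: words -> sentence vectors -> document vector."""
    sent_vecs = [attention_pool(sentence, word_context)[0] for sentence in doc]
    return attention_pool(sent_vecs, sent_context)


if __name__ == "__main__":
    # Two sentences of toy 2-dimensional "word" states.
    doc = [
        [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
        [[0.0, 1.0], [1.0, 1.0]],
    ]
    vec, sentence_weights = document_vector(doc, [2.0, 0.0], [1.0, 1.0])
    print("document vector:", vec)
    print("sentence weights:", sentence_weights)
```

In a full model the resulting document vector would be passed to a softmax classifier, and the context vectors would be trained together with the encoder.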