
Word‐Based Method for Chinese Part‐of‐Speech via Parallel and Adversarial Network
Author(s) - HUANG Kaiyu, CAO Jingxiang, LIU Zhuang, HUANG Degen
Publication year - 2022
Publication title - Chinese Journal of Electronics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.267
H-Index - 25
eISSN - 2075-5597
pISSN - 1022-4653
DOI - 10.1049/cje.2020.00.411
Subject(s) - computer science, transformer, natural language processing, word (group theory), artificial intelligence, task (project management), sequence labeling, speech recognition, text segmentation, adversarial system, character (mathematics), segmentation, linguistics, philosophy, physics, geometry, mathematics, management, quantum mechanics, voltage, economics
Chinese part‐of‐speech (POS) tagging is an essential task for downstream Chinese natural language processing. The accuracy of word‐based methods for Chinese POS tagging drops dramatically because of segmentation errors and word sparsity. In addition, there are several Chinese POS tag sets with different annotation criteria, and some of them have only a small‐scale annotated corpus, which makes models hard to train. To address these problems, we propose a modified word‐based transformer neural network architecture. We also use an adversarial transfer learning method that splits the architecture into shared and private parts. This work directly improves the word‐based model itself instead of adopting a joint character‐based method. Extensive experiments show that our method achieves state‐of‐the‐art performance on all datasets and, more importantly, effectively improves performance on the word‐based Chinese sequence labeling task.
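
The shared/private split mentioned in the abstract can be illustrated with a toy sketch. The abstract gives no architectural details, so everything below is an assumption for illustration: the plain linear projections (standing in for transformer encoders), the criterion names "A" and "B", the class and function names, the negative-entropy adversarial term (one formulation often used in shared-private adversarial setups), and the weight lam. It is not the authors' implementation.

```python
import math
import random


def linear(weights, x):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class SharedPrivateTagger:
    """Hypothetical toy model: one shared projection used by every tagging
    criterion, plus one private projection and one tag classifier per criterion."""

    def __init__(self, criteria, num_tags, in_dim=4, hid_dim=3, seed=0):
        rng = random.Random(seed)
        self.criteria = list(criteria)
        self.shared = random_matrix(hid_dim, in_dim, rng)
        self.private = {c: random_matrix(hid_dim, in_dim, rng) for c in self.criteria}
        self.tagger = {
            c: random_matrix(num_tags[c], 2 * hid_dim, rng) for c in self.criteria
        }
        # The discriminator sees only the shared features and guesses the criterion.
        self.discriminator = random_matrix(len(self.criteria), hid_dim, rng)

    def forward(self, word_vec, criterion):
        shared_feat = linear(self.shared, word_vec)
        private_feat = linear(self.private[criterion], word_vec)
        # List concatenation: the tagger reads shared and private features together.
        tag_probs = softmax(linear(self.tagger[criterion], shared_feat + private_feat))
        crit_probs = softmax(linear(self.discriminator, shared_feat))
        return tag_probs, crit_probs


def losses(tag_probs, gold_tag, crit_probs, lam=0.05):
    """Return (task loss, adversarial loss, weighted total)."""
    task_loss = -math.log(tag_probs[gold_tag])
    # Negative entropy of the discriminator output: it is lowest when the
    # discriminator cannot tell which criterion the shared features came from.
    adv_loss = sum(p * math.log(p) for p in crit_probs)
    return task_loss, adv_loss, task_loss + lam * adv_loss


model = SharedPrivateTagger(["A", "B"], {"A": 5, "B": 7})
tag_probs, crit_probs = model.forward([0.1, -0.2, 0.3, 0.4], "B")
task_loss, adv_loss, total_loss = losses(tag_probs, gold_tag=2, crit_probs=crit_probs)
```

In this sketch, only the private projection and the tag classifier depend on the criterion, so a small corpus can still benefit from shared parameters trained on the larger ones. No training loop is included; the sketch only shows how the features are routed and how the two loss terms are combined.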