
Learning Sub-Character level representation for Korean Named Entity Recognition
Author(s) - YeJin Kim, Yekyung Kim
Publication year - 2021
Publication title - Proceedings of the ... International Florida Artificial Intelligence Research Society Conference
Language(s) - English
Resource type - Journals
eISSN - 2334-0762
pISSN - 2334-0754
DOI - 10.32473/flairs.v34i1.128509
Subject(s) - character (mathematics), computer science, natural language processing, named entity recognition, artificial intelligence, representation (politics), baseline (sea), linguistics, mathematics, task (project management), management, politics, political science, law, philosophy, oceanography, geometry, economics, geology
Most previous studies on Korean Named Entity Recognition (NER) have focused on morphological-level information because the language is rich in character diversity. This paper presents an improved unigram-level Korean NER model that uses a sub-character level representation, jamo, which captures the unique linguistic structure of Korean along with its syntactic properties and morphological variations. The experimental results show that exploiting sub-character information yields an average gain of about 2 F1 points, and that the proposed C-GRAM model outperforms the baseline by about 3 F1 points.
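To make the idea of a sub-character (jamo) representation concrete, the following is a minimal sketch of how a precomposed Hangul syllable can be split into its jamo using the standard Unicode Hangul syllable arithmetic. It is not the authors' preprocessing code, and the names `decompose_syllable` and `to_jamo` are hypothetical helpers introduced only for this illustration; how the paper actually feeds jamo into the C-GRAM model is not described in this record.

```python
# Constants from the Unicode Hangul syllable composition scheme.
S_BASE = 0xAC00   # first precomposed syllable (U+AC00)
L_BASE = 0x1100   # first leading consonant jamo
V_BASE = 0x1161   # first vowel jamo
T_BASE = 0x11A7   # one before the first trailing consonant jamo
V_COUNT = 21      # number of vowels
T_COUNT = 28      # number of trailing slots (including "no trailing consonant")
N_COUNT = V_COUNT * T_COUNT   # syllables per leading consonant (588)
S_COUNT = 19 * N_COUNT        # total precomposed syllables (11172)


def decompose_syllable(ch):
    """Split one Hangul syllable into jamo; return other characters unchanged."""
    s_index = ord(ch) - S_BASE
    if not 0 <= s_index < S_COUNT:
        return [ch]  # not a precomposed Hangul syllable
    lead = L_BASE + s_index // N_COUNT
    vowel = V_BASE + (s_index % N_COUNT) // T_COUNT
    tail = T_BASE + s_index % T_COUNT
    jamo = [chr(lead), chr(vowel)]
    if tail != T_BASE:  # the syllable has a trailing consonant
        jamo.append(chr(tail))
    return jamo


def to_jamo(text):
    """Flatten a string into a list of jamo and non-Hangul characters."""
    return [j for ch in text for j in decompose_syllable(ch)]
```

Under these assumptions, `to_jamo("\ud55c\uad6d")` (the two syllables of "한국") returns six jamo, three per syllable, which a sub-character model could then embed individually instead of embedding each whole syllable.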