
Deep recurrent neural networks with word embeddings for Urdu named entity recognition
Author(s) -
Khan Wahab,
Daud Ali,
Alotaibi Fahd,
Aljohani Naif,
Arafat Sachi
Publication year - 2020
Publication title -
ETRI Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.295
H-Index - 46
eISSN - 2233-7326
pISSN - 1225-6463
DOI - 10.4218/etrij.2018-0553
Subject(s) - computer science , artificial intelligence , conditional random field , named entity recognition , word embedding , natural language processing , language model , recurrent neural network , task (project management) , word (group theory) , deep learning , benchmark (surveying) , artificial neural network , context (archaeology) , urdu , machine translation , embedding , speech recognition , paleontology , linguistics , philosophy , management , geodesy , economics , biology , geography
Named entity recognition (NER) continues to be an important task in natural language processing because it is a subtask of information extraction and machine translation. In Urdu language processing, it is a very difficult task. This paper proposes various deep recurrent neural network (DRNN) learning models with word embedding. Experimental results demonstrate that they improve upon current state-of-the-art NER approaches for Urdu. The DRNN models evaluated include forward and bidirectional extensions of the long short-term memory and backpropagation through time approaches. The proposed models consider both language-dependent features, such as part-of-speech tags, and language-independent features, such as the "context windows" of words. The effectiveness of the DRNN models with word embedding for NER in Urdu is demonstrated using three datasets. The results reveal that the proposed approach significantly outperforms previous conditional random field and artificial neural network approaches. The best F-measure values achieved on the three benchmark datasets using the proposed deep learning approaches are 81.1%, 79.94%, and 63.21%, respectively.
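
The abstract names "context windows" of words as a language-independent feature but does not describe how they are built. The following is a minimal Python sketch of one common construction, not the authors' implementation: each token is paired with its neighbours in a fixed-size window, and sentence boundaries are padded. The function name `context_windows`, the `PAD` token, and the default window size of 3 are all assumptions made for illustration.

```python
from typing import List

PAD = "<PAD>"  # hypothetical padding token, not taken from the paper


def context_windows(tokens: List[str], size: int = 3) -> List[List[str]]:
    """Return one window of `size` tokens centred on each input token.

    Positions that fall outside the sentence are filled with PAD.
    The window size is an assumed parameter and must be odd so that
    each window has a single centre token.
    """
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd integer")
    half = size // 2
    # Pad both ends so the first and last tokens also get full windows.
    padded = [PAD] * half + list(tokens) + [PAD] * half
    return [padded[i:i + size] for i in range(len(tokens))]
```

In a pipeline of the kind the abstract describes, each window would typically be mapped to word embeddings and passed to the recurrent model, which predicts an entity label for the centre token.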