Open Access
A Partial String Matching Approach for Named Entity Recognition in Unstructured Bengali Data
Author(s) - Nabil Ibtehaz, Abdus Satter
Publication year - 2018
Publication title - International Journal of Modern Education and Computer Science
Language(s) - English
Resource type - Journals
eISSN - 2075-017X
pISSN - 2075-0161
DOI - 10.5815/ijmecs.2018.01.04
Subject(s) - bengali , computer science , string searching algorithm , unstructured data , string (physics) , artificial intelligence , trie , matching (statistics) , natural language processing , named entity recognition , domain (mathematical analysis) , information retrieval , pattern matching , data mining , data structure , big data , programming language , task (project management) , statistics , mathematics , mathematical analysis , physics , management , quantum mechanics , economics
In today's data-driven, automated and digitized world, a significant stage of information extraction is to look for special keywords, more formally known as 'Named Entities'. This has been an active research topic for more than two decades, and significant progress has been made. Today we have models powered by deep learning that, although not perfect, achieve near-human-level accuracy on certain occasions. Unfortunately, these algorithms require a lot of annotated training data, which is scarcely available for the Bengali language. This paper proposes a partial string matching approach to identify named entities in an unstructured text corpus in Bengali. The algorithm is based on Breadth First Search (BFS) on a Trie data structure, augmented with dynamic programming. This technique is capable of not only identifying named entities present in a text, but also estimating the actual named entities from erroneous data. To evaluate the proposed technique, we conducted experiments in a closed domain, where we employed this approach on a text corpus with some predefined named entities. The texts experimented on were both structured and unstructured, and our algorithm managed to succeed in both cases.
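
The abstract does not spell out the algorithm's details, so the following is only a minimal sketch of one common way to combine the three ingredients it names: a trie of known entities, a breadth-first traversal of that trie, and a dynamic-programming (Levenshtein) row carried along each path. The names `TrieNode`, `build_trie`, `fuzzy_lookup` and the threshold `max_dist` are assumptions made for illustration and are not taken from the paper.

```python
from collections import deque


class TrieNode:
    def __init__(self):
        self.children = {}
        self.word = None  # holds the full entity if one ends at this node


def build_trie(entities):
    """Insert every predefined entity into a character-level trie."""
    root = TrieNode()
    for entity in entities:
        node = root
        for ch in entity:
            node = node.children.setdefault(ch, TrieNode())
        node.word = entity
    return root


def fuzzy_lookup(root, token, max_dist):
    """Return (entity, distance) pairs within max_dist edits of token.

    Each queue entry pairs a trie node with one row of the edit-distance
    table: row[j] is the distance between the node's prefix and token[:j].
    """
    first_row = list(range(len(token) + 1))
    queue = deque([(root, first_row)])
    matches = []
    while queue:
        node, prev_row = queue.popleft()
        if node.word is not None and prev_row[-1] <= max_dist:
            matches.append((node.word, prev_row[-1]))
        for ch, child in node.children.items():
            row = [prev_row[0] + 1]
            for j in range(1, len(token) + 1):
                cost = 0 if token[j - 1] == ch else 1
                row.append(min(row[j - 1] + 1,          # insertion
                               prev_row[j] + 1,         # deletion
                               prev_row[j - 1] + cost)) # match or substitution
            # Prune: no extension of this prefix can get below min(row).
            if min(row) <= max_dist:
                queue.append((child, row))
    return sorted(matches, key=lambda m: (m[1], m[0]))
```

Under these assumptions, calling `fuzzy_lookup(build_trie(entities), token, 1)` would return every stored entity that differs from `token` by at most one character edit, which is one way a misspelled entity in noisy text could be mapped back to its intended form. Python strings are compared per Unicode code point here; whether the paper operates on code points or on Bengali grapheme clusters is not stated in the abstract.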
