Open Access
Domain Specific Entity Recognition With Semantic-Based Deep Learning Approach
Author(s) - Quoc Hung Ngo, Tahar Kechadi, Nhien-An Le-Khac
Publication year - 2021
Publication title - IEEE Access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/access.2021.3128178
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
In digital agriculture, agronomists are required to make timely, profitable, and actionable decisions based on knowledge and experience. The input can be cultivation data and related agricultural data, some of which is text, including news articles, business news, policy documents, and farming notes. To process this kind of data, identifying the agricultural entities in the text is necessary for keeping agriculture-oriented news up to date. This task is called Agriculture Entity Recognition (AGER), a Named Entity Recognition (NER) task in the agriculture domain. However, there are very few approaches to AGER because of the lack of a consistent tagset and resources. In this study, we develop a new tagset for AGER that covers popular concepts in agriculture, and we propose a two-stage process for the task. In the first stage, we use semantic-based approaches to detect agricultural entities and semi-automatically build an annotated corpus of agricultural entities. In the second stage, we identify agricultural entities in plain text using a deep learning approach trained on the annotated corpus. For evaluation and validation, we build an annotated agriculture corpus and demonstrate the efficiency and robustness of our approach.
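As a rough illustration of the first stage only, the sketch below tags tokens against a small lexicon and emits BIO labels of the kind a sequence model in the second stage could be trained on. It is not the authors' method: a plain longest-match gazetteer lookup stands in for the semantic-based detection, and the lexicon entries, the tag names `CROP` and `CHEMICAL`, and the function `bio_tag` are all hypothetical, chosen only for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical gazetteer (an assumption for illustration): lower-cased token
# sequences mapped to made-up entity tags. The paper's actual tagset and
# semantic resources are not reproduced here.
GAZETTEER: Dict[Tuple[str, ...], str] = {
    ("winter", "wheat"): "CROP",
    ("wheat",): "CROP",
    ("nitrogen",): "CHEMICAL",
}
MAX_LEN = max(len(key) for key in GAZETTEER)


def bio_tag(tokens: List[str]) -> List[Tuple[str, str]]:
    """Label tokens with BIO tags using greedy longest-match lookup."""
    lowered = [t.lower() for t in tokens]
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first so "winter wheat" wins over "wheat".
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            label = GAZETTEER.get(tuple(lowered[i:i + n]))
            if label is not None:
                tags[i] = "B-" + label
                for j in range(i + 1, i + n):
                    tags[j] = "I-" + label
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return list(zip(tokens, tags))


if __name__ == "__main__":
    sentence = "Farmers applied nitrogen to winter wheat".split()
    for token, tag in bio_tag(sentence):
        print(f"{token}\t{tag}")
```

In a semi-automatic setting, output like this would be reviewed and corrected by annotators before being used as training data for the second stage.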