A dictionary for translation from natural to formal data model language
Author(s) - Šuman Sabrina, Jakupović Alen, Marinac Mladen
Publication year - 2021
Publication title - Computational Intelligence
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.353
H-Index - 52
eISSN - 1467-8640
pISSN - 0824-7935
DOI - 10.1111/coin.12393
Subject(s) - computer science, natural language processing, artificial intelligence, translation (biology), computer assisted translation, machine translation, rule based machine translation, machine translation software usability, set (abstract data type), information extraction, natural language, example based machine translation, linguistics, programming language, biochemistry, chemistry, philosophy, messenger rna, gene
This paper describes our ongoing research and results in developing knowledge-based systems that support the creation of entity-relationship (ER) models. We treat the derivation of an ER model in textual form as a translation task: from an English controlled natural language into the formalized language of the ER data model. The translation method relies on rules that map parts of sentential forms to ER model constructs, based on textual and character patterns detected in business descriptions. To enable the computer analyses needed to build the translation mechanisms, we created a linguistic corpus containing lists of business descriptions and the texts of other business materials. We first enriched the corpus by annotating the words related to ER data model constructs, and then derived from it a specific dictionary and linguistic rules that automate the translation of business descriptions into the ER data model language. We also present the main issues uncovered during the translation process and offer a possible solution, whose utility we evaluate by applying information-extraction performance measures to a set of sentences from the corpus.
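To make the approach in the abstract more concrete, the following is a minimal sketch of dictionary-plus-rule translation and its evaluation. It is not the authors' implementation. The dictionary entries, the single sentence pattern, the output tuple format, and the names `DICTIONARY`, `RULE`, `translate`, and `evaluate` are all hypothetical. The abstract does not name the performance measures used, so precision, recall, and F1 are assumed here as typical information-extraction measures.

```python
import re

# Hypothetical dictionary: words annotated with the ER construct they denote.
DICTIONARY = {
    "customer": "ENTITY",
    "order": "ENTITY",
    "product": "ENTITY",
}

# Hypothetical translation rule for one controlled-English sentence pattern:
#   "Each <entity> <verb> one|many <entity>[s]."
RULE = re.compile(r"^each (\w+) (\w+) (one|many) (\w+?)s?\.?$", re.IGNORECASE)


def translate(sentence):
    """Map a sentence to (entity, relationship, entity, cardinality), or None."""
    match = RULE.match(sentence.strip())
    if match is None:
        return None
    left, verb, card, right = (group.lower() for group in match.groups())
    # Both noun slots must be annotated as entities in the dictionary.
    if DICTIONARY.get(left) != "ENTITY" or DICTIONARY.get(right) != "ENTITY":
        return None
    return (left, verb, right, "1" if card == "one" else "N")


def evaluate(extracted, gold):
    """Return precision, recall, and F1 of extracted constructs against gold ones."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    sentences = [
        "Each customer places many orders.",
        "Each order contains many products.",
        "The weather was pleasant.",  # matches no rule, so it is skipped
    ]
    constructs = [c for c in map(translate, sentences) if c is not None]
    gold = [
        ("customer", "places", "order", "N"),
        ("order", "contains", "product", "N"),
    ]
    print(constructs)
    print(evaluate(constructs, gold))
```

A real system of this kind would need many such rules and a far larger dictionary derived from the annotated corpus; the single pattern above only illustrates how a matched sentential form could yield an ER construct and how extracted constructs could be scored against a manually prepared reference set.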