
Relation extraction in structured and unstructured data: a comparative investigation on smartphone titles in the e-commerce domain
Author(s) - João Gabriel Melo Barbirato, Livy Real, Helena de Medeiros Caseli
Publication year - 2021
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5753/stil.2021.17789
Subject(s) - unstructured data, computer science, relationship extraction, relation (database), task (project management), context (archaeology), information retrieval, information extraction, domain (mathematical analysis), relational database, data extraction, product (mathematics), data science, data mining, natural language processing, big data, mathematics, mathematical analysis, geometry, management, medline, political science, law, economics, paleontology, biology
As large amounts of unstructured data are generated on a regular basis, expressing or storing knowledge in a useful way remains a challenge. In this context, Relation Extraction (RE) is the task of automatically identifying relationships in unstructured textual data. We investigated relation extraction on unstructured e-commerce data from the smartphone domain, using a BERT model fine-tuned for this task. We conducted two experiments to assess how much relational information can be extracted from product sheets (structured data) and from product titles (unstructured data), and a third experiment to compare the two. The analysis shows that extracting relations from a title can retrieve correct relations that are not evident in the corresponding sheet.
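
The comparison between title and sheet described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes relations are already available as (head, relation, tail) triples, for example as the output of a fine-tuned model, and it only shows the set comparison step. The `Relation` type, the relation labels, and the sample product data are hypothetical.

```python
from typing import Iterable, NamedTuple


class Relation(NamedTuple):
    """A hypothetical (head, relation, tail) triple for one product."""
    head: str
    relation: str
    tail: str


def normalize(rel: Relation) -> Relation:
    # Lowercase and trim each field so surface variation does not hide matches.
    return Relation(*(part.strip().lower() for part in rel))


def compare_relations(
    title_relations: Iterable[Relation],
    sheet_relations: Iterable[Relation],
) -> dict:
    """Split relations into shared, title-only, and sheet-only sets."""
    title_set = {normalize(r) for r in title_relations}
    sheet_set = {normalize(r) for r in sheet_relations}
    return {
        "shared": title_set & sheet_set,
        "title_only": title_set - sheet_set,
        "sheet_only": sheet_set - title_set,
    }


# Hypothetical data: relations extracted from a title vs. listed on a sheet.
title_relations = [
    Relation("Smartphone X", "has_storage", "128GB"),
    Relation("Smartphone X", "has_color", "Black"),
]
sheet_relations = [
    Relation("smartphone x", "has_storage", "128gb"),
    Relation("smartphone x", "has_screen_size", "6.1 in"),
]

result = compare_relations(title_relations, sheet_relations)
for name, relations in result.items():
    print(name, sorted(relations))
```

In this sketch, the `title_only` set corresponds to the case highlighted in the abstract: relations found in the title that are not evident in the sheet. Exact string matching after normalization is a simplifying assumption; real product data would likely need looser matching of values.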