Early Steps Toward Web‐Scale Information Extraction with LODIE
Author(s) -
Gentile Anna Lisa,
Zhang Ziqi,
Ciravegna Fabio
Publication year - 2015
Publication title -
AI Magazine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.597
H-Index - 79
eISSN - 2371-9621
pISSN - 0738-4602
DOI - 10.1609/aimag.v36i1.2567
Subject(s) - computer science , information extraction , task (project management) , scale (ratio) , information retrieval , data extraction , resource (disambiguation) , representation (politics) , information resource , data science , world wide web , data mining , engineering , knowledge management , systems engineering , computer network , physics , medline , quantum mechanics , politics , law , political science
Information extraction (IE) is the technique for transforming unstructured textual data into a structured representation that can be understood by machines. The exponential growth of the web generates an exceptional quantity of data for which automatic knowledge capture is essential. This work describes the methodology for web‐scale information extraction in the linked open data information‐extraction (LODIE) project and highlights results from early experiments carried out in the initial phase of the project. LODIE aims to develop information‐extraction techniques that scale to the web and adapt to user information needs. The core idea behind LODIE is to use linked open data, a very large‐scale information resource that provides invaluable annotated data on a growing number of domains, as a ground‐breaking solution for IE. This article has two objectives: first, to describe the LODIE project as a whole, with its general challenges and directions; and second, to describe some initial steps taken toward the general solution, focusing on a specific IE sub‐task, wrapper induction.