Ontology-based extraction and structuring of information from data-rich unstructured documents

Open Access

Author(s) - David W. Embley, Douglas M. Campbell, Randy D. Smith, Stephen W. Liddle

Publication year - 1998

Publication title - CiteSeerX (The Pennsylvania State University)

Language(s) - English

Resource type - Conference proceedings

ISBN - 1-58113-061-9

DOI - 10.1145/288627.288641

Subject(s) - ontology, computer science, structuring, unstructured data, information extraction, information retrieval, world wide web, data mining, big data, philosophy, epistemology, finance, economics

We present a new approach to extracting information from unstructured documents based on an application ontology that describes a domain of interest. Starting with such an ontology, we formulate rules to extract constants and context keywords from unstructured documents. For each unstructured document of interest, we extract its constants and keywords and apply a recognizer to organize extracted constants as attribute values of tuples in a generated database schema. To make our approach general, we fix all the processes and change only the ontological description for a different application domain. In experiments we conducted on two different types of unstructured documents taken from the Web, our approach attained recall ratios in the 80% and 90% range and precision ratios near 98%.
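The pipeline described in the abstract (ontology-derived rules that pull constants and context keywords out of a document, followed by a recognizer that arranges the constants as attribute values of a record) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `ObjectSet` class, the toy car-advertisement ontology, the regular expressions, and the sample text are all hypothetical, and the "first match wins" recognizer is a deliberately naive stand-in for whatever heuristics the authors actually use.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class ObjectSet:
    """One attribute of a hypothetical application ontology."""
    name: str
    constant_patterns: List[str]  # regexes that match candidate values
    keyword_patterns: List[str] = field(default_factory=list)  # context cues


# Toy ontology for car advertisements (an assumption, not the paper's ontology).
CAR_AD_ONTOLOGY = [
    ObjectSet("year", [r"\b(?:19|20)\d{2}\b"]),
    ObjectSet("price", [r"\$\d[\d,]*"], [r"\basking\b", r"\bprice\b"]),
    ObjectSet("mileage", [r"\b\d[\d,]*\s*(?:miles|mi)\b"], [r"\bmileage\b"]),
]


def extract(document, ontology):
    """Return (attribute, kind, text, position) hits, sorted by position."""
    hits = []
    for obj in ontology:
        rule_sets = (
            ("constant", obj.constant_patterns),
            ("keyword", obj.keyword_patterns),
        )
        for kind, patterns in rule_sets:
            for pattern in patterns:
                for m in re.finditer(pattern, document, flags=re.IGNORECASE):
                    hits.append((obj.name, kind, m.group(0), m.start()))
    return sorted(hits, key=lambda hit: hit[3])


def to_record(hits, ontology):
    """Naive recognizer: keep the first constant found for each attribute."""
    record = {obj.name: None for obj in ontology}
    for name, kind, text, _position in hits:
        if kind == "constant" and record[name] is None:
            record[name] = text
    return record


# Made-up sample document.
ad = "1995 Honda Civic, 62,000 miles, asking $4,500. Call 555-0142."
record = to_record(extract(ad, CAR_AD_ONTOLOGY), CAR_AD_ONTOLOGY)
print(record)
```

Under these assumptions, retargeting the sketch to another domain means supplying a different list of `ObjectSet` entries while `extract` and `to_record` stay unchanged, which mirrors the abstract's point that only the ontological description changes between applications. The keyword hits are collected but not used by this simple recognizer; a fuller version would presumably use them to choose between competing constants.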
