Web-based extraction of semantic relation instances for terminology work
Author(s) - Jakob Halskov, Caroline Barrière
Publication year - 2008
Publication title - Terminology. International Journal of Theoretical and Applied Issues in Specialized Communication
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.175
H-Index - 23
eISSN - 1569-9994
pISSN - 0929-9971
DOI - 10.1075/term.14.1.03hal
Subject(s) - computer science, relation (database), heuristics, information retrieval, unified medical language system, terminology, domain (mathematical analysis), relationship extraction, ranking (information retrieval), natural language processing, semantic relation, domain knowledge, artificial intelligence, data mining, linguistics, mathematics, mathematical analysis, philosophy, cognition, neuroscience, biology, operating system
This article describes the implementation and evaluation of WWW2REL, a domain-independent, pattern-based knowledge discovery system that extracts semantic relation instances from text fragments on the WWW to assist terminologists in updating or expanding existing ontologies. Unlike most comparable systems, WWW2REL can be applied to any semantic relation type and operates directly on unannotated and uncategorized WWW text snippets rather than on static repositories of academic papers from the target domain. The WWW is used for knowledge pattern (KP) discovery, KP filtering and relation instance discovery. The system is tested against the biomedical UMLS Metathesaurus for four different relation types and is manually evaluated by four domain experts. This evaluation shows that ranking relation instances by a measure of "knowledge pattern range" and applying two heuristics yields an average performance of 70% and 65% of the maximum possible F-score for the top 10 and top 50 instances, respectively. Importantly, the results show that much valuable information not present in the UMLS can be found through the proposed method. Finally, the article examines the domain-dependence of different aspects of the proposed pattern-based knowledge discovery approach.
Peer reviewed: Yes
NRC publication: Ye
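The ranking step mentioned in the abstract can be illustrated with a small sketch. The abstract does not define "knowledge pattern range", so the code below assumes it means the number of distinct knowledge patterns that retrieved a given relation instance. The function name `rank_by_kp_range`, the example patterns and the example instances are all hypothetical and are not taken from WWW2REL.

```python
from collections import defaultdict


def rank_by_kp_range(matches):
    """Rank relation instances by how many distinct patterns found them.

    `matches` is an iterable of (instance, pattern) pairs, where an instance
    is a (term, relation, term) tuple. This is an assumed reading of
    "knowledge pattern range", not the definition used in the article.
    """
    patterns_per_instance = defaultdict(set)
    for instance, pattern in matches:
        # A set ignores repeated hits from the same pattern.
        patterns_per_instance[instance].add(pattern)
    # Highest range first; ties are broken by the instance tuple for stability.
    return sorted(
        ((inst, len(pats)) for inst, pats in patterns_per_instance.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )


# Hypothetical pattern hits, as they might come back from web snippets.
matches = [
    (("aspirin", "may_treat", "headache"), "X is used to treat Y"),
    (("aspirin", "may_treat", "headache"), "X relieves Y"),
    (("aspirin", "may_treat", "headache"), "X is used to treat Y"),
    (("insulin", "may_treat", "diabetes"), "X is used to treat Y"),
]

ranked = rank_by_kp_range(matches)
top_n = ranked[:10]  # keep only the best-supported candidates for review
```

Under this assumption, an instance supported by several different patterns ranks above one found repeatedly by a single pattern, which is one plausible way to favor candidates before a terminologist reviews the top 10 or top 50.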