
An assertion and alignment correction framework for large scale knowledge bases
Author(s) - Jiaoyan Chen, Ernesto Jiménez-Ruiz, Ian Horrocks, Xi Chen, Erik Bryhn Myklebust
Publication year - 2022
Publication title - Semantic Web
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.862
H-Index - 45
eISSN - 2210-4968
pISSN - 1570-0844
DOI - 10.3233/sw-210448
Subject(s) - computer science, set (abstract data type), information retrieval, assertion, natural language processing, consistency (knowledge bases), context (archaeology), embedding, literal (mathematical logic), constraint (computer aided design), semantics (computer science), encyclopedia, artificial intelligence, programming language, mathematics, paleontology, geometry, library science, biology
Various knowledge bases (KBs) have been constructed via information extraction from encyclopedias, text and tables, as well as alignment of multiple sources. Their usefulness and usability are often limited by quality issues. One common issue is the presence of erroneous assertions and alignments, often caused by lexical or semantic confusion. We study the problem of correcting such assertions and alignments, and present a general correction framework that combines lexical matching, context-aware sub-KB extraction, semantic embedding, soft constraint mining and semantic consistency checking. The framework is evaluated with one set of literal assertions from DBpedia, one set of entity assertions from an enterprise medical KB, and one set of mapping assertions from a music KB constructed by integrating Wikidata, Discogs and MusicBrainz. It achieves promising results, with a correction rate (i.e., the ratio of target assertions/alignments that are corrected with the right substitutes) of 70.1%, 60.9% and 71.8%, respectively.
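The correction rate defined in the abstract can be illustrated with a minimal Python sketch. It assumes each target assertion has exactly one right substitute and that the framework proposes at most one substitute per target. The function name `correction_rate`, the dictionary layout and the toy data are hypothetical and are not taken from the paper or its evaluation sets.

```python
from typing import Dict, Optional


def correction_rate(proposed: Dict[str, Optional[str]], gold: Dict[str, str]) -> float:
    """Ratio of target assertions that were corrected with the right substitute.

    `gold` maps each target (erroneous) assertion to its right substitute.
    `proposed` maps a target to the substitute a system suggested, or None
    if no substitute was suggested. Both layouts are illustrative assumptions.
    """
    if not gold:
        raise ValueError("gold must contain at least one target assertion")
    # A target counts as corrected only if the proposed substitute matches the gold one.
    corrected = sum(1 for target, right in gold.items() if proposed.get(target) == right)
    return corrected / len(gold)


# Toy data, invented purely for illustration (not from the paper).
gold = {
    "t1": "ex:Berlin",
    "t2": "ex:Paris",
    "t3": "ex:Rome",
    "t4": "ex:Madrid",
}
proposed = {
    "t1": "ex:Berlin",   # right substitute
    "t2": "ex:Paris",    # right substitute
    "t3": "ex:Milan",    # wrong substitute
    "t4": "ex:Madrid",   # right substitute
}

print(correction_rate(proposed, gold))
```

On this toy input three of the four targets receive the right substitute, so the function returns 0.75. A target with no proposed substitute, or with a wrong one, counts against the rate in the same way.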