Cross-Language Name Matching for Data Fusion in Linked Open Data
Author(s) - Ziad F. Torkey, Emad Elabd, Mostafa Abdelazem
Publication year - 2015
Language(s) - English
Resource type - Conference proceedings
DOI - 10.15849/icit.2015.0081
Subject(s) - computer science, matching (statistics), data integration, precision and recall, natural language processing, linked data, artificial intelligence, information retrieval, arabic, data mining, semantic web, linguistics, statistics, mathematics, philosophy
Data quality and accuracy affect the success of data integration in Linked Open Data (LOD). The main goal of data fusion is to represent each real-world entity once on the Web. Data inaccuracy problems arise from misspellings and a wide range of typographical differences, mainly in non-Latin languages. These problems become more complicated when a person is identified by a name that can be written differently within the same language or across languages. To the best of the authors' knowledge, previous approaches that support Arabic person names were not designed to work with LOD. This paper proposes a framework that uses person names as the matching criterion across cross-language LOD datasets. Compared with a state-of-the-art framework of matching techniques, the proposed framework substantially improves matching results, with gains exceeding 6% in precision and 6% in recall.