Open Access
Semi-Supervised Personal Name Disambiguation Technique for the Web
Author(s) - P. Selvaperumal, A. Suruliandi
Publication year - 2016
Publication title - International Journal of Modern Education and Computer Science
Language(s) - English
Resource type - Journals
eISSN - 2075-017X
pISSN - 2075-0161
DOI - 10.5815/ijmecs.2016.03.04
Subject(s) - computer science, cluster analysis, ambiguity, set (abstract data type), information retrieval, web page, process (computing), artificial intelligence, world wide web, programming language, operating system
Personal name ambiguity on the web arises when more than one person shares the same name. Personal name disambiguation resolves this by clustering a collection of web pages so that each cluster represents one person bearing the ambiguous name. This paper proposes a personal name disambiguation technique that uses a rich set of features: nouns, noun phrases, and frequent keywords. The proposed method consists of two phases: clustering seed pages, and then clustering the actual web page collection. In the first phase, seed pages representing different namesakes are clustered; in the second phase, web pages in the collection are clustered with similar seed page clusters. Using seed pages increases the accuracy of the clustering process. Since it is difficult to predict beforehand how many clusters need to be formed, the proposed technique uses the Elbow method to determine the number of clusters. The efficiency of the proposed name disambiguation technique is tested using both synthetic and organic datasets. Experimental results show that the proposed method achieves robust results across different datasets and outperforms many existing methods.
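The two ideas named in the abstract, picking the number of clusters with the Elbow method and attaching collection pages to seed page clusters, can be illustrated with a minimal sketch. This is not the authors' implementation. Several things here are assumptions made for illustration: plain word counts stand in for the paper's noun, noun phrase, and keyword features; cosine similarity is the similarity measure; the seed clusters are taken as already formed; and the elbow is read off as the point farthest from the straight line joining the first and last values of the within-cluster sum of squares curve, which is one common heuristic among several. The function names `bag_of_words`, `cosine`, `elbow_k`, and `assign_to_seed_clusters` are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts (assumed stand-in for the paper's features)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def elbow_k(wcss):
    """Pick k from a within-cluster sum of squares curve.

    wcss[i] is the value obtained with k = i + 1 clusters. The elbow is
    taken as the point farthest from the line through the first and last
    points of the curve (a heuristic, assumed here).
    """
    n = len(wcss)
    if n < 3:
        return n
    x1, y1, x2, y2 = 1, wcss[0], n, wcss[-1]
    best_k, best_dist = 1, -1.0
    for i, y in enumerate(wcss):
        k = i + 1
        # Unnormalised point-to-line distance; the denominator is constant.
        dist = abs((y2 - y1) * k - (x2 - x1) * y + x2 * y1 - y2 * x1)
        if dist > best_dist:
            best_k, best_dist = k, dist
    return best_k


def assign_to_seed_clusters(seed_clusters, pages):
    """Attach each page to the seed cluster whose pooled counts it resembles most."""
    centroids = {
        label: sum((bag_of_words(t) for t in texts), Counter())
        for label, texts in seed_clusters.items()
    }
    result = {label: [] for label in seed_clusters}
    for page in pages:
        vec = bag_of_words(page)
        best = max(centroids, key=lambda label: cosine(vec, centroids[label]))
        result[best].append(page)
    return result


# Toy data, invented for illustration only.
seeds = {
    "cricketer": ["john smith cricket batsman test match"],
    "professor": ["john smith professor university research"],
}
pages = [
    "smith scored a century in the cricket match",
    "smith published research at the university",
]
clusters = assign_to_seed_clusters(seeds, pages)
chosen_k = elbow_k([100, 40, 10, 8, 7, 6])
```

In the toy curve, the drop in within-cluster sum of squares flattens after three clusters, so the heuristic selects k = 3. In practice the curve would come from running a clustering algorithm for each candidate k.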
