Open Access
Concept based Focused Crawling using Ontology
Author(s) - S. Thenmalar, T. V. Geetha
Publication year - 2011
Publication title - International Journal of Computer Applications
Language(s) - English
Resource type - Journals
ISSN - 0975-8887
DOI - 10.5120/3115-4282
Subject(s) - focused crawler, computer science, web crawler, information retrieval, ontology, crawling, web page, world wide web, domain (mathematical analysis), rank (graph theory), static web page, web navigation, medicine, philosophy, epistemology, mathematical analysis, mathematics, combinatorics, anatomy
Building a web crawler that downloads only relevant pages remains a major challenge in information retrieval. Rather than visiting every web page, a focused crawler visits only the sections of the web that contain relevant pages and tries to skip irrelevant sections. Existing ontology-based web crawlers estimate the semantic content of a URL using a domain-dependent ontology, which in turn supports the methods used to prioritize the URL queue. The crawler maintains a queue of the URLs it has seen during the crawl at each level and selects from this queue the next URL to visit, based on the conceptual rank of the page at that level as obtained from the domain ontology. In this work, by contrast, we represent the topic as an overall conceptual vector, obtained by combining the concept vectors of the individual pages associated with the seed URLs. The conceptual rank is based on comparisons between conceptual vectors at each depth, across depths, and against the overall topic-indicating seed concept vector.

General Terms: Data and Web Mining.
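
The ranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy `ONTOLOGY`, the averaging rule in `combine`, the use of cosine similarity as the conceptual rank, and the example URLs are all assumptions made for illustration, and the sketch compares candidates only against the overall seed vector, leaving out the per-depth and cross-depth comparisons the abstract mentions.

```python
import heapq
import math
from collections import Counter

# Hypothetical toy ontology: concept -> terms that signal it (assumption, not from the paper).
ONTOLOGY = {
    "crawler": {"crawler", "crawling", "spider"},
    "ontology": {"ontology", "concept", "semantic"},
    "retrieval": {"retrieval", "search", "ranking"},
}

def concept_vector(text):
    """Count how often each ontology concept is signalled by the page text."""
    words = Counter(text.lower().split())
    return {c: float(sum(words[t] for t in terms)) for c, terms in ONTOLOGY.items()}

def combine(vectors):
    """Average per-page concept vectors into one overall topic vector (assumed rule)."""
    n = len(vectors)
    return {c: sum(v[c] for v in vectors) / n for c in ONTOLOGY}

def cosine(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a[c] * b[c] for c in ONTOLOGY)
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_frontier(topic, candidates):
    """Return candidate URLs ordered by descending similarity to the topic vector."""
    heap = [(-cosine(topic, concept_vector(text)), url) for url, text in candidates.items()]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]

# Seed pages stand in for the content fetched from the seed URLs.
seed_pages = [
    "focused crawler uses ontology concept ranking",
    "semantic crawling with domain ontology for retrieval",
]
topic = combine([concept_vector(p) for p in seed_pages])

# Frontier: URL -> page text already fetched or estimated (placeholder URLs).
frontier = {
    "http://example.org/a": "ontology driven crawler and semantic search",
    "http://example.org/b": "recipes for chocolate cake",
}
order = rank_frontier(topic, frontier)
```

In this sketch the page about ontology-driven crawling is visited before the unrelated one, because its concept vector points in the same direction as the combined seed vector. A priority queue keeps the next-URL selection cheap as the frontier grows.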
