Open Access
Research on discovering deep web entries
Author(s) - Ying Wang, Huilai Li, Wanli Zuo, Fengling He, Xin Wang, Kerui Chen
Publication year - 2011
Publication title - Computer Science and Information Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.244
H-Index - 24
eISSN - 2406-1018
pISSN - 1820-0214
DOI - 10.2298/csis100322028w
Subject(s) - computer science, web crawler, information retrieval, classifier (uml), crawling, web page, ontology, domain (mathematical analysis), semantic web, deep web, world wide web, focused crawler, artificial intelligence, static web page, the internet, web navigation, medicine, mathematical analysis, philosophy, mathematics, epistemology, anatomy
Ontology plays an important role in locating domain-specific Deep Web content. This paper therefore presents WFF, a novel framework for efficiently locating domain-specific Deep Web databases. WFF combines focused crawling with an ontology and arranges three classifiers hierarchically: a Web Page Classifier (WPC), a Form Structure Classifier (FSC), and a Form Content Classifier (FCC). First, the WPC discovers potentially interesting pages using an ontology-assisted focused crawler. Next, the FSC analyzes those pages and determines, from structural characteristics, whether they contain searchable forms. Finally, the FCC identifies, at the semantic level, the searchable forms that belong to a given domain and stores the URLs of these domain-specific searchable forms in a database. A detailed experimental evaluation shows that the WFF framework not only simplifies the discovery process but also effectively identifies domain-specific databases.
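The three-stage cascade described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the classifiers' features or the ontology, so the keyword set `DOMAIN_TERMS`, the thresholds, the structural heuristics, and all function names below are hypothetical stand-ins chosen only to show how a page passes through WPC, then FSC, then FCC.

```python
import re
from html.parser import HTMLParser

# Assumption: a flat keyword set stands in for the domain ontology.
DOMAIN_TERMS = {"book", "author", "isbn", "title", "publisher"}


def tokens(text):
    """Lowercase alphabetic tokens of a string."""
    return set(re.findall(r"[a-z]+", text.lower()))


class FormCollector(HTMLParser):
    """Collects each <form> with the types and names of its fields."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"types": [], "fields": []}
        elif self._current is not None and tag in ("input", "select", "textarea"):
            default = "text" if tag == "input" else tag
            self._current["types"].append((attrs.get("type") or default).lower())
            self._current["fields"].append((attrs.get("name") or "").lower())

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def wpc_is_relevant(html, min_hits=2):
    """Stage 1 (WPC stand-in): enough domain terms in the visible text."""
    visible = re.sub(r"<[^>]+>", " ", html)
    return len(tokens(visible) & DOMAIN_TERMS) >= min_hits


def fsc_is_searchable(form):
    """Stage 2 (FSC stand-in): has a query field and no password field."""
    types = form["types"]
    has_query_field = any(t in ("text", "search", "select", "textarea") for t in types)
    return has_query_field and "password" not in types


def fcc_in_domain(form, min_hits=1):
    """Stage 3 (FCC stand-in): field names overlap the domain terms."""
    names = set()
    for field in form["fields"]:
        names |= tokens(field)
    return len(names & DOMAIN_TERMS) >= min_hits


def locate_domain_forms(pages):
    """Run the cascade over {url: html}; return URLs with in-domain search forms."""
    found = []
    for url, html in pages.items():
        if not wpc_is_relevant(html):
            continue  # rejected by the page-level stage
        collector = FormCollector()
        collector.feed(html)
        collector.close()
        if any(fsc_is_searchable(f) and fcc_in_domain(f) for f in collector.forms):
            found.append(url)
    return found
```

Ordering the stages from cheapest to most specific means most pages are discarded before any form is parsed, which is the practical benefit of a hierarchical arrangement. A real system would replace each heuristic with a trained classifier and store the resulting URLs in a database rather than a list.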
