
Design and development of text extraction and retrieval using style of documents in web searching
Author(s) -
Shilpa Balan,
P. Ponmuthuramalingam
Publication year - 2017
Publication title -
International Journal of Engineering and Technology
Language(s) - English
Resource type - Journals
ISSN - 2227-524X
DOI - 10.14419/ijet.v7i1.2.9038
Subject(s) - information retrieval , computer science , precision and recall , pruning , task (project management) , web page , segmentation , search engine , suffix , information extraction , natural language processing , artificial intelligence , world wide web , management , agronomy , economics , biology , linguistics , philosophy
This research studies the extraction of text from web pages and documents returned by the Google search engine. A central task in web search is to match the user's need with accurate information. That information falls into several categories, such as manual, structured, and semi-structured texts and images. Query Result Records (QRRs) are used to extract the text information from the different types of documents. The data region is identified in the segmentation step, and the domain of the documents contains suffixes and prefixes. Compared with existing pruning and other techniques, the approach is more efficient in terms of time. We analyze the different types of alignment in this paper and propose a new technique for alignment retrieval, using precision and recall to evaluate retrieval performance.
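The abstract names precision and recall as the evaluation measures but does not show how they are computed. The sketch below uses the standard set-based definitions: precision is the share of retrieved records that are relevant, and recall is the share of relevant records that were retrieved. The function name `precision_recall` and the record identifiers are hypothetical and serve only as illustration; they are not taken from the paper.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: identifiers of the records the system returned (assumed input)
    relevant:  identifiers of the records judged relevant (assumed input)
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set  # records that are both retrieved and relevant

    # Guard against empty sets to avoid division by zero.
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical example: four records retrieved, three relevant, two in common.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "c", "e"])
print(f"precision={p:.2f} recall={r:.2f}")
```

With these made-up inputs, two of the four retrieved records are relevant and two of the three relevant records are found, so precision is 2/4 and recall is 2/3.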