
An architecture for scaling federated search
Author(s) - Lederman, Abe
Publication year - 2009
Publication title - Proceedings of the American Society for Information Science and Technology
Language(s) - English
Resource type - Journals
eISSN - 1550-8390
pISSN - 0044-7870
DOI - 10.1002/meet.2009.1450460342
Subject(s) - computer science, scalability, world wide web, relevance (law), data science, quality (philosophy), ranking (information retrieval), metasearch engine, search engine, viewpoints, architecture, information retrieval, web search query, database, art, philosophy, epistemology, political science, law, visual arts
Federated search has tremendous potential to make a wide range of diverse information and viewpoints available to scientists, researchers, and the public. Traditional federated search engines provide access to a relatively small number of content sources, typically several dozen or fewer. Depending on the discipline, however, there may be hundreds of databases with relevant content. And because searching content sources in seemingly unrelated fields is valuable, a researcher benefits from searching hundreds or even thousands of sources simultaneously. The greater the number of relevant, diverse, high-quality sources a researcher can access, the faster he or she will make discoveries that advance science and improve the quality of our lives. The current paradigm for federated search suffers from several problems that hinder the development of large, scalable federated search engines: search speed, relevance ranking, and source selection all degrade as the number of sources increases. Deep Web Technologies, a federated search technology company based in Santa Fe, New Mexico, is pioneering the effort to build applications that overcome these obstacles. The company has created a hierarchical “divide-and-conquer” architecture that distributes the federated search workflow to eliminate the traditional bottlenecks and allow for massive scalability. Deep Web Technologies built the search engine behind WorldWideScience.org, a global gateway to government-produced and government-supported science research information. WorldWideScience.org employs this hierarchical approach to search sources that are themselves federated search engines, and it searches 140 sources in this way. Deep Web Technologies is also building a 500-source science research portal, planned for mid-2009.
In the ASIS 2009 poster session, Deep Web Technologies will describe its architecture and how it enables federated search applications to scale to large numbers of sources.
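
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of a hierarchical "divide-and-conquer" federated search, not Deep Web Technologies' actual design. It assumes that every node in the hierarchy exposes the same search interface, so that a federated node can treat another federated node as just one more source. All names (Hit, LeafSource, FederatedNode) and the term-overlap score are hypothetical placeholders chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass(frozen=True)
class Hit:
    title: str
    score: float  # hypothetical relevance score; higher is better
    source: str   # name of the leaf source that produced the hit


class Source(Protocol):
    """Anything that can answer a query: a single database or a whole sub-federation."""
    name: str

    def search(self, query: str, limit: int) -> List[Hit]: ...


@dataclass
class LeafSource:
    """A single content source, simulated here with an in-memory list of titles."""
    name: str
    titles: List[str]

    def search(self, query: str, limit: int) -> List[Hit]:
        terms = query.lower().split()
        hits = []
        for title in self.titles:
            words = title.lower().split()
            matched = sum(1 for term in terms if term in words)
            if matched:
                # Placeholder ranking: fraction of query terms found in the title.
                hits.append(Hit(title, matched / len(terms), self.name))
        hits.sort(key=lambda h: (-h.score, h.title))
        return hits[:limit]


@dataclass
class FederatedNode:
    """Fans a query out to its children in parallel and merges their top results.

    Because it has the same interface as a leaf, nodes can be nested, so each
    node only ever merges the short result lists of its direct children.
    """
    name: str
    children: List[Source] = field(default_factory=list)

    def search(self, query: str, limit: int) -> List[Hit]:
        if not self.children:
            return []
        # Each node uses its own thread pool, so nested nodes do not compete
        # for the same workers.
        with ThreadPoolExecutor(max_workers=len(self.children)) as pool:
            partials = list(pool.map(lambda c: c.search(query, limit), self.children))
        merged = [hit for partial in partials for hit in partial]
        merged.sort(key=lambda h: (-h.score, h.title))
        return merged[:limit]


if __name__ == "__main__":
    physics = LeafSource("physics", ["federated search architecture", "quantum search algorithms"])
    biology = LeafSource("biology", ["protein folding", "federated learning in genomics"])
    energy = LeafSource("energy", ["scaling solar search", "grid storage"])
    # "regional" is itself a federated engine that the root treats as one source.
    regional = FederatedNode("regional", [physics, biology])
    root = FederatedNode("root", [regional, energy])
    for hit in root.search("federated search", limit=3):
        print(f"{hit.score:.2f}  {hit.title}  [{hit.source}]")
```

In this sketch the root never contacts the physics or biology sources directly; it sees only the merged, truncated list from the intermediate node, which is one way the work of querying and ranking could be spread across the hierarchy.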