General-purpose search techniques for genomic text.
Author(s) - Abhijit Chattaraj, Hugh E Williams, Adam Cannane
Publication year - 2004
Publication title - Genome Informatics. International Conference on Genome Informatics
Language(s) - English
DOI - 10.11234/gi1990.15.2_42
Fast and accurate techniques for searching large genomic text collections are becoming increasingly important. While Information Retrieval is well-established for general-purpose text retrieval tasks, less is known about retrieval techniques for genomic text data. In this paper, we investigate and propose general-purpose search techniques for genomic text. In particular, we show that significant improvements can result from manual term expansion, where additional words are added to queries and documents. We also show that collection partitioning, where documents are included in or excluded from the search space, is highly effective for some tasks. We experiment with our techniques on four text collections and show, for example, that the collection partitioning scheme can improve effectiveness by almost 9.5% over a standard retrieval baseline. We conclude by recommending techniques that can be considered for most genomic search tasks.
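
The two techniques named in the abstract, term expansion and collection partitioning, can be illustrated with a minimal sketch. This is not the authors' implementation: the expansion table, the `species` field, the overlap score, and all function names below are hypothetical, chosen only to show how added query terms and a restricted search space might change what is retrieved.

```python
from collections import Counter

# Hypothetical, hand-built expansion table (an assumption for illustration;
# the abstract does not say which words are added or where they come from).
EXPANSIONS = {
    "p53": ["tp53"],
}


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def expand(terms, table=EXPANSIONS):
    """Term expansion: follow each term with any variants listed for it."""
    out = []
    for term in terms:
        out.append(term)
        out.extend(table.get(term, []))
    return out


def partition(docs, keep):
    """Collection partitioning: keep only documents the predicate accepts."""
    return {doc_id: doc for doc_id, doc in docs.items() if keep(doc)}


def score(query_terms, doc_text):
    """Toy overlap score: total occurrences of the distinct query terms."""
    counts = Counter(tokenize(doc_text))
    return sum(counts[term] for term in set(query_terms))


def search(query, docs, keep=lambda doc: True):
    """Expand the query, restrict the collection, and rank by toy score."""
    terms = expand(tokenize(query))
    candidates = partition(docs, keep)
    ranked = sorted(
        ((score(terms, doc["text"]), doc_id) for doc_id, doc in candidates.items()),
        reverse=True,
    )
    return [doc_id for s, doc_id in ranked if s > 0]
```

With a toy collection in which one document mentions only `tp53`, the query `p53` would miss that document without expansion, and passing a predicate such as `lambda doc: doc["species"] == "human"` narrows the search space before ranking. A real system would replace the overlap score with a standard ranking function.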