Open Access

Reducing storage requirements for biological sequence comparison

Author(s) - Michael Roberts, Wayne B. Hayes, Brian R. Hunt, Stephen M. Mount, James A. Yorke

Publication year - 2004

Publication title - Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.599

H-Index - 390

eISSN - 1367-4811

pISSN - 1367-4803

DOI - 10.1093/bioinformatics/bth408

Subject(s) - substring, string (physics), string searching algorithm, computer science, sequence (biology), computation, matching (statistics), fraction (chemistry), process (computing), approximate string matching, simple (philosophy), pattern matching, genome, algorithm, theoretical computer science, data mining, biology, mathematics, data structure, artificial intelligence, genetics, statistics, gene, programming language, chemistry, philosophy, organic chemistry, mathematical physics, epistemology

Comparison of nucleic acid and protein sequences is a fundamental tool of modern bioinformatics. A dominant method of such string matching is the 'seed-and-extend' approach, in which occurrences of short subsequences called 'seeds' are used to search for potentially longer matches in a large database of sequences. Each such potential match is then checked to see if it extends beyond the seed. To be effective, the seed-and-extend approach needs to catalogue seeds from virtually every substring in the database of search strings. Projects such as mammalian genome assemblies and large-scale protein matching, however, have such large sequence databases that the resulting list of seeds cannot be stored in RAM on a single computer. This significantly slows the matching process.
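
The seed-and-extend approach described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`build_seed_index`, `extend`, `seed_and_extend`), the seed length `k`, the `min_len` threshold, and the use of exact, ungapped extension are all assumptions chosen for illustration.

```python
from collections import defaultdict


def build_seed_index(database, k):
    """Catalogue every length-k substring (seed) with the positions where it occurs."""
    index = defaultdict(list)
    for seq_id, seq in enumerate(database):
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].append((seq_id, pos))
    return index


def extend(query, q_pos, target, t_pos, k):
    """Grow an exact seed hit in both directions while the characters agree."""
    left = 0
    while (q_pos - left > 0 and t_pos - left > 0
           and query[q_pos - left - 1] == target[t_pos - left - 1]):
        left += 1
    right = k  # the seed itself already matches
    while (q_pos + right < len(query) and t_pos + right < len(target)
           and query[q_pos + right] == target[t_pos + right]):
        right += 1
    return q_pos - left, t_pos - left, left + right


def seed_and_extend(query, database, k=4, min_len=6):
    """Return sorted (seq_id, query_start, target_start, length) exact matches."""
    index = build_seed_index(database, k)
    matches = set()  # several seeds can extend to the same match
    for q_pos in range(len(query) - k + 1):
        for seq_id, t_pos in index.get(query[q_pos:q_pos + k], ()):
            q_start, t_start, length = extend(
                query, q_pos, database[seq_id], t_pos, k)
            if length >= min_len:
                matches.add((seq_id, q_start, t_start, length))
    return sorted(matches)


if __name__ == "__main__":
    database = ["ACGTACGTTAGC", "TTTTGGGCCC"]  # toy data, assumed for illustration
    print(seed_and_extend("GGACGTTAGA", database))
```

In this sketch the index holds one entry for every substring position in the database, which is the storage cost the abstract refers to: for a genome-scale database, a table built this way may not fit in RAM on a single machine.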
