Efficient large-scale sequence comparison by locality-sensitive hashing

Open Access

Author(s) - Jeremy Buhler

Publication year - 2001

Publication title - Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.599

H-Index - 390

eISSN - 1367-4811

pISSN - 1367-4803

DOI - 10.1093/bioinformatics/17.5.419

Subject(s) - locality sensitive hashing, computer science, sequence (biology), hash function, locality, sequence alignment, algorithm, computational biology, theoretical computer science, hash table, genetics, biology, gene, peptide sequence, linguistics, philosophy, computer security

Comparison of multimegabase genomic DNA sequences is a popular technique for finding and annotating conserved genome features. Performing such comparisons entails finding many short local alignments between sequences up to tens of megabases in length. To process such long sequences efficiently, existing algorithms find alignments by expanding around short runs of matching bases with no substitutions or other differences. Unfortunately, exact matches that are short enough to occur often in significant alignments also occur frequently by chance in the background sequence. Thus, these algorithms must trade off between efficiency and sensitivity to features without long exact matches.
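
The seed-and-extend strategy described above can be illustrated with a minimal Python sketch. It is not the paper's method or any specific tool's implementation: the function names (`exact_seeds`, `seed_and_extend`), the scoring values, and the X-drop stopping rule are all assumptions chosen for illustration. The sketch indexes every exact k-base match between two sequences and then extends each one in both directions without gaps.

```python
from collections import defaultdict


def exact_seeds(a, b, k):
    """Return all (i, j) with a[i:i+k] == b[j:j+k] (exact k-base matches)."""
    index = defaultdict(list)
    for i in range(len(a) - k + 1):
        index[a[i:i + k]].append(i)
    seeds = []
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], ()):
            seeds.append((i, j))
    return seeds


def _extend(a, b, i, j, step, match, mismatch, xdrop):
    """Ungapped extension from (i, j) in direction `step` (+1 or -1).

    Stops at a sequence end, or once the running score has fallen
    `xdrop` or more below the best score seen so far.
    Returns (best_score, length_at_best_score).
    """
    score = best = best_len = n = 0
    while 0 <= i < len(a) and 0 <= j < len(b):
        score += match if a[i] == b[j] else mismatch
        n += 1
        if score > best:
            best, best_len = score, n
        elif best - score >= xdrop:
            break
        i += step
        j += step
    return best, best_len


def seed_and_extend(a, b, k, match=1, mismatch=-1, xdrop=3):
    """Find exact k-base seeds, then extend each without gaps.

    Returns sorted, de-duplicated (start_a, start_b, length, score) tuples.
    The scoring parameters are illustrative assumptions, not published values.
    """
    hits = set()
    for i, j in exact_seeds(a, b, k):
        r_score, r_len = _extend(a, b, i + k, j + k, +1, match, mismatch, xdrop)
        l_score, l_len = _extend(a, b, i - 1, j - 1, -1, match, mismatch, xdrop)
        hits.add((i - l_len, j - l_len,
                  l_len + k + r_len,
                  l_score + k * match + r_score))
    return sorted(hits)
```

The trade-off shows up directly in the choice of `k`. For two sequences that differ by a substitution at every fourth base, `exact_seeds` with `k=4` finds nothing, so the similarity is missed entirely. Lowering `k` to 3 recovers seeds, but in long genomic sequences such short words also match frequently by chance, and each chance match costs an extension attempt.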
