MMseqs software suite for fast and deep clustering and searching of large protein sequence sets

Maria Hauser; Martin Steinegger; Johannes Söding

Open Access

Author(s) - Maria Hauser, Martin Steinegger, Johannes Söding

Publication year - 2016

Publication title - Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.599

H-Index - 390

eISSN - 1367-4811

pISSN - 1367-4803

DOI - 10.1093/bioinformatics/btw006

Subject(s) - UniProt, computer science, cluster analysis, sequence database, software suite, data mining, software, metagenomics, Smith–Waterman algorithm, sequence alignment, database, artificial intelligence, peptide sequence, biology, biochemistry, gene, programming language

Sequence databases are growing fast, challenging existing analysis pipelines. Reducing the redundancy of sequence databases by similarity clustering improves the speed and sensitivity of iterative searches, but existing tools cannot efficiently cluster databases the size of UniProt to 50% maximum pairwise sequence identity or below. Furthermore, in metagenomics experiments, large fractions of reads typically can no longer be matched to any known sequence, because searching comprehensive databases such as UniProt with sensitive but relatively slow tools (e.g. BLAST or HMMER3) is becoming too costly.
