Open Access

UniqueProt: creating representative protein sequence sets

Author(s) - Sven Mika, Burkhard Rost

Publication year - 2003

Publication title - Nucleic Acids Research

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 9.008

H-Index - 537

eISSN - 1362-4954

pISSN - 0305-1048

DOI - 10.1093/nar/gkg620

Subject(s) - biology, sequence (biology), similarity (geometry), cluster analysis, simple (philosophy), service (business), web service, value (mathematics), data mining, computer science, world wide web, genetics, artificial intelligence, machine learning, philosophy, economy, epistemology, economics, image (mathematics)

UniqueProt is a practical, easy-to-use web service designed to create representative, unbiased data sets of protein sequences. The largest possible representative sets are found through a simple greedy algorithm that uses the HSSP-value to establish sequence similarity. UniqueProt is not a true clustering program, in the sense that the 'representatives' are not at the centres of well-defined clusters, since the definition of such clusters is problem-specific. Overall, UniqueProt is a reasonably fast solution for bias in data sets. The service is accessible at http://cubic.bioc.columbia.edu/services/uniqueprot; a command-line version for Linux is downloadable from this web site.
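The greedy selection described in the abstract can be illustrated with a minimal Python sketch. This is not UniqueProt's actual implementation, which the abstract does not detail. It assumes that pairwise HSSP-values have already been computed elsewhere, that a pair counts as similar when its value exceeds a user-chosen threshold, and that pairs missing from the table are dissimilar. The function name `greedy_representatives`, its parameters, and the example values are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Sequence


def greedy_representatives(
    ids: Sequence[str],
    hssp_values: Dict[FrozenSet[str], float],
    threshold: float,
) -> List[str]:
    """Pick a subset of ids in which no two members are 'similar'.

    Assumptions (illustrative only, not taken from UniqueProt):
    - hssp_values maps an unordered pair of ids to a precomputed HSSP-value;
    - a pair is similar when its value is strictly greater than threshold;
    - a pair absent from hssp_values is treated as dissimilar.
    """
    kept: List[str] = []
    for seq_id in ids:
        # Keep seq_id only if it is dissimilar to everything kept so far.
        is_redundant = any(
            hssp_values.get(frozenset((seq_id, other)), float("-inf")) > threshold
            for other in kept
        )
        if not is_redundant:
            kept.append(seq_id)
    return kept


# Made-up example values, for illustration only.
example_values = {
    frozenset(("A", "B")): 12.0,
    frozenset(("A", "C")): 3.0,
    frozenset(("B", "C")): -5.0,
}
selected = greedy_representatives(["A", "B", "C", "D"], example_values, threshold=0.0)
print(selected)
```

A single-pass greedy selection like this depends on input order: a sequence seen early can exclude several later ones. The result is therefore a set with no similar pairs, but not necessarily the largest such set.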
