Open Access
UDSMProt: universal deep sequence models for protein classification
Author(s) - Nils Strodthoff, Patrick Wagner, Markus Wenzel, Wojciech Samek
Publication year - 2020
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btaa003
Subject(s) - computer science, artificial intelligence, source code, machine learning, field (mathematics), protein sequencing, task (project management), representation (politics), sequence (biology), class (philosophy), code (set theory), deep learning, data mining, peptide sequence, programming language, gene, biochemistry, chemistry, genetics, mathematics, management, set (abstract data type), politics, biology, economics, political science, pure mathematics, law
Inferring the properties of a protein from its amino acid sequence is one of the key problems in bioinformatics. Most state-of-the-art approaches for protein classification are tailored to single classification tasks and rely on handcrafted features, such as position-specific scoring matrices derived from expensive database searches. We argue that the same level of performance can be reached or even surpassed by learning a task-agnostic representation once, using self-supervised language modeling, and transferring it to specific tasks with a simple fine-tuning step.
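
To make the self-supervised language modeling idea concrete, the sketch below shows how training pairs could be derived from an unlabeled amino acid sequence: the target is simply the input shifted by one position, so no annotation is required. This is a minimal illustration under assumptions and not the authors' implementation. The vocabulary layout, the special token ids, and the function names `encode` and `language_model_pairs` are all hypothetical.

```python
# Illustrative sketch only: the vocabulary, token ids, and function names
# are assumptions for this example, not taken from the UDSMProt code.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
PAD, BOS, UNK = 0, 1, 2               # hypothetical special token ids
VOCAB = {aa: i + 3 for i, aa in enumerate(AMINO_ACIDS)}


def encode(sequence):
    """Map an amino acid string to integer token ids, prefixed with BOS."""
    return [BOS] + [VOCAB.get(aa, UNK) for aa in sequence.upper()]


def language_model_pairs(sequence):
    """Return (inputs, targets) for next-token prediction.

    The targets are the inputs shifted left by one position, so the
    supervision signal comes from the sequence itself.
    """
    ids = encode(sequence)
    return ids[:-1], ids[1:]


inputs, targets = language_model_pairs("MKTAYIAK")
```

A model pretrained on such pairs would then be fine-tuned on a labeled task by replacing the next-token prediction head with a classification head. That step depends on the chosen deep learning framework and is not shown here.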
