Improving sequence-based modeling of protein families using secondary-structure quality assessment
Author(s) - Cyril Malbranke, David Bikard, Simona Cocco, Rémi Monasson
Publication year - 2021
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btab442
Subject(s) - computer science, pairwise comparison, sequence (biology), matching (statistics), data mining, protein superfamily, function (biology), sequence alignment, quality (philosophy), artificial intelligence, peptide sequence, biology, genetics, mathematics, statistics, gene, philosophy, epistemology
Modeling the sequence distribution of protein families from homologous sequence data has recently received considerable attention, notably for structure and function prediction and for protein design. In particular, direct coupling analysis, a method that infers effective pairwise interactions between residues, has been shown to capture important structural constraints and to generate functional protein sequences. Building on this and other graphical models, we introduce a new framework to assess the quality of the secondary structures of generated sequences with respect to reference structures for the family.
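To make the idea of "effective pairwise interactions between residues" concrete, the following is a minimal sketch of how a pairwise (Potts-type) model of the kind commonly used in direct coupling analysis scores a sequence. It is an illustration under stated assumptions, not the authors' implementation: the function name `potts_energy`, the array layouts for the fields `h` and couplings `J`, the alphabet size of 21, and the random toy parameters are all hypothetical.

```python
import numpy as np

def potts_energy(seq, h, J):
    """Energy E(a) = -sum_i h[i, a_i] - sum_{i<j} J[i, j, a_i, a_j].

    seq : integer-encoded sequence of length L (values in 0..q-1)
    h   : array of shape (L, q), single-site fields (assumed layout)
    J   : array of shape (L, L, q, q), pairwise couplings (assumed layout)
    """
    seq = np.asarray(seq)
    L = len(seq)
    idx = np.arange(L)
    # Single-site contribution: pick the field of the observed residue at each site.
    field_term = h[idx, seq].sum()
    # Pairwise contribution: pair[i, j] = J[i, j, seq[i], seq[j]].
    pair = J[idx[:, None], idx[None, :], seq[:, None], seq[None, :]]
    # Count each pair once (i < j).
    coupling_term = np.triu(pair, k=1).sum()
    return -(field_term + coupling_term)

# Toy parameters, random and purely illustrative (not inferred from data).
rng = np.random.default_rng(0)
L, q = 5, 21  # 20 amino acids plus a gap symbol is a common, assumed choice
h = rng.normal(size=(L, q))
J = rng.normal(size=(L, L, q, q))
seq = rng.integers(0, q, size=L)
energy = potts_energy(seq, h, J)
```

In such a model, lower energy corresponds to higher probability, P(a) proportional to exp(-E(a)); in practice the parameters would be inferred from a multiple sequence alignment of the family rather than drawn at random.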