Open Access
The Site-Specific Amino Acid Preferences of Homologous Proteins Depend on Sequence Divergence
Author(s) - Evandro Ferrada
Publication year - 2018
Publication title - Genome Biology and Evolution
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.702
H-Index - 74
ISSN - 1759-6653
DOI - 10.1093/gbe/evy261
Subject(s) - biology, divergence (linguistics), phylogenetic tree, protein superfamily, sequence (biology), homologous chromosome, genetics, conserved sequence, function (biology), sequence alignment, amino acid, stability (learning theory), peptide sequence, computational biology, multiple sequence alignment, molecular evolution, protein sequencing, evolutionary biology, gene, computer science, machine learning, philosophy, linguistics
The propensity of protein sites to be occupied by any of the 20 amino acids is known as site-specific amino acid preferences (SSAP). Under the assumption that SSAP are conserved among homologs, they can be used to parameterize evolutionary models for the reconstruction of accurate phylogenetic trees. However, simulations and experimental studies have not been able to fully assess how well SSAP are conserved as a function of sequence divergence between protein homologs. Here, we implement a computational procedure to predict the SSAP of proteins based on the changes in thermodynamic stability caused by mutation. An advantage of this computational approach is that it allows us to interrogate a large and unbiased sample of homologous proteins, over the entire spectrum of sequence divergence, and under selection for the same molecular trait. We show that the computational predictions are about as reproducible as experimental replicates, and that they largely recapitulate the SSAP observed in a large-scale mutagenesis experiment. Our results support recent experimental reports on the conservation of SSAP among closely related homologs: the fraction of differing sites increases slowly, reaching up to 15% at sequence distances lower than 40%. However, even when thermodynamic stability is the sole contribution considered, our conservative approach identifies up to 30% of sites as significantly different between divergent homologs. We show that this relation holds for homologs of diverse sizes and structural classes. Analyses of residue contact networks suggest that an important determinant of these differences is the increasing accumulation of structural deviations that results from sequence divergence.
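
To make the idea of stability-derived preferences concrete, the following is a minimal sketch and not the procedure used in the paper, which the abstract does not specify. It assumes a Boltzmann-like mapping from hypothetical per-residue stability changes (ddG, in kcal/mol) to normalized preferences at one site, and it uses the Jensen-Shannon distance as one possible way to compare the preferences of two aligned sites. The function names, the `beta` scaling parameter, and the choice of distance are all illustrative assumptions.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def preferences_from_ddg(ddg_by_aa, beta=1.0):
    """Convert per-residue ddG values at one site into preferences.

    Assumption: preference is proportional to exp(-beta * ddG), so
    destabilizing substitutions (positive ddG) receive lower weight.
    `beta` is a hypothetical scaling parameter, not taken from the paper.
    """
    weights = {aa: math.exp(-beta * ddg) for aa, ddg in ddg_by_aa.items()}
    total = sum(weights.values())
    return {aa: w / total for aa, w in weights.items()}


def jensen_shannon_distance(p, q):
    """Jensen-Shannon distance (log base 2) between two preference dicts.

    Returns 0 for identical distributions and 1 for distributions
    with no amino acid in common.
    """
    keys = set(p) | set(q)
    p = {k: p.get(k, 0.0) for k in keys}
    q = {k: q.get(k, 0.0) for k in keys}
    m = {k: 0.5 * (p[k] + q[k]) for k in keys}

    def kl(a):
        # Kullback-Leibler divergence of `a` from the mixture `m`.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in keys if a[k] > 0)

    # max() guards against tiny negative values from rounding.
    return math.sqrt(max(0.0, 0.5 * kl(p) + 0.5 * kl(q)))


# Toy example: two homologous sites with made-up ddG values.
site_a = {aa: 1.0 for aa in AMINO_ACIDS}
site_a["L"] = 0.0  # leucine tolerated best at site A (invented value)
site_b = {aa: 1.0 for aa in AMINO_ACIDS}
site_b["D"] = 0.0  # aspartate tolerated best at site B (invented value)

prefs_a = preferences_from_ddg(site_a)
prefs_b = preferences_from_ddg(site_b)
distance = jensen_shannon_distance(prefs_a, prefs_b)
```

In a sketch like this, a site would be counted as different between two homologs when its distance exceeds what is seen between replicate predictions for the same protein; the threshold and the statistical test are left open here because the abstract does not state them.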
