
A Maximum Entropy Formalism for Disentangling Chains of Correlated Sequence Positions
Author(s) - Alan S. Lapedes, B. G. Giraud, L.C. Liu, Gary D. Stormo
Publication year - 1998
Language(s) - English
Resource type - Reports
DOI - 10.2172/763147
Subject(s) - formalism (music), computational biology, rna, entropy (arrow of time), statistical physics, protein secondary structure, sequence (biology), principle of maximum entropy, algorithm, mathematics, computer science, biology, genetics, physics, artificial intelligence, gene, quantum mechanics, musical, visual arts, art, biochemistry
Covariation analysis of sets of aligned protein sequences succeeds in some instances in elucidating structural and functional links, but in general, pairs of sites displaying highly covarying mutations in protein sequences do not necessarily correspond to sites that are spatially close in the protein structure. In contrast, covariation analysis of sets of aligned RNA sequences is relatively successful in elucidating RNA secondary structure, as well as some aspects of tertiary structure. The goals of this paper are to (1) present the problem, (2) develop the mathematical formalism for solving it, and (3) validate the resulting algorithms on simulated data. Extensive application to biological sequences will be presented elsewhere.