Accurate SHAPE‐directed RNA structure prediction
Author(s) - Deigan Katherine E, Li Tian W, Mathews David H, Weeks Kevin M
Publication year - 2009
Publication title - The FASEB Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.709
H-Index - 277
eISSN - 1530-6860
pISSN - 0892-6638
DOI - 10.1096/fasebj.23.1_supplement.843.2
Subject(s) - protein secondary structure, rna, nucleic acid structure, nucleic acid secondary structure, computational biology, sequence (biology), base pair, algorithm, computer science, biology, topology (electrical circuits), biological system, genetics, gene, mathematics, combinatorics, biochemistry
Almost all RNAs can fold to form extensive secondary structures. Many of these structures then modulate numerous elements of gene expression. Deducing these structure‐function relationships requires that it be possible to predict RNA secondary structure accurately. However, RNA secondary structure prediction for large RNAs, such that a single predicted structure reliably represents the correct structure, has remained an unsolved problem. Here we demonstrate that quantitative, nucleotide‐resolution information from a SHAPE experiment can be interpreted as a pseudo‐free energy change term and used to determine RNA secondary structure with high accuracy. We use three metrics to evaluate the prediction accuracy for E. coli 16S rRNA (1542 nts). Taking the structure determined by comparative sequence analysis as the standard, we correctly predict 90% of all phylogenetically supported base pairs. Allowing for experimentally supported local refolding relative to the phylogenetic structure, the prediction accuracy is 95%. As judged by the ability to identify helices of 3 base pairs or greater, and thus the overall topology of the RNA, the prediction accuracy is again 95%. This work demonstrates that, given sufficient quantitative in‐solution information, it is possible to predict the structure of an important subset of RNAs with accuracies comparable to those achievable by comparative sequence analysis.
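The two quantitative ideas in the abstract, turning a per-nucleotide SHAPE reactivity into a pseudo-free energy term and scoring a prediction by the fraction of reference base pairs it recovers, can be sketched as follows. This is a minimal illustration and not the authors' implementation. The abstract does not give the functional form of the energy term, so the logarithmic form, the default slope `m` and intercept `b` (in kcal/mol), the convention that negative reactivities mean "no data", the function names, and the toy base pairs are all assumptions.

```python
import math


def shape_pseudo_energy(reactivity, m=2.6, b=-0.8):
    """Map one SHAPE reactivity to a pseudo-free energy change (kcal/mol).

    Assumed form: m * ln(reactivity + 1) + b. The defaults for m and b are
    illustrative assumptions; they are not stated in the abstract.
    Negative reactivities are treated here as "no data" and contribute 0.
    """
    if reactivity < 0:
        return 0.0
    return m * math.log(reactivity + 1.0) + b


def base_pair_sensitivity(predicted, reference):
    """Fraction of reference base pairs that also appear in the prediction.

    Pairs are (i, j) tuples of nucleotide positions; (i, j) and (j, i)
    are treated as the same pair.
    """
    ref = {tuple(sorted(p)) for p in reference}
    pred = {tuple(sorted(p)) for p in predicted}
    if not ref:
        raise ValueError("reference structure has no base pairs")
    return len(ref & pred) / len(ref)


# Toy data (hypothetical, not from the 16S rRNA experiment).
reference_pairs = [(1, 10), (2, 9), (3, 8), (4, 7)]
predicted_pairs = [(10, 1), (2, 9), (3, 8), (5, 6)]

unreactive_bonus = shape_pseudo_energy(0.0)   # favours pairing (negative)
reactive_penalty = shape_pseudo_energy(2.0)   # disfavours pairing (positive)
sensitivity = base_pair_sensitivity(predicted_pairs, reference_pairs)
```

Under these assumptions, an unreactive nucleotide receives a small stabilizing bonus when paired and a highly reactive one receives a penalty, which is how the experimental data would bias a thermodynamic folding algorithm. The 90% figure quoted in the abstract corresponds to the kind of ratio computed by `base_pair_sensitivity`, with the comparative-sequence-analysis structure as the reference.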