Nucleic Acids Research
Author(s) - Pasquale De Santis
Publication year - 2014
Publication title - Nucleic Acids Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 9.008
H-Index - 537
eISSN - 1362-4954
pISSN - 0305-1048
DOI - 10.1093/nar/gku006
Subject(s) - biology , nucleic acid , computational biology , dna , biochemistry , genetics
We present a computer method to determine nucleic acid secondary structures. It is based on three steps: 1) a search for all possible helical regions, relying on a mathematical approach derived from the convolution theorem and using a four-dimensional complex vector representation of the bases along the sequence; 2) a 'tree' search for a set of minimum free energy structures, with the aid of an approximate energy evaluation to reduce the computer time requirements; 3) the exact calculation and refinement of the energies. A method is shown for introducing experimental data and reaching a compromise between them and the free energy minimization criterion. To demonstrate the reliability of the program, a test on four RNA sequences is performed. The method has a computer time requirement proportional to N, where N is the length of the sequence, and retrieves a set of optimal free energy structures.

INTRODUCTION

The importance of the role of secondary and tertiary RNA structure in biological processes has grown in recent years (1-3). It is now generally assumed that the primary sequence carries the information required for its actual three-dimensional folding. Predicting secondary structure first and then proceeding to tertiary structure can be expected to be a fruitful, if not infallible, approach. With the increase in length and number of determined nucleic acid sequences, there has been a growing need for algorithms that can efficiently search for the more probable secondary structures. Several methods have been developed to this end by minimizing the free energy (4-17). Some of these methods predict only one optimal free energy structure for each nucleotide sequence. Alternative equivalent and suboptimal free energy structures are not identified despite their possible biological significance. The methods capable of identifying more than one optimal secondary structure generally require user intervention and/or external constraints (7,11,13).
The computer programs based on all these methods generally require a computational time proportional to N and memory of the order of N, where N is the number of nucleotides in the sequence. Experimental (enzymatic, chemical) data, if considered, are generally introduced in the programs as constraints. Here we present an algorithm able to select a set of optimal free energy structures, with a computer time requirement proportional to N, and with the possibility of introducing a gradual competition between the experimental data and the free energy content of the structure.
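The first step named in the abstract, a helix search derived from the convolution theorem, can be illustrated with a minimal sketch. It is not the authors' implementation: the encoding used here (a real one-hot vector over A, C, G, U, a special case of a four-dimensional complex vector), the restriction to Watson-Crick pairs, and the function names `fft`, `antidiagonal_scores` and `antidiagonal_scores_direct` are all assumptions made for illustration. The idea is that bases i and j of an antiparallel helix share a constant sum i + j, so the number of complementary pairs on each anti-diagonal is a convolution of base-indicator sequences, which a Fourier transform evaluates for all anti-diagonals at once.

```python
import cmath

BASES = "ACGU"
# Assumption: Watson-Crick pairs only (G-U wobble pairs are left out).
PARTNER = {"A": "U", "U": "A", "C": "G", "G": "C"}


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two.

    With invert=True the result is unnormalized (caller divides by len(a)).
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def antidiagonal_scores(seq):
    """Count complementary pairs (i < j) for every anti-diagonal s = i + j."""
    n = len(seq)
    size = 1
    while size < 2 * n - 1:  # zero-pad so the circular convolution is linear
        size *= 2
    total = [0.0] * size
    for base in BASES:
        # Indicator of `base` and indicator of its pairing partner.
        x = [1.0 if b == base else 0.0 for b in seq] + [0.0] * (size - n)
        y = [1.0 if b == PARTNER[base] else 0.0 for b in seq] + [0.0] * (size - n)
        # Convolution theorem: transform, multiply pointwise, transform back.
        product = [p * q for p, q in zip(fft(x), fft(y))]
        conv = fft(product, invert=True)
        for s in range(size):
            total[s] += conv[s].real / size
    # Each pair is seen twice, as (i, j) and as (j, i), hence the halving.
    return [round(v) // 2 for v in total[: 2 * n - 1]]


def antidiagonal_scores_direct(seq):
    """Same counts by direct enumeration of all pairs, used as a cross-check."""
    n = len(seq)
    scores = [0] * (2 * n - 1)
    for i in range(n):
        for j in range(i + 1, n):
            if PARTNER[seq[i]] == seq[j]:
                scores[i + j] += 1
    return scores


if __name__ == "__main__":
    hairpin = "GGGAAACCC"
    print(antidiagonal_scores(hairpin))
```

For the toy hairpin `GGGAAACCC` the highest count falls on the anti-diagonal i + j = 8, which holds the three G-C pairs (0, 8), (1, 7) and (2, 6). A count only flags a candidate anti-diagonal; extracting contiguous helical runs and evaluating their energies would still be needed, as in steps 2 and 3 of the method.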