
Intragenomic Profiling Using Multicopy Genes: The rDNA Internal Transcribed Spacer Sequences of the Freshwater Sponge Ephydatia fluviatilis
Author(s) - Liisi Karlep, Tõnu Reintamm, Merike Kelve
Publication year - 2013
Publication title - PLOS ONE
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.99
H-Index - 332
ISSN - 1932-6203
DOI - 10.1371/journal.pone.0066601
Subject(s) - amplicon, biology, genetics, internal transcribed spacer, genome, gene, concerted evolution, DNA sequencing, GC content, ribosomal RNA, computational biology, whole genome sequencing, deep sequencing, polymerase chain reaction
Multicopy genes, such as ribosomal RNA genes (rDNA), are widely used to describe and distinguish individuals. Although concerted evolution homogenizes the many rDNA copies in a genome, different gene variants have been reported within single genomes. Characterizing an organism by defining every variant among the tens to thousands of rDNA repeat units present in a eukaryotic genome would be impractical. Here we provide an alternative approach, based on direct sequencing, for characterizing the set of internal transcribed spacer sequences found within every rDNA repeat unit. The prominent allelic variants and their relative amounts in an individual can be described from a single sequencing electropherogram of the mixed amplicon that contains the variants present within the genome. We propose a method for the rational analysis of heterogeneity in multicopy genes: compiling a profile based on quantification of the different sequence variants, using the internal transcribed spacers of the freshwater sponge Ephydatia fluviatilis as an example. In addition to conventional substitution analysis, we have developed a mathematical method, the proportion model method, to quantify the relative amounts of allelic variants of different length using data from direct sequencing of the heterogeneous amplicon. The method determines the expected signal intensity values (corresponding to peak heights in the sequencing electropherogram) by sequencing clones from the same or a highly similar amplicon, and then compares hypothesized combinations of variants against the values obtained by direct sequencing of the heterogeneous amplicon. This method allowed us to differentiate between all specimens analysed.
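The idea of comparing hypothesized combinations against the directly sequenced signal can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the model, so the sketch assumes only two variants, idealized one-hot clone signals, a least-squares fit criterion, and a simple grid search over proportions. All function names (`one_hot`, `mix`, `squared_error`, `best_proportion`) and the toy sequences are hypothetical.

```python
from typing import List

# One signal row per sequence position: relative peak heights for A, C, G, T.
Signal = List[List[float]]


def one_hot(seq: str) -> Signal:
    """Idealized clone signal: full intensity in the channel of the called base."""
    return [[1.0 if base == channel else 0.0 for channel in "ACGT"] for base in seq]


def mix(variant_a: Signal, variant_b: Signal, p: float) -> Signal:
    """Expected signal if variant_a makes up fraction p and variant_b the rest."""
    return [
        [p * a + (1.0 - p) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(variant_a, variant_b)
    ]


def squared_error(expected: Signal, observed: Signal) -> float:
    """Sum of squared differences over all positions and channels (assumed criterion)."""
    return sum(
        (e - o) ** 2
        for row_e, row_o in zip(expected, observed)
        for e, o in zip(row_e, row_o)
    )


def best_proportion(variant_a: Signal, variant_b: Signal,
                    observed: Signal, steps: int = 100) -> float:
    """Grid search over hypothesized proportions of variant_a; returns the best fit."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda p: squared_error(mix(variant_a, variant_b, p), observed),
    )


# Toy data: the short variant lacks one base, so its downstream peaks are
# shifted by one position relative to the long variant.
long_variant = one_hot("ACGTAC")
short_variant = one_hot("AGTACG")

# Simulated direct-sequencing signal of a 70:30 mixture of the two variants.
observed = mix(long_variant, short_variant, 0.7)
estimate = best_proportion(long_variant, short_variant, observed)
```

In this toy case the length difference is what makes the proportion recoverable: downstream of the deletion the two variants contribute to different channels at the same position, so the relative peak heights carry the mixing ratio. Real electropherogram data would need clone-derived, non-idealized intensities and probably more than two hypothesized variants.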