Orthology Guided Assembly in highly heterozygous crops: creating a reference transcriptome to uncover genetic diversity in Lolium perenne

Open Access

Author(s) - Ruttink Tom, Sterck Lieven, Rohde Antje, Bendixen Christian, Rouzé Pierre, Asp Torben, Van de Peer Yves, Roldán-Ruiz Isabel

Publication year - 2013

Publication title - Plant Biotechnology Journal

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.525

H-Index - 115

eISSN - 1467-7652

pISSN - 1467-7644

DOI - 10.1111/pbi.12051

Subject(s) - biology, genetics, sequence assembly, de novo transcriptome assembly, contig, gene, reference genome, computational biology, transcriptome, outbreeding depression, genome, gene expression, population, demography, sociology, inbreeding

Summary

Despite current advances in next-generation sequencing data analysis procedures, de novo assembly of a reference sequence required for SNP discovery and expression analysis is still a major challenge in genetically uncharacterized, highly heterozygous species. High levels of polymorphism inherent to outbreeding crop species hamper De Bruijn Graph-based de novo assembly algorithms, causing transcript fragmentation and the redundant assembly of allelic contigs. If multiple genotypes are sequenced to study genetic diversity, primary de novo assembly is best performed per genotype to limit the level of polymorphism and avoid transcript fragmentation. Here, we propose an Orthology Guided Assembly procedure that first uses sequence similarity (tBLASTn) to proteins of a model species to select allelic and fragmented contigs from all genotypes and then performs CAP3 clustering on a gene-by-gene basis. Thus, we simultaneously annotate putative orthologues for each protein of the model species, resolve allelic redundancy and fragmentation and create a de novo transcript sequence representing the consensus of all alleles present in the sequenced genotypes. We demonstrate the procedure using RNA-seq data from 14 genotypes of Lolium perenne to generate a reference transcriptome for gene discovery and translational research, to reveal the transcriptome-wide distribution and density of SNPs in an outbreeding crop and to illustrate the effect of polymorphisms on the assembly procedure. The results presented here illustrate that constructing a non-redundant reference sequence is essential for comparative genomics, orthology-based annotation and candidate gene selection but also for read mapping and subsequent polymorphism discovery and/or read count-based gene expression analysis.
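The selection step described in the summary (collecting contigs from all genotypes per model-species protein before gene-by-gene clustering) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: it assumes tBLASTn hits in the standard 12-column tabular format (proteins as queries, contigs as subjects), assigns each contig to its highest-bitscore protein, and uses an arbitrary E-value cutoff. The function name, the cutoff and all sequence identifiers are hypothetical.

```python
from collections import defaultdict


def group_contigs_by_protein(blast_lines, max_evalue=1e-5):
    """Group contigs under the reference protein that gives their best hit.

    Assumes 12-column tabular BLAST output with the protein as query
    (column 1) and the contig as subject (column 2). The E-value cutoff
    and the best-bitscore rule are assumptions made for this sketch.
    """
    best = {}  # contig id -> (bitscore, protein id)
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        protein, contig = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # discard weak similarity hits
        if contig not in best or bitscore > best[contig][0]:
            best[contig] = (bitscore, protein)

    groups = defaultdict(list)
    for contig, (_, protein) in best.items():
        groups[protein].append(contig)
    return {protein: sorted(contigs) for protein, contigs in groups.items()}


# Hypothetical hits: contigs from two genotypes (g1, g2) against two proteins.
hits = [
    "protA\tg1_c1\t90.0\t100\t10\t0\t1\t100\t1\t300\t1e-50\t200",
    "protA\tg2_c7\t88.0\t100\t12\t0\t1\t100\t1\t300\t1e-45\t180",
    "protB\tg1_c1\t40.0\t50\t30\t0\t1\t50\t1\t150\t1e-6\t60",
    "protB\tg2_c3\t95.0\t80\t4\t0\t1\t80\t1\t240\t1e-40\t170",
    "protB\tg1_c9\t30.0\t40\t28\t0\t1\t40\t1\t120\t0.5\t25",
]
groups = group_contigs_by_protein(hits)
```

Each resulting group holds the allelic and fragmented contigs putatively orthologous to one protein; in the procedure described above, such a per-gene set is what would then be passed to CAP3 clustering to produce a consensus transcript.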
