Diversity and population structure of northern switchgrass as revealed through exome capture sequencing
Author(s) -
Evans Joseph,
Crisovan Emily,
Barry Kerrie,
Daum Chris,
Jenkins Jerry,
Kunde‐Ramamoorthy Govindarajan,
Nandety Aruna,
Ngan Chew Yee,
Vaillancourt Brieanne,
Wei Chia‐Lin,
Schmutz Jeremy,
Kaeppler Shawn M.,
Casler Michael D.,
Buell Carol Robin
Publication year - 2015
Publication title -
The Plant Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.058
H-Index - 269
eISSN - 1365-313X
pISSN - 0960-7412
DOI - 10.1111/tpj.13041
Subject(s) - ecotype, biology, Panicum virgatum, population, genetic diversity, genetics, single nucleotide polymorphism, polyploid, genome, genotype, gene, evolutionary biology, ecology, bioenergy, demography, sociology, renewable energy
Summary -
Panicum virgatum L. (switchgrass) is a polyploid, perennial grass species that is native to North America and is being developed as a future biofuel feedstock crop. Switchgrass is present primarily in two ecotypes: a northern upland ecotype, composed of tetraploid and octoploid accessions, and a southern lowland ecotype, composed of primarily tetraploid accessions. We employed high‐coverage exome capture sequencing (~2.4 Tb) to genotype 537 individuals from 45 upland and 21 lowland populations. From these data, we identified ~27 million single‐nucleotide polymorphisms (SNPs), of which 1 590 653 high‐confidence SNPs were used in downstream analyses of diversity within and between the populations. From the 66 populations, we identified five primary population groups within the upland and lowland ecotypes, a result that was further supported through genetic distance analysis. We identified conserved, ecotype‐restricted, non‐synonymous SNPs that are predicted to affect the protein function of CONSTANS (CO) and EARLY HEADING DATE 1 (EHD1), key genes involved in flowering, which may contribute to the phenotypic differences between the two ecotypes. We also identified, relative to the near‐reference Kanlow population, 17 228 genes present in more copies than in the reference genome (up‐CNVs), 112 630 genes present in fewer copies than in the reference genome (down‐CNVs) and 14 430 presence/absence variants (PAVs), affecting a total of 9979 genes, including two upland‐specific CNV clusters. In total, 45 719 genes were affected by an SNP, CNV, or PAV across the panel, providing a firm foundation to identify functional variation associated with phenotypic traits of interest for biofuel feedstock production.
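
The headline figures in the summary can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the authors' pipeline. It uses the numbers quoted above, treats the "~" values as exact for the calculation, and assumes (hypothetically) that sequence yield was spread evenly across individuals. The function and constant names are invented for this example.

```python
# Figures quoted in the summary; values marked "~" in the source are approximate.
TOTAL_SEQUENCE_BP = 2.4e12        # ~2.4 Tb of exome capture sequence
INDIVIDUALS = 537                 # genotyped individuals
UPLAND_POPULATIONS = 45
LOWLAND_POPULATIONS = 21
TOTAL_SNPS = 27e6                 # ~27 million SNPs identified
HIGH_CONFIDENCE_SNPS = 1_590_653  # SNPs retained for downstream analyses


def summarize():
    """Derive a few sanity-check quantities from the quoted figures."""
    return {
        # Should match the 66 populations stated in the summary.
        "populations": UPLAND_POPULATIONS + LOWLAND_POPULATIONS,
        # Mean sequence yield per individual in gigabases,
        # assuming an even split across individuals (an assumption).
        "mean_gb_per_individual": TOTAL_SEQUENCE_BP / INDIVIDUALS / 1e9,
        # Share of all identified SNPs that passed as high-confidence.
        "high_confidence_fraction": HIGH_CONFIDENCE_SNPS / TOTAL_SNPS,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.4g}")
```

Under these assumptions, the panel averages roughly 4.5 Gb of sequence per individual, and about 6% of the identified SNPs were retained as high‐confidence.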