Development of highly reliable in silico SNP resource and genotyping assay from exome capture and sequencing: an example from black spruce (Picea mariana)
Author(s) -
Pavy Nathalie,
Gagnon France,
Deschênes Astrid,
Boyle Brian,
Beaulieu Jean,
Bousquet Jean
Publication year - 2016
Publication title -
Molecular Ecology Resources
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.96
H-Index - 136
eISSN - 1755-0998
pISSN - 1755-098X
DOI - 10.1111/1755-0998.12468
Subject(s) - biology, in silico, genotyping, snp genotyping, population, genetics, single nucleotide polymorphism, genomics, exome, computational biology, exome sequencing, population genomics, snp, molecular inversion probe, genome, genotype, gene, mutation, demography, sociology
Abstract - Picea mariana is a boreal conifer widely distributed across Canada and the subject of advanced breeding programmes for which population genomics and genomic selection approaches are being developed. Targeted sequencing was achieved after capturing the P. mariana exome with probes designed from the sequenced transcriptome of Picea glauca, a distant relative. A high capture efficiency of 75.9% was reached, although spruce has a large and complex genome in which gene sequences are interspersed with some long introns. The results confirmed the relevance of using probes from congeneric species to successfully perform interspecific exome capture in the genus Picea. A bioinformatics pipeline with stringent criteria was developed and helped detect a set of 97 075 highly reliable in silico SNPs. These SNPs were distributed across 14 909 genes. Part of an Infinium iSelect array was used to estimate the rate of true positives by validating 4267 of the predicted in silico SNPs through genotyping of trees from P. mariana populations. The true positive rate was 96.2% for in silico SNPs, compared to a genotyping success rate of 96.7% for a set of 1115 P. mariana control SNPs recycled from previous genotyping arrays. These results indicate the high success rate of the genotyping array and the relevance of the selection criteria used to delineate the new P. mariana in silico SNP resource. Furthermore, in silico SNPs were generally of medium to high frequency in natural populations, thus providing high informative value for future population genomics applications.
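The figures in the abstract can be put in perspective with a little arithmetic. The sketch below is illustrative only: the abstract reports rates, not counts, so the SNP counts it derives are approximations implied by those rates rather than values taken from the paper, and the function names `implied_count` and `mean_per_gene` are hypothetical helpers.

```python
# Back-of-envelope arithmetic on the figures quoted in the abstract.
# Assumption: the counts derived here are implied by the reported
# percentages; they are not reported values from the paper.

def implied_count(n_tested: int, rate: float) -> float:
    """Approximate number of successes implied by a reported rate."""
    return n_tested * rate


def mean_per_gene(n_snps: int, n_genes: int) -> float:
    """Average number of SNPs per gene."""
    return n_snps / n_genes


# Reported in the abstract
N_IN_SILICO_TESTED = 4267     # in silico SNPs placed on the array
TRUE_POSITIVE_RATE = 0.962    # true positive rate for in silico SNPs
N_CONTROL = 1115              # control SNPs from previous arrays
CONTROL_SUCCESS_RATE = 0.967  # genotyping success rate for controls
N_SNPS_TOTAL = 97075          # highly reliable in silico SNPs
N_GENES = 14909               # genes carrying those SNPs

validated = implied_count(N_IN_SILICO_TESTED, TRUE_POSITIVE_RATE)
controls_ok = implied_count(N_CONTROL, CONTROL_SUCCESS_RATE)
density = mean_per_gene(N_SNPS_TOTAL, N_GENES)

print(f"Implied validated in silico SNPs: ~{validated:.0f}")   # ~4105
print(f"Implied successful control SNPs:  ~{controls_ok:.0f}")  # ~1078
print(f"Mean in silico SNPs per gene:      {density:.2f}")      # 6.51
```

The half-point gap between the two rates (96.2% versus 96.7%) is what supports the abstract's claim that the in silico predictions performed nearly as well as previously validated control SNPs.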