Effective double‐digest RAD sequencing and genotyping despite large genome size
Author(s) - Gargiulo Roberta, Kull Tiiu, Fay Michael F.
Publication year - 2021
Publication title - Molecular Ecology Resources
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.96
H-Index - 136
eISSN - 1755-0998
pISSN - 1755-098X
DOI - 10.1111/1755-0998.13314
Subject(s) - biology, genome, inference, computational biology, workflow, set (abstract data type), population, dna sequencing, genomics, genotyping, data mining, evolutionary biology, genetics, computer science, genotype, gene, artificial intelligence, database, demography, sociology, programming language
Abstract - Obtaining informative data is the ambition of any genomic project, but in nonmodel species with very large genomes, pursuing such a goal requires surmounting a series of analytical challenges. Double‐digest RAD sequencing is routinely used in nonmodel organisms and offers some control over the volume of data obtained. However, the volume of data recovered is not always an indication of the reliability of data sets, and quality checks are necessary to ensure that true and artefactual information is set apart. In the present study, we aim to fill the gap existing between the known applicability of RAD sequencing methods in plants with large genomes and the use of the retrieved loci for population genetic inference. By analysing two populations of Cypripedium calceolus, a nonmodel orchid species with a large genome size (1C ~ 31.6 Gbp), we provide a complete workflow from library preparation to bioinformatic filtering and inference of genetic diversity and differentiation. We show how filtering strategies to dismiss potentially misleading data need to be explored and adapted to data set‐specific features. Moreover, we suggest that the occurrence of organellar sequences in libraries should not be neglected when planning the experiment and analysing the results. Finally, we explain how, in the absence of prior information about the genome of the species, seeking high standards of quality during library preparation and sequencing can provide an insurance against unpredicted technical or biological constraints.