Open Access
The pan‐genome of the cultivated soybean (PanSoy) reveals an extraordinarily conserved gene content
Author(s) - Torkamaneh Davoud, Lemay Marc‐André, Belzile François
Publication year - 2021
Publication title - Plant Biotechnology Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.525
H-Index - 115
eISSN - 1467-7652
pISSN - 1467-7644
DOI - 10.1111/pbi.13600
Subject(s) - biology, genome, gene, genetics, structural variation, genome size, reference genome, genomics, comparative genomics, glycine soja, glycine, amino acid
Summary - Studies on structural variation in plants have revealed the inadequacy of a single reference genome for an entire species and suggest that it is necessary to build a species‐representative genome called a pan‐genome to better capture the extent of both structural and nucleotide variation. Here, we present a pan‐genome of cultivated soybean (Glycine max), termed PanSoy, constructed using the de novo genome assembly of 204 phylogenetically and geographically representative improved accessions selected from the larger GmHapMap collection. PanSoy uncovers 108 Mb (~11%) of novel nonreference sequences encompassing 3621 protein‐coding genes (including 1659 novel genes) absent from the soybean ‘Williams 82’ reference genome. Nonetheless, the core genome represents an exceptionally large proportion of the genome, with >90.6% of genes being shared by >99% of the accessions. A majority of presence/absence variants (PAVs) encompassing genes could be confirmed with long‐read sequencing on a subset of accessions. PanSoy is a major step towards capturing the extent of genetic variation in cultivated soybean and provides a resource for soybean genomics research and breeding.
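
The core-genome criterion stated in the summary (genes shared by >99% of the accessions) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `classify_genes`, the label "dispensable", the gene names, and the toy presence/absence matrix are all hypothetical, and only the 99% threshold and the count of 204 accessions come from the summary.

```python
from typing import Dict, List


def classify_genes(pav: Dict[str, List[int]],
                   core_threshold: float = 0.99) -> Dict[str, str]:
    """Label a gene 'core' if it is present in more than `core_threshold`
    of the accessions, otherwise 'dispensable' (hypothetical label)."""
    labels = {}
    for gene, calls in pav.items():
        # calls holds 1 (present) or 0 (absent), one entry per accession
        frequency = sum(calls) / len(calls)
        labels[gene] = "core" if frequency > core_threshold else "dispensable"
    return labels


# Toy presence/absence matrix for 204 accessions (invented data).
N_ACCESSIONS = 204
toy_pav = {
    "geneA": [1] * N_ACCESSIONS,          # present in every accession
    "geneB": [1] * 203 + [0] * 1,         # 203/204 is above 99%
    "geneC": [1] * 150 + [0] * 54,        # absent from many accessions
}

labels = classify_genes(toy_pav)
core_fraction = sum(label == "core" for label in labels.values()) / len(labels)
print(labels)
print(f"core fraction: {core_fraction:.3f}")
```

Applied to a real gene-by-accession matrix, the same frequency calculation would give the share of genes that count as core under this threshold.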
