Machine learning approaches to identify core and dispensable genes in pangenomes
Author(s) - Yocca, Alan E.; Edger, Patrick P.
Publication year - 2022
Publication title - The Plant Genome
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.403
H-Index - 41
ISSN - 1940-3372
DOI - 10.1002/tpg2.20135
Subject(s) - biology, genome, gene, brachypodium distachyon, computational biology, genetics, oryza sativa, brachypodium, leverage (statistics), artificial intelligence, computer science
A gene in a given taxonomic group is either present in every individual (core) or absent in at least one individual (dispensable). Previous pangenomic studies have identified certain functional differences between core and dispensable genes. However, determining whether a gene belongs to the core or dispensable portion of the genome requires constructing a pangenome, which involves sequencing the genomes of many individuals. Here we aim to leverage the previously characterized core and dispensable gene content of two grass species [Brachypodium distachyon (L.) P. Beauv. and Oryza sativa L.] to construct a machine learning model capable of accurately classifying genes as core or dispensable using only a single annotated reference genome. Such a model may mitigate the need for pangenome construction, an expensive hurdle especially in orphan crops, which often lack adequate genomic resources.
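
The core/dispensable definition in the abstract can be illustrated with a small sketch. This is not the authors' model or pipeline; it only shows how labels would follow from a gene presence/absence table once a pangenome exists. The function name `classify_genes`, the gene and individual identifiers, and the toy data are all hypothetical.

```python
def classify_genes(presence):
    """Label each gene 'core' or 'dispensable'.

    presence: dict mapping gene ID -> dict mapping individual ID -> bool
              (True if the gene is present in that individual).
    A gene is core if present in every individual, and dispensable
    if absent in at least one individual.
    """
    labels = {}
    for gene, by_individual in presence.items():
        is_core = all(by_individual.values())
        labels[gene] = "core" if is_core else "dispensable"
    return labels


# Hypothetical toy data: three genes scored across three individuals.
presence = {
    "geneA": {"ind1": True, "ind2": True, "ind3": True},
    "geneB": {"ind1": True, "ind2": False, "ind3": True},
    "geneC": {"ind1": False, "ind2": True, "ind3": True},
}

labels = classify_genes(presence)
print(labels)
```

Labels of this kind are what a supervised classifier would be trained to predict; the abstract's goal is to predict them from features of a single annotated reference genome, without needing the presence/absence table.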
