Open Access
Continuous Embeddings of DNA Sequencing Reads and Application to Metagenomics
Author(s) - Romain Menegaux, Jean-Philippe Vert
Publication year - 2019
Publication title - Journal of Computational Biology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.585
H-Index - 95
eISSN - 1557-8666
pISSN - 1066-5277
DOI - 10.1089/cmb.2018.0174
Subject(s) - metagenomics, DNA sequencing, scalability, computer science, computational biology, k-mer, DNA, biology, artificial intelligence, genetics, gene, database
We propose a new model for fast classification of DNA sequences output by next-generation sequencing machines. The model, which we call fastDNA, embeds DNA sequences in a vector space by learning continuous low-dimensional representations of the k-mers they contain. We show on metagenomics benchmarks that it outperforms state-of-the-art methods in terms of accuracy and scalability.
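
The idea of embedding a read through its k-mers can be illustrated with a minimal sketch. This is not the fastDNA implementation: the abstract does not say how k-mer vectors are combined or trained, so the averaging step, the linear scoring layer, the random (untrained) parameters, and every name below (`kmer_index`, `embed_read`, `class_scores`, `K`, `DIM`, `N_CLASSES`) are assumptions made for illustration only.

```python
import random

ALPHABET = "ACGT"

def kmer_index(kmer):
    """Map a k-mer over ACGT to an integer in [0, 4**k)."""
    idx = 0
    for base in kmer:
        idx = idx * 4 + ALPHABET.index(base)
    return idx

def kmers(read, k):
    """Yield every k-mer of the read that contains only A, C, G, T."""
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if all(base in ALPHABET for base in kmer):
            yield kmer

def embed_read(read, k, table):
    """Assumed combination rule: average the vectors of the read's k-mers."""
    dim = len(table[0])
    total = [0.0] * dim
    count = 0
    for kmer in kmers(read, k):
        vec = table[kmer_index(kmer)]
        for j in range(dim):
            total[j] += vec[j]
        count += 1
    if count == 0:
        return total  # no valid k-mer: fall back to the zero vector
    return [x / count for x in total]

def class_scores(embedding, weights):
    """Assumed linear layer: one dot product per class."""
    return [sum(w * x for w, x in zip(row, embedding)) for row in weights]

# Hypothetical sizes; a real model would learn both tables from labeled reads.
K, DIM, N_CLASSES = 3, 4, 2
rng = random.Random(0)
table = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(4 ** K)]
weights = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(N_CLASSES)]

read = "ACGTACGTTG"
embedding = embed_read(read, K, table)
scores = class_scores(embedding, weights)
predicted = max(range(N_CLASSES), key=scores.__getitem__)
print(predicted, scores)
```

With untrained random parameters the predicted class is meaningless; the sketch only shows the data flow from read to k-mers to a low-dimensional vector to per-class scores.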
