Handling biological sequences in R with the bioseq package
Author(s) - Keck François
Publication year - 2020
Publication title - Methods in Ecology and Evolution
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.425
H-Index - 105
ISSN - 2041-210X
DOI - 10.1111/2041-210x.13490
Subject(s) - R package, biological data, computer science, software package, sequence (biology), simple (philosophy), software, biological database, DNA sequencing, computational biology, theoretical computer science, data mining, biology, programming language, bioinformatics, DNA, genetics, philosophy, epistemology
With the democratization of molecular biology, more and more ecologists are required to analyse complex datasets that include biological sequences. I present the R package bioseq, a comprehensive toolset to handle biological sequences in R. The package implements three classes to work with DNA, RNA and amino acid sequences. Biological sequences are stored as simple vectors of character strings with restricted alphabets. This storage mode makes it easy to work with data frames. The package includes a collection of functions to perform basic editing operations on sequences and biological conversion among classes. The package can read data from, and convert data to, several formats. Thus, users can benefit from the richness of the R environment to perform advanced biological sequence analysis. The bioseq package is free software intended to be of significant value to researchers who need a simple and solid framework to analyse biological sequence data in R.
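The storage idea described in the abstract, sequences held as plain character strings that are checked against a restricted alphabet, can be sketched in a few lines. The example below is only a language-neutral illustration written in Python; it is not the bioseq API (bioseq is an R package), and every name in it (`dna`, `reverse_complement`, `transcribe`, `DNA_ALPHABET`) is hypothetical. It assumes the IUPAC nucleotide codes plus a gap character as the DNA alphabet, and it treats transcription as a simple T-to-U substitution.

```python
# Illustrative sketch only: all names here are hypothetical and are NOT
# the bioseq API. The alphabet is assumed to be the IUPAC nucleotide
# codes plus the gap character "-".
DNA_ALPHABET = set("ACGTRYSWKMBDHVN-")

# Complement table covering the ambiguity codes (e.g. R <-> Y, K <-> M).
_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN-", "TGCAYRSWMKVHDBN-")


def dna(sequences):
    """Upper-case each string and reject characters outside the alphabet."""
    checked = []
    for seq in sequences:
        seq = seq.upper()
        invalid = set(seq) - DNA_ALPHABET
        if invalid:
            raise ValueError(f"invalid DNA characters: {sorted(invalid)}")
        checked.append(seq)
    return checked


def reverse_complement(sequences):
    """Complement every base, then reverse each sequence."""
    return [seq.translate(_COMPLEMENT)[::-1] for seq in sequences]


def transcribe(sequences):
    """Convert DNA strings to RNA strings by replacing T with U."""
    return [seq.replace("T", "U") for seq in sequences]


# Because the sequences are ordinary strings, they sit naturally in a
# column of a table next to other variables.
seqs = dna(["ACCTAG", "ggtatatacc", "AGTC"])
table = [
    {"id": f"Seq_{i}", "sequence": s, "length": len(s)}
    for i, s in enumerate(seqs, start=1)
]
```

Keeping each sequence as an ordinary string is what lets it share a table with other per-sample variables, which is the convenience the abstract attributes to this storage mode.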