Casboundary: automated definition of integral Cas cassettes
Author(s) - Victor Alexandre Padilha, Omer S. Alkhnbashi, Tran Van Dinh, Shiraz A. Shah, André C. P. L. F. de Carvalho, Rolf Backofen
Publication year - 2020
Publication title - Bioinformatics
Language(s) - Uncategorized
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btaa984
Subject(s) - computer science , identification (biology) , set (abstract data type) , crispr , genome , computational biology , annotation , data mining , gene , biology , genetics , artificial intelligence , botany , programming language
CRISPR-Cas systems are important systems found in most archaeal and many bacterial genomes, where they provide adaptive immunity against mobile genetic elements. They are encoded by a set of consecutive cas genes, here termed a cassette. Identifying cassette boundaries is key to finding cassettes in the CRISPR research field, and it is often carried out using Hidden Markov Models and manual annotation. In this article, we propose the first method able to define cassette boundaries automatically. In addition, we present a Cas-type predictive model, which the method uses to assign a Cas label from a set of pre-defined Cas types to each gene located within a cassette's boundaries. Furthermore, the proposed method can detect potentially new cas genes and decompose a cassette into its modules.
