TSSi—an R package for transcription start site identification from 5′ mRNA tag data
Author(s) -
Clemens Kreutz,
Julian Gehring,
Daniel Lang,
Ralf Reski,
Jens Timmer,
Stefan A. Rensing
Publication year - 2012
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/bts189
Subject(s) - identification (biology) , transcription (linguistics) , computer science , messenger rna , r package , computational biology , software package , software , biology , genetics , gene , operating system , programming language , botany , linguistics , philosophy
High-throughput sequencing has become an essential experimental approach for investigating transcriptional mechanisms. For some applications, such as ChIP-seq, several approaches for predicting peak locations exist. However, these methods are not designed for identifying transcription start sites (TSSs), because such datasets contain qualitatively different noise. This application note presents the R package TSSi, which provides a heuristic framework for identifying TSSs from 5′ mRNA tag data. The framework makes probabilistic assumptions about the distribution of the data, i.e. the observed positions of the mapped reads, and about systematic errors, i.e. reads that map close to, but not exactly at, a real TSS; the user can adapt both sets of assumptions. The framework also includes a regularization procedure that can be applied as a preprocessing step to decrease noise and thereby reduce the number of false predictions.
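The two ideas in the abstract, a regularization step before prediction and an allowance for reads that map near a real TSS, can be illustrated with a toy sketch. This is not the TSSi algorithm or its API: the function names, the soft-threshold shrinkage, the exponential spill-over model, and all parameter values (`lam`, `decay`, `frac`, `min_reads`) are assumptions chosen only to make the idea concrete, and the read counts are invented.

```python
import math

def soft_threshold(counts, lam=1.0):
    """Toy regularization (an assumption, not TSSi's procedure):
    shrink every read count toward zero by lam."""
    return {pos: max(c - lam, 0.0) for pos, c in counts.items()}

def call_start_sites(counts, decay=5.0, frac=0.5, min_reads=1.5):
    """Greedy toy caller. Positions are visited by decreasing count.
    A position is accepted only if its count exceeds, by at least
    min_reads, the reads that already accepted sites are assumed to
    spill onto it (exponential decay with distance)."""
    sites = []
    for pos, c in sorted(counts.items(), key=lambda kv: -kv[1]):
        expected = sum(
            frac * counts[s] * math.exp(-abs(pos - s) / decay)
            for s in sites
        )
        if c - expected >= min_reads:
            sites.append(pos)
    return sorted(sites)

# Invented counts: position -> number of 5' tags mapped there.
raw = {100: 40, 101: 6, 103: 3, 150: 2, 200: 12, 202: 2}

sites_raw = call_start_sites(raw)                  # no preprocessing
sites_reg = call_start_sites(soft_threshold(raw))  # with shrinkage
print(sites_raw, sites_reg)
```

In this toy data, the reads at positions 101, 103, and 202 are attributed to the neighbouring strong sites in both runs, while the isolated low count at position 150 is called only when the shrinkage step is skipped. That mirrors, in a very simplified way, how a preprocessing step can reduce false predictions.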