VSS: variance-stabilized signals for sequencing-based genomic signals
Author(s) - Faezeh Bayat, Maxwell W. Libbrecht
Publication year - 2021
Publication title - Bioinformatics
Language(s) - Uncategorized
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btab457
Subject(s) - variance (accounting) , computer science , imputation (statistics) , data mining , computational biology , biology , machine learning , missing data , accounting , business
A sequencing-based genomic assay such as ChIP-seq outputs a real-valued signal for each position in the genome that measures the strength of activity at that position. Most genomic signals lack the property of variance stabilization. That is, a difference between 0 and 100 reads usually has a very different statistical importance from a difference between 1000 and 1100 reads. A statistical model such as a negative binomial distribution can account for this pattern, but learning these models is computationally challenging. Therefore, many applications, including imputation and segmentation and genome annotation (SAGA), instead use Gaussian models and apply a transformation such as log or inverse hyperbolic sine (asinh) to stabilize variance.
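The effect of the log and asinh transformations mentioned above can be illustrated with a minimal sketch. It is not the method proposed in the paper. The function names, the pseudocount of 1, and the negative binomial dispersion value `r = 5` are assumptions chosen for illustration. The variance of the transformed signal is estimated with a first-order (delta-method) approximation, not computed exactly.

```python
import numpy as np

def asinh_transform(signal):
    """Inverse hyperbolic sine transform of a read-count signal."""
    return np.arcsinh(np.asarray(signal, dtype=float))

def log_transform(signal, pseudocount=1.0):
    """Log transform; the pseudocount (an assumed value) avoids log(0)."""
    return np.log(np.asarray(signal, dtype=float) + pseudocount)

def nb_variance(mean, r):
    """Variance of a negative binomial with the given mean and dispersion r."""
    mean = np.asarray(mean, dtype=float)
    return mean + mean ** 2 / r

def asinh_variance_approx(mean, r):
    """Delta-method approximation: Var[asinh(X)] ~ Var[X] / (1 + mean^2)."""
    mean = np.asarray(mean, dtype=float)
    return nb_variance(mean, r) / (1.0 + mean ** 2)

# The two differences from the text: 0 vs 100 reads and 1000 vs 1100 reads.
reads = np.array([0, 100, 1000, 1100])
asinh_vals = asinh_transform(reads)
log_vals = log_transform(reads)

asinh_low_gap = asinh_vals[1] - asinh_vals[0]
asinh_high_gap = asinh_vals[3] - asinh_vals[2]
log_low_gap = log_vals[1] - log_vals[0]
log_high_gap = log_vals[3] - log_vals[2]

# Raw versus approximately stabilized variance at two signal levels.
r = 5.0  # hypothetical dispersion parameter
means = np.array([100.0, 1000.0])
raw_var = nb_variance(means, r)
stabilized_var = asinh_variance_approx(means, r)

print("asinh gaps:", asinh_low_gap, asinh_high_gap)
print("log gaps:", log_low_gap, log_high_gap)
print("raw variance:", raw_var)
print("approx. variance after asinh:", stabilized_var)
```

Under these assumptions, the same 100-read difference maps to a large gap near zero and a small gap at high coverage, while the approximate variance after asinh is nearly equal at the two signal levels even though the raw variances differ by almost two orders of magnitude.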