z-logo
open-access-imgOpen Access
Toward a Dynamic Threshold for Quality Score Distortion in Reference-Based Alignment
Author(s) -
Ana A. Hernandez-Lopez,
Claudio Alberti,
Marco Mattavelli
Publication year - 2019
Publication title -
journal of computational biology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.585
H-Index - 95
eISSN - 1557-8666
pISSN - 1066-5277
DOI - 10.1089/cmb.2019.0333
Subject(s) - distortion (music) , computer science , artificial intelligence , mathematics , telecommunications , amplifier , bandwidth (computing)
The intrinsic high-entropy sequence metadata, known as quality scores, is largely the cause of the substantial size of sequence data files. Yet, there is no consensus on a viable reduction of the resolution of the quality score scale, arguably because of collateral side effects. In this article, we leverage on the penalty functions of HISAT2 aligner to rebin the quality score scale in such a way as to avoid any impact on sequence alignment, identifying alongside a distortion threshold for "safe" quality score representation. We tested our findings on whole-genome and RNA-seq data, and contrasted the results with three methods for lossy compression of the quality scores.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom