A novel statistical method for modeling covariate effects in bisulfite sequencing derived measures of DNA methylation
Author(s) - Zhao Kaiqiong, Oualkacha Karim, Lakhal-Chaieb Lajmi, Labbe Aurélie, Klein Kathleen, Ciampi Antonio, Hudson Marie, Colmegna Inés, Pastinen Tomi, Zhang Tieyuan, Daley Denise, Greenwood Celia M.T.
Publication year - 2021
Publication title - Biometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.298
H-Index - 130
eISSN - 1541-0420
pISSN - 0006-341X
DOI - 10.1111/biom.13307
Subject(s) - covariate, DNA methylation, computer science, inference, bisulfite sequencing, confounding, computational biology, data mining, algorithm, statistics, biology, mathematics, artificial intelligence, machine learning, genetics, gene, gene expression
Identifying disease‐associated changes in DNA methylation can help us gain a better understanding of disease etiology. Bisulfite sequencing allows the generation of high‐throughput methylation profiles at single‐base resolution. However, optimally modeling and analyzing these sparse and discrete sequencing data remains challenging because of variable read depth, missing data patterns, long‐range correlations, data errors, and confounding from cell type mixtures. We propose a regression‐based hierarchical model that allows covariate effects to vary smoothly along genomic positions, and we have built a specialized EM algorithm, which explicitly allows for experimental errors and cell type mixtures, to make inference about the smooth covariate effects in the model. Simulations show that the proposed method provides accurate estimates of covariate effects and captures the major underlying methylation patterns with excellent power. We also apply our method to analyze data from rheumatoid arthritis patients and controls. The method has been implemented in the R package SOMNiBUS.
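
To make the data structure described above concrete, the following is a toy simulation sketch, not the SOMNiBUS implementation and not the authors' model specification. It assumes a logit-linear model whose intercept and covariate effect vary smoothly with genomic position, binomial read counts with variable depth, and reads that are misreported at fixed error rates. All function names, curve shapes, and parameter values (including `p0` and `p1`) are hypothetical choices made for illustration.

```python
import math
import random


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def smooth_effect(pos, center, width, height):
    """Hypothetical smooth covariate effect: a Gaussian bump along the genome."""
    return height * math.exp(-0.5 * ((pos - center) / width) ** 2)


def methylation_prob(pos, z):
    """True methylation probability at position `pos` for covariate value `z`.

    Assumed form: logit(pi) = beta0(pos) + beta1(pos) * z, where both
    curves vary smoothly with position (shapes chosen arbitrarily here).
    """
    beta0 = -1.0 + 0.5 * math.sin(pos / 200.0)
    beta1 = smooth_effect(pos, center=500.0, width=100.0, height=1.5)
    return expit(beta0 + beta1 * z)


def observed_prob(pi, p0=0.003, p1=0.9):
    """Probability that a read is *reported* as methylated.

    p0 is an assumed false-positive rate and p1 an assumed true-positive
    rate; both values are illustrative, not taken from the paper.
    """
    return pi * p1 + (1.0 - pi) * p0


def simulate_sample(positions, z, rng, max_depth=20):
    """Simulate (position, read depth, methylated count) for one sample."""
    rows = []
    for pos in positions:
        depth = rng.randint(0, max_depth)  # variable depth; 0 means missing
        q = observed_prob(methylation_prob(pos, z))
        meth = sum(rng.random() < q for _ in range(depth))
        rows.append((pos, depth, meth))
    return rows


if __name__ == "__main__":
    rng = random.Random(1)
    positions = range(0, 1000, 50)
    for label, z in (("control", 0), ("case", 1)):
        print(label, simulate_sample(positions, z, rng)[:5])
```

In this sketch the covariate effect is concentrated around position 500 and vanishes far from it, which is the kind of regional pattern a smoothly varying coefficient is meant to capture. Estimating such curves from the noisy counts is the inference problem the abstract assigns to the EM algorithm; that step is not attempted here.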
