Open Access
HCMMCNVs: hierarchical clustering mixture model of copy number variants detection using whole exome sequencing technology
Author(s) - Chi Song, ShihChi Su, Zhiguang Huo, Suleyman Vural, James E. Galvin, LunChing Chang
Publication year - 2021
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btab183
Subject(s) - computer science, cluster analysis, mixture model, exome sequencing, copy number variation, hierarchical clustering, software, visualization, exome, expectation–maximization algorithm, data mining, computational biology, genome, artificial intelligence, biology, maximum likelihood, mutation, mathematics, genetics, statistics, gene, programming language
In this article, we introduce a hierarchical clustering and Gaussian mixture model, fitted with the expectation-maximization (EM) algorithm, for detecting copy number variants (CNVs) from whole exome sequencing (WES) data. We also developed the R Shiny package 'HCMMCNVs' for processing user-provided BAM files, running the CNV detection algorithm, and visualizing the results. By applying our approach to 325 cancer cell lines in 22 tumor types from the Cancer Cell Line Encyclopedia (CCLE), we show that our algorithm is competitive with existing methods and that using multiple cancer cell lines for CNV estimation is feasible. In addition, when applied to WES data from 120 oral squamous cell carcinoma (OSCC) samples, our algorithm, using the tumor samples only, exhibits more power in detecting CNVs than methods that use both tumors and matched normal counterparts.
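
To make the mixture-model idea concrete, the following is a minimal sketch of EM for a one-dimensional Gaussian mixture applied to simulated log2 coverage ratios. It is not the HCMMCNVs implementation and it omits the hierarchical clustering step entirely. The function names (`normal_pdf`, `fit_gmm_em`), the three-state setup (loss, neutral, gain), the starting values, and the simulated data are all assumptions chosen for illustration. The component centers use the standard log2 ratios for a diploid baseline: log2(1/2) = -1 for a single-copy loss and log2(3/2) ≈ 0.58 for a single-copy gain.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_gmm_em(values, init_means, init_sd=0.3, n_iter=100, min_sd=1e-3):
    """Fit a 1-D Gaussian mixture by EM (illustrative, hypothetical helper).

    Returns mixture weights, means, standard deviations, and the
    responsibilities computed in the final E-step.
    """
    k = len(init_means)
    means = list(init_means)
    sds = [init_sd] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = []
        for x in values:
            dens = [weights[j] * normal_pdf(x, means[j], sds[j]) for j in range(k)]
            total = sum(dens) or 1e-300  # guard against division by zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and standard deviations.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-8:
                continue  # leave an effectively empty component unchanged
            weights[j] = nj / len(values)
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, values)) / nj
            sds[j] = max(math.sqrt(var), min_sd)  # floor avoids a collapsed component
    return weights, means, sds, resp


# Simulated log2 coverage ratios for exons (assumed data, not from the paper).
rng = random.Random(0)
true_states = [("loss", -1.0, 60), ("neutral", 0.0, 300), ("gain", 0.58, 60)]
log_ratios = []
for _, mean, n in true_states:
    log_ratios += [rng.gauss(mean, 0.1) for _ in range(n)]

weights, means, sds, resp = fit_gmm_em(log_ratios, init_means=[-1.0, 0.0, 0.58])

# Call each exon by its most probable component.
labels = ["loss", "neutral", "gain"]
calls = [labels[max(range(len(labels)), key=lambda j: r[j])] for r in resp]
```

In this sketch each exon is called by its highest-responsibility component. A real pipeline would also need coverage normalization and some way of choosing the number of components, neither of which is shown here.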