
cgpCaVEManWrapper: Simple Execution of CaVEMan in Order to Detect Somatic Single Nucleotide Variants in NGS Data
Author(s) -
Jones David,
Raine Keiran M.,
Davies Helen,
Tarpey Patrick S.,
Butler Adam P.,
Teague Jon W.,
Nik‐Zainal Serena,
Campbell Peter J.
Publication year - 2016
Publication title -
Current Protocols in Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.535
H-Index - 57
eISSN - 1934-340X
pISSN - 1934-3396
DOI - 10.1002/cpbi.20
Subject(s) - computer science , set (abstract data type) , simple (philosophy) , probabilistic logic , sample (material) , substitution (logic) , somatic cell , algorithm , computational biology , data mining , genetics , biology , artificial intelligence , gene , programming language , chemistry , chromatography , philosophy , epistemology
CaVEMan is an expectation maximization–based somatic substitution‐detection algorithm written in C. The algorithm analyzes sequence data from a test sample, such as a tumor, relative to a reference normal sample from the same patient and to the reference genome. It compares the tumor and normal samples to derive a probabilistic estimate for putative somatic substitutions. When combined with a set of validated post‐hoc filters, CaVEMan generates a set of somatic substitution calls with high recall and positive predictive value. Here we provide instructions for using a wrapper script called cgpCaVEManWrapper, which runs the CaVEMan algorithm and the additional downstream post‐hoc filters. We describe both a simple one‐shot run of cgpCaVEManWrapper and a more in‐depth implementation suited to large‐scale compute farms. © 2016 by John Wiley & Sons, Inc.