An Integrative Segmentation Method for Detecting Germline Copy Number Variations in SNP Arrays
Author(s) - Shi Jianxin, Li Peng
Publication year - 2012
Publication title - Genetic Epidemiology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.301
H-Index - 98
eISSN - 1098-2272
pISSN - 0741-0395
DOI - 10.1002/gepi.21631
Subject(s) - International HapMap Project, copy number variation, hidden Markov model, biology, single nucleotide polymorphism, genetics, genotyping, computational biology, structural variation, 1000 Genomes Project, segmentation, SNP genotyping, SNP, genome, computer science, artificial intelligence, gene, genotype
Germline copy number variations (CNVs) are a major source of genetic variation in humans. In large-scale studies of complex diseases, CNVs are usually detected from data generated by single nucleotide polymorphism (SNP) genotyping arrays. In this paper, we develop an integrative segmentation method, SegCNV, that detects CNVs by integrating both the log R ratio (LRR) and the B allele frequency (BAF). In simulation studies, SegCNV had modestly better power to detect deletions and substantially better power to detect duplications than circular binary segmentation (CBS), which relies purely on LRRs; compared with PennCNV and QuantiSNP, it had better power to detect deletions and comparable power to detect duplications. In two HapMap subjects with deep sequence data available as a gold standard, SegCNV detected more true short deletions than PennCNV and QuantiSNP. For 21 short duplications validated experimentally in the AGRE dataset, SegCNV, QuantiSNP, and PennCNV detected all of them, while CBS detected only three. SegCNV is much faster than the hidden Markov model (HMM)-based methods, taking only several seconds to analyze genome-wide data for one subject. Genet. Epidemiol. 36:373–383, 2012. © 2012 Wiley Periodicals, Inc.
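To illustrate why combining LRR and BAF can help, particularly for duplications, here is a minimal Python sketch. It is not the SegCNV algorithm, which the abstract does not describe in detail. It is a toy rule for a single candidate segment: a one-copy deletion tends to show a lowered LRR with no mid-range BAF values, while a one-copy duplication tends to show a raised LRR with heterozygous BAF values shifted from about 1/2 toward roughly 1/3 and 2/3. The function name `classify_segment` and all thresholds are assumptions chosen for illustration only.

```python
from statistics import mean


def classify_segment(lrr, baf, lrr_cut=0.15):
    """Toy call for one candidate segment (hypothetical helper).

    lrr, baf: per-SNP log R ratios and B allele frequencies in the segment.
    lrr_cut and the BAF bands below are illustrative assumptions, not
    values taken from the paper.
    """
    m = mean(lrr)
    # Heterozygous markers at normal copy number: BAF close to 1/2.
    balanced = sum(0.4 <= b <= 0.6 for b in baf)
    # Markers shifted toward 1/3 or 2/3, as expected with three copies.
    skewed = sum(0.25 <= b < 0.4 or 0.6 < b <= 0.75 for b in baf)

    # Low intensity and no mid-range BAF: consistent with a one-copy deletion.
    if m <= -lrr_cut and balanced == 0 and skewed == 0:
        return "deletion"
    # Raised intensity and BAF mostly in the shifted bands: duplication.
    if m >= lrr_cut and skewed > balanced:
        return "duplication"
    return "normal"


if __name__ == "__main__":
    print(classify_segment([-0.5, -0.6, -0.4, -0.55], [0.0, 1.0, 0.02, 0.98]))
    print(classify_segment([0.3, 0.25, 0.35, 0.3], [0.33, 0.67, 0.0, 0.66]))
```

The LRR shift for a duplication is typically smaller than for a deletion, so in this sketch the BAF bands supply evidence that an LRR-only rule would lack. A real method would also have to locate segment boundaries and model noise, which this example does not attempt.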