Open Access
European American Stratification in Ovarian Cancer Case Control Data: The Utility of Genome-Wide Data for Inferring Ancestry
Author(s) - Paola Raska, Edwin S. Iversen, Ann Chen, Zhihua Chen, Brooke L. Fridley, Jennifer Permuth-Wey, Ya Yu Tsai, Robert A. Vierkant, Ellen L. Goode, Harvey A. Risch, Joellen M. Schildkraut, Thomas A. Sellers, Jill S. Barnholtz-Sloan
Publication year - 2012
Publication title - PLoS ONE
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.99
H-Index - 332
ISSN - 1932-6203
DOI - 10.1371/journal.pone.0035235
Subject(s) - population stratification, biology, genome, genome-wide association study, ancestry informative marker, genetics, population, 1000 genomes project, genetic variation, evolutionary biology, computational biology, allele frequency, gene, single nucleotide polymorphism, allele, genotype, demography, sociology
We investigated the ability of several principal components analysis (PCA)-based strategies to detect and control for population stratification using data from a multi-center study of epithelial ovarian cancer among women of European-American ethnicity. These included a correction based on an ancestry informative markers (AIMs) panel designed to capture European ancestral variation and corrections utilizing un-thinned genome-wide SNP data. Case-control samples were drawn from four geographically distinct North American sites. The AIMs-only and genome-wide first principal components (PC1) both corresponded to the previously described North or Northwest-Southeast axis of European variation. We found that the genome-wide PCA captured this primary dimension of variation more precisely and identified additional axes of genome-wide variation of relevance to epithelial ovarian cancer. Associations evident between the genome-wide PCs and study site corroborate North American immigration history and suggest that undiscovered dimensions of variation lie within Northern Europe. The structure captured by the genome-wide PCA was also found within control individuals and did not reflect the case-control variation present in the data. The genome-wide PCA highlighted three regions of local linkage disequilibrium (LD), corresponding to the lactase (LCT) gene on chromosome 2, the human leukocyte antigen (HLA) system on chromosome 6, and a common inversion polymorphism on chromosome 8. These features did not compromise the efficacy of PCs from this analysis for ancestry control. We conclude that although AIMs panels are a cost-effective way of capturing population structure, genome-wide data should preferably be used when available.
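
To make the general idea of PCA-based ancestry inference concrete, the following is a minimal sketch and not the pipeline used in the study. It assumes genotypes coded as 0/1/2 minor-allele counts, standardizes each SNP by its estimated allele frequency (a common convention, assumed here), and takes principal components from a singular value decomposition. The function names, the two-population simulation, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def genotype_pcs(genotypes, n_components=2):
    """Top principal components of an (individuals x SNPs) 0/1/2 genotype matrix."""
    g = np.asarray(genotypes, dtype=float)
    freq = g.mean(axis=0) / 2.0            # estimated allele frequency per SNP
    keep = (freq > 0) & (freq < 1)         # drop monomorphic SNPs (zero variance)
    g, freq = g[:, keep], freq[keep]
    # Center each SNP and scale by its expected binomial standard deviation.
    z = (g - 2 * freq) / np.sqrt(2 * freq * (1 - freq))
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_components] * s[:n_components]   # per-individual PC scores


def simulate(n_per_pop=100, n_snps=2000, seed=0):
    """Hypothetical toy data: two populations with slightly diverged allele frequencies."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(0.1, 0.9, n_snps)
    shift = rng.normal(0.0, 0.05, n_snps)  # assumed magnitude of frequency divergence
    p_a = np.clip(base + shift, 0.01, 0.99)
    p_b = np.clip(base - shift, 0.01, 0.99)
    geno = np.vstack([
        rng.binomial(2, p_a, (n_per_pop, n_snps)),
        rng.binomial(2, p_b, (n_per_pop, n_snps)),
    ])
    labels = np.repeat([0, 1], n_per_pop)  # population membership, used only for checking
    return geno, labels


if __name__ == "__main__":
    geno, labels = simulate()
    pcs = genotype_pcs(geno)
    r = np.corrcoef(pcs[:, 0], labels)[0, 1]
    print(f"|correlation| between PC1 and simulated population label: {abs(r):.2f}")
```

In an association analysis, scores such as these would typically be included as covariates in the case-control regression model. The sign of each component is arbitrary, so only the absolute correlation with the simulated labels is meaningful.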
