Large-scale inference of population structure in presence of missingness using PCA
Author(s) - Jonas Meisner, Siyang Liu, Mingxi Huang, Anders Albrechtsen
Publication year - 2021
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btab027
Subject(s) - missing data, python (programming language), inference, population, computer science, principal component analysis, data mining, scale (ratio), statistics, artificial intelligence, machine learning, mathematics, cartography, demography, sociology, operating system, geography
Principal component analysis (PCA) is a commonly used tool in genetics to capture and visualize population structure. Due to technological advances in sequencing, such as the widely used non-invasive prenatal test, massive datasets of ultra-low coverage sequencing are being generated. These datasets are characterized by having a large amount of missing genotype information.