Self‐contained subsets method for estimation of gene frequencies of truncated genetic data
Author(s) - Tai, John J.
Publication year - 1997
Publication title - Genetic Epidemiology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.301
H-Index - 98
eISSN - 1098-2272
pISSN - 0741-0395
DOI - 10.1002/(sici)1098-2272(1997)14:5<465::aid-gepi2>3.0.co;2-0
Subject(s) - subdivision, estimation, mathematics, data set, computation, statistics, set (abstract data type), variance (accounting), algorithm, computer science, economics, business, management, archaeology, accounting, history, programming language
The self‐contained subsets method subdivides a genetic data set into a number of subsets from which estimates are computed. The advantage of this method is that when a subset is suspected of containing unreliable data, discarding that subset does not invalidate the remaining subsets for estimation. Thus, the complicated computation required to deal with truncated data can be avoided. In this paper, using estimation of gene frequencies in the one‐subset case as an example, the marginal distribution of a subset total is derived and then used to calculate the variance of the gene frequency estimate obtained from the subdivision method. The subdivision method is also applied to impute the number of people in a truncated group. Finally, more complex cases in which multiple subsets are available for estimation are discussed. Results are compared with those of previous studies. Genet. Epidemiol. 14:465–477, 1997. © 1997 Wiley‐Liss, Inc.
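The general idea of estimating from a subset and then imputing a truncated group can be illustrated with a small sketch. This is not the paper's derivation: it assumes a hypothetical two-allele codominant locus in Hardy-Weinberg equilibrium, where genotypes AA and AB are observed and BB is truncated (unobserved). The function names and the example counts are invented for illustration. Within the observed subset, the share of AA is r = p / (2 - p), so the frequency of allele A can be recovered as p = 2r / (1 + r).

```python
def estimate_p_from_subset(n_aa: int, n_ab: int) -> float:
    """Estimate the frequency p of allele A from the subset {AA, AB} only.

    Assumption (not from the source): Hardy-Weinberg proportions
    p^2 : 2pq : q^2 for AA : AB : BB, with the BB class truncated.
    """
    if n_aa < 0 or n_ab < 0 or n_aa + n_ab == 0:
        raise ValueError("need non-negative counts with a non-empty subset")
    # Share of AA among observed individuals: r = p^2 / (p^2 + 2pq) = p / (2 - p)
    r = n_aa / (n_aa + n_ab)
    # Invert r = p / (2 - p) to get p
    return 2.0 * r / (1.0 + r)


def impute_truncated_count(n_aa: int, n_ab: int) -> float:
    """Impute the expected size of the truncated BB group.

    Under the same assumption, the observed classes make up a fraction
    1 - q^2 of the population, so the expected BB count is
    n_obs * q^2 / (1 - q^2).
    """
    p = estimate_p_from_subset(n_aa, n_ab)
    if p == 0.0:
        raise ValueError("no AA individuals observed; BB count is not identifiable")
    q = 1.0 - p
    n_obs = n_aa + n_ab
    return n_obs * q * q / (1.0 - q * q)


if __name__ == "__main__":
    # Invented counts for illustration only
    print(estimate_p_from_subset(25, 50))
    print(impute_truncated_count(25, 50))
```

With the invented counts 25 AA and 50 AB, the sketch gives an allele frequency of about 0.5 and an imputed truncated group of about 25, which matches the 25 : 50 : 25 split expected at p = 0.5. The variance calculation and the multiple-subset cases described in the abstract are not covered by this sketch.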