Genomic Data and Disease Forecasting: Application to Type 2 Diabetes (T2D)
Author(s) - Lawrence Sirovich
Publication year - 2014
Publication title - PLOS ONE
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.99
H-Index - 332
ISSN - 1932-6203
DOI - 10.1371/journal.pone.0085684
Subject(s) - classifier (uml) , computational biology , disease , single nucleotide polymorphism , biology , genetics , computer science , allele , bioinformatics , artificial intelligence , genotype , medicine , gene , pathology
A general approach is presented for extracting a classifier of disease risk that is latent in large-scale disease/control databases. The novel features are the following: (1) a reorganization of the data into a regularized standard form that emphasizes individual alleles instead of the single nucleotide polymorphism (SNP) allele pair to which they belong; (2) a procedure, derived from this form, that significantly enhances the discovery of high-value genomic loci; (3) an investigative analysis based on the hypothesis that disease represents a very small signal (low signal-to-noise ratio) that is latent in the data. The resulting analyses, applied to the FUSION T2D database, lead to the polling of thousands of genomic loci to classify disease. This large genomic kernel of loci is shared by non-diabetics at nearly the same high level, but a small, well-defined separation exists, and it is speculated that this might be due to unconventional disease mechanisms. Another analysis demonstrates that the size of the FUSION database limits its disease predictability, and only one-third of the resulting classifier loci are estimated to relate to T2D. The remainder is associated with hidden features that might contrast the disease and control populations and that more data would eliminate.
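
As a rough illustration of what treating individual alleles, rather than SNP allele pairs, as the unit of analysis can look like, the following minimal Python sketch counts alleles one at a time and compares an allele's frequency between a disease group and a control group. It is not the paper's regularized standard form or its classifier, which the abstract does not specify. The function names, the two-letter genotype strings, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def allele_counts(genotypes):
    """Count individual alleles in a list of two-letter genotype strings.

    Assumption: each genotype is written as two allele letters, e.g. "AG".
    """
    counts = Counter()
    for genotype in genotypes:
        if len(genotype) != 2:
            raise ValueError(f"expected a two-allele genotype, got {genotype!r}")
        counts.update(genotype)  # adds each of the two alleles separately
    return counts


def allele_frequency_gap(case_genotypes, control_genotypes, allele):
    """Return freq(allele | cases) - freq(allele | controls) at one locus."""
    case = allele_counts(case_genotypes)
    control = allele_counts(control_genotypes)
    case_total = sum(case.values())
    control_total = sum(control.values())
    return case[allele] / case_total - control[allele] / control_total


# Toy, made-up genotypes at a single hypothetical locus.
cases = ["AG", "GG"]
controls = ["AA", "AG"]
gap = allele_frequency_gap(cases, controls, "A")
print(gap)
```

A per-locus gap of this kind is only one simple statistic. Polling thousands of loci, as the abstract describes, would require combining many such weak signals, and how that is done is not stated here.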
