Enriching Nanomaterials Omics Data: An Integration Technique to Generate Biological Descriptors
Author(s) -
Tsiliki Georgia,
Nymark Penny,
Kohonen Pekka,
Grafström Roland,
Sarimveis Haralambos
Publication year - 2017
Publication title -
Small Methods
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 4.66
H-Index - 46
ISSN - 2366-9608
DOI - 10.1002/smtd.201700139
Subject(s) - omics, profiling (computer programming), computer science, biological data, data integration, data mining, proteomics, computational biology, machine learning, bioinformatics, biology, gene, biochemistry, operating system
Interest in omics data is growing in toxicology owing to the diverse knowledge they generate, which can improve prediction and dosage profiling for more accurate safety assessment. An integration methodology is presented in which high‐throughput omics data are enriched with biological‐pathway information to produce a novel set of biological (BIO) descriptors, by decomposing the omics data into clusters that are meaningful in terms of both mechanistic interpretation and correlation affinity. A generalized simulated annealing algorithm is employed to estimate the optimal partition of the enriched data and, accordingly, to produce novel descriptors based on gene content similarity. BIO descriptors are characterized by the pathway information fused to the data; they therefore refer to groups of genes with similar biological implications rather than to specific genes, which could vary across studies. The methodology is applied to an extensive proteomics data set, where BIO descriptors prove beneficial for predictive modeling, outperforming the original omics data in prediction accuracy and offering a readily available biological interpretation of the findings.
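The partition-then-summarize idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It swaps in classic Metropolis simulated annealing with geometric cooling for the generalized variant named above, and every specific here is an assumption made for illustration: the toy data, the equal weighting `alpha` of correlation and pathway overlap, the Jaccard measure of gene content similarity, the `threshold` in the objective, and the use of cluster means as descriptors.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def jaccard(a, b):
    """Overlap of two pathway sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_similarity(expr, pathways, alpha=0.5):
    """Blend |correlation| with pathway overlap for every gene pair (alpha is an assumption)."""
    sim = {}
    for g, h in combinations(sorted(expr), 2):
        corr = abs(pearson(expr[g], expr[h]))
        sim[g, h] = alpha * corr + (1 - alpha) * jaccard(pathways[g], pathways[h])
    return sim


def partition_score(labels, sim, threshold=0.5):
    """Reward co-clustered pairs above the threshold, penalize those below it."""
    return sum(s - threshold for (g, h), s in sim.items() if labels[g] == labels[h])


def anneal_partition(expr, pathways, k_max=3, steps=3000, t0=1.0, cooling=0.998, seed=0):
    """Classic simulated annealing over gene-to-cluster labels; returns the best partition seen."""
    rng = random.Random(seed)
    genes = sorted(expr)
    sim = pair_similarity(expr, pathways)
    labels = {g: rng.randrange(k_max) for g in genes}
    current = partition_score(labels, sim)
    best, best_labels = current, dict(labels)
    temp = t0
    for _ in range(steps):
        g = rng.choice(genes)
        old = labels[g]
        labels[g] = rng.randrange(k_max)  # propose moving one gene to a random cluster
        proposed = partition_score(labels, sim)
        delta = proposed - current
        # Always accept improvements; accept worse moves with Metropolis probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = proposed
            if current > best:
                best, best_labels = current, dict(labels)
        else:
            labels[g] = old  # reject: undo the move
        temp *= cooling
    return best_labels


def cluster_descriptors(expr, labels):
    """One descriptor per cluster: the per-sample mean over the cluster's member genes."""
    members = {}
    for g in sorted(expr):
        members.setdefault(labels[g], []).append(expr[g])
    return {c: [sum(col) / len(col) for col in zip(*rows)] for c, rows in members.items()}


# Hypothetical toy data: four genes measured in four samples, two pathways.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 1.0, 3.0, 2.0],
    "g4": [8.0, 2.0, 6.5, 4.0],
}
pathways = {"g1": {"P1"}, "g2": {"P1"}, "g3": {"P2"}, "g4": {"P2"}}

labels = anneal_partition(expr, pathways)
descriptors = cluster_descriptors(expr, labels)
```

In this toy case the genes that share a pathway and correlate strongly end up in the same cluster, and each cluster yields one descriptor per sample. Because a descriptor is tied to a pathway-coherent gene group rather than to individual genes, it can stay comparable across studies that measure slightly different gene sets, which is the property the abstract highlights.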