HyDRA: gene prioritization via hybrid distance-score rank aggregation

Minji Kim; Farzad Farnoud; Olgica Milenković

Open Access

Author(s) - Minji Kim, Farzad Farnoud, Olgica Milenković

Publication year - 2014

Publication title - Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.599

H-Index - 390

eISSN - 1367-4811

pISSN - 1367-4803

DOI - 10.1093/bioinformatics/btu766

Subject(s) - computer science, similarity (geometry), weighting, aggregate (composite), context (archaeology), data mining, consistency (knowledge bases), set (abstract data type), machine learning, quality (philosophy), feature (linguistics), artificial intelligence, biology, philosophy, linguistics, materials science, epistemology, composite material, image (mathematics), radiology, programming language, medicine, paleontology

Gene prioritization refers to a family of computational techniques for inferring disease genes from a set of training genes and carefully chosen similarity criteria. Test genes are scored based on their average similarity to the training set, and the rankings of genes under various similarity criteria are aggregated via statistical methods. The contributions of our work are threefold: (i) based on the realization that there is no unique way to define an optimal aggregate for rankings, we investigate the predictive quality of a number of new aggregation methods and known fusion techniques from machine learning and social choice theory. Within this context, we quantify the influence of the number of training genes and similarity criteria on the diagnostic quality of the aggregate and perform in-depth cross-validation studies; (ii) we propose a new approach to genomic data aggregation, termed HyDRA (Hybrid Distance-score Rank Aggregation), which combines the advantages of score-based and combinatorial aggregation techniques. We also propose incorporating a new top-versus-bottom (TvB) weighting feature into the hybrid schemes. The TvB feature ensures that aggregates are more reliable at the top of the list than at the bottom, since only top candidates are tested experimentally; (iii) we propose an iterative procedure for gene discovery that operates via successive augmentation of the set of training genes with genes discovered in previous rounds, after checking them for consistency.
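
To make the two ingredients named in the abstract more concrete, the following Python sketch shows one simple score-based aggregate (a Borda count) and one simple way to penalize disagreements near the top of a list more heavily than those near the bottom. It is an illustration only and not the HyDRA algorithm: the function names `borda_aggregate` and `top_weighted_disagreement`, the gene labels, and the weight vector are all assumptions introduced here, and the weighting rule is a deliberately simplified stand-in for the TvB feature.

```python
from itertools import combinations


def borda_aggregate(rankings):
    """Score-based aggregation (Borda count).

    Each ranking lists the same genes, best first. A gene at 0-based
    position p in a list of n genes earns n - p points; genes are then
    sorted by total points (ties broken alphabetically).
    """
    genes = rankings[0]
    n = len(genes)
    scores = {g: 0 for g in genes}
    for ranking in rankings:
        for pos, gene in enumerate(ranking):
            scores[gene] += n - pos
    return sorted(genes, key=lambda g: (-scores[g], g))


def top_weighted_disagreement(r1, r2, weights):
    """Pairwise-disagreement distance with position-dependent weights.

    Every pair of genes ordered differently by r1 and r2 contributes
    weights[t], where t is the best (smallest) position that either gene
    occupies in either ranking. Decreasing weights make swaps near the
    top cost more than swaps near the bottom. This rule is a hypothetical
    simplification, not the weighting defined in the paper.
    """
    pos1 = {g: i for i, g in enumerate(r1)}
    pos2 = {g: i for i, g in enumerate(r2)}
    total = 0.0
    for a, b in combinations(r1, 2):  # a precedes b in r1
        if pos2[a] > pos2[b]:  # r2 reverses the pair
            top = min(pos1[a], pos1[b], pos2[a], pos2[b])
            total += weights[top]
    return total


# Three rankings of four hypothetical genes, one per similarity criterion.
rankings = [
    ["G1", "G2", "G3", "G4"],
    ["G2", "G1", "G3", "G4"],
    ["G1", "G3", "G2", "G4"],
]
aggregate = borda_aggregate(rankings)

# Decreasing weights: disagreements at the top matter most.
weights = [1.0, 0.5, 0.25, 0.125]
top_swap = top_weighted_disagreement(
    ["G1", "G2", "G3", "G4"], ["G2", "G1", "G3", "G4"], weights
)
bottom_swap = top_weighted_disagreement(
    ["G1", "G2", "G3", "G4"], ["G1", "G2", "G4", "G3"], weights
)
```

With these inputs the Borda totals are 11, 9, 7 and 3, so `aggregate` is `["G1", "G2", "G3", "G4"]`; swapping the top two genes costs 1.0 while swapping the bottom two costs 0.25. A hybrid scheme in the spirit of the abstract would combine score information with a distance of this kind, but how HyDRA does so is not specified in the text above.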
