Tree and Hashing Data Structures to Speed up Chemical Searches: Analysis and Experiments

Nasr Ramzi; Kristensen Thomas; Baldi Pierre

Author(s) - Nasr Ramzi, Kristensen Thomas, Baldi Pierre

Publication year - 2011

Publication title - Molecular Informatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.481

H-Index - 68

eISSN - 1868-1751

pISSN - 1868-1743

DOI - 10.1002/minf.201100089

Subject(s) - cheminformatics, computer science, locality sensitive hashing, pruning, chemical space, data mining, chemical database, hash function, tree (set theory), nearest neighbor search, skyline, hash table, theoretical computer science, mathematics, chemistry, computational chemistry, mathematical analysis, biochemistry, computer security, organic chemistry, agronomy, biology, drug discovery

In many large chemoinformatics database systems, molecules are represented by long binary fingerprint vectors whose components record the presence or absence of particular functional groups or combinatorial features. For a given query molecule, one is interested in retrieving all the molecules in the database with a similarity to the query above a certain threshold. Here we describe a method for speeding up chemical searches in these large databases of small molecules by combining previously developed tree and hashing data structures to prune the search space without any false negatives. More importantly, we provide a mathematical analysis that allows one to predict the level of pruning, and validate the quality of the predictions of the method through simulation experiments.
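To make the idea of threshold search with lossless pruning concrete, here is a minimal Python sketch. It is not the combined tree and hashing method of the paper, whose details the abstract does not give. It assumes Tanimoto similarity as the measure (the abstract does not name one) and uses a simple, well-known bit-count bound: for fingerprints with a and b set bits, the Tanimoto similarity cannot exceed min(a, b) / max(a, b). The function names `build_index` and `search` are hypothetical, and fingerprints are stored as Python integers for brevity.

```python
from collections import defaultdict


def popcount(x: int) -> int:
    """Number of set bits in a fingerprint stored as a Python int."""
    return bin(x).count("1")


def tanimoto(a: int, b: int) -> float:
    """Tanimoto similarity |A & B| / |A | B| (assumed measure, see lead-in)."""
    union = popcount(a | b)
    return popcount(a & b) / union if union else 1.0


def build_index(fingerprints):
    """Hypothetical index: bucket fingerprints by their number of set bits."""
    index = defaultdict(list)
    for fp in fingerprints:
        index[popcount(fp)].append(fp)
    return index


def search(index, query, threshold):
    """Return (hits, number of full similarity computations performed)."""
    q = popcount(query)
    hits, compared = [], 0
    for count, bucket in index.items():
        lo, hi = min(q, count), max(q, count)
        # |A & B| <= lo and |A | B| >= hi, so similarity <= lo / hi.
        # A bucket whose bound is below the threshold cannot contain a hit,
        # so skipping it introduces no false negatives.
        if hi and lo / hi < threshold:
            continue
        for fp in bucket:
            compared += 1
            if tanimoto(query, fp) >= threshold:
                hits.append(fp)
    return hits, compared


database = [0b1111, 0b1110, 0b0001, 0b11111111, 0b1011]
hits, compared = search(build_index(database), 0b1111, 0.7)
```

In this toy database, the buckets with one and eight set bits are skipped without computing any similarities, and the result matches an exhaustive scan. The fraction of the database skipped this way is the "level of pruning" that the paper's analysis aims to predict for its own data structures.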
