Modeling Tanimoto Similarity Value Distributions and Predicting Search Results
Author(s) - Vogt Martin, Bajorath Jürgen
Publication year - 2017
Publication title - Molecular Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.481
H-Index - 68
eISSN - 1868-1751
pISSN - 1868-1743
DOI - 10.1002/minf.201600131
Subject(s) - cheminformatics, similarity (geometry), fingerprint (computing), computer science, nearest neighbor search, data mining, chemical database, rank (graph theory), artificial intelligence, molecular descriptor, virtual screening, value (mathematics), quantitative structure–activity relationship, pattern recognition (psychology), machine learning, mathematics, bioinformatics, drug discovery, combinatorics, image (mathematics), biology
Similarity searching using molecular fingerprints has a long history in chemoinformatics and continues to be a popular approach for virtual screening. Typically, known active reference molecules are used to search databases for new active compounds. However, such searches have a black-box character because similarity value distributions depend on the fingerprints and compound classes used. Consequently, no generally applicable similarity threshold values are available as reliable indicators of activity relationships between reference and database compounds. It is therefore generally uncertain where new active compounds might appear in database rankings, if at all. In this contribution, methods are discussed for modeling similarity value distributions of fingerprint search calculations using Tanimoto coefficients and for estimating rank positions of active compounds. To our knowledge, these are the first approaches for predicting the results of fingerprint‐based similarity searching.
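
For orientation, the kind of search whose outcome the abstract sets out to predict can be sketched in a few lines of Python. This is a minimal illustration, not the authors' modeling method: it only computes Tanimoto coefficients between a reference fingerprint and database fingerprints and reads off rank positions. Fingerprints are represented here as sets of on-bit indices, and the compound names, bit patterns, and helper functions (`tanimoto`, `rank_database`, `rank_of`) are hypothetical, chosen only for the example.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        # Assumed convention for two empty fingerprints; others are possible.
        return 0.0
    shared = len(a & b)
    # |A ∩ B| / (|A| + |B| - |A ∩ B|)
    return shared / (len(a) + len(b) - shared)


def rank_database(
    reference: Set[int], database: Dict[str, Set[int]]
) -> List[Tuple[str, float]]:
    """Score every database compound against the reference, highest first."""
    scored = [(name, tanimoto(reference, fp)) for name, fp in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def rank_of(name: str, ranking: List[Tuple[str, float]]) -> int:
    """Return the 1-based rank position of a compound in a ranking."""
    for position, (candidate, _) in enumerate(ranking, start=1):
        if candidate == name:
            return position
    raise KeyError(name)


# Hypothetical toy data: one reference and three database compounds.
reference = {1, 4, 7, 9}
database = {
    "cpd_a": {1, 4, 7, 8},
    "cpd_b": {2, 3, 5},
    "cpd_c": {1, 4, 7, 9, 12},
}

ranking = rank_database(reference, database)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy setting the rank of a known active (say `cpd_a`) is obtained with `rank_of("cpd_a", ranking)`. The point made in the abstract is that, for real fingerprints and compound classes, this position cannot be anticipated from a fixed similarity threshold, because the distribution of scores changes with both.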