RelACCS‐FP: A Structural Minimalist Approach to Fingerprint Design
Author(s) - Hu Ye, Lounkine Eugen, Batista José, Bajorath Jürgen
Publication year - 2008
Publication title - Chemical Biology & Drug Design
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.59
H-Index - 77
eISSN - 1747-0285
pISSN - 1747-0277
DOI - 10.1111/j.1747-0285.2008.00723.x
Subject(s) - fingerprint (computing) , computer science , similarity (geometry) , bit array , metric (unit) , class (philosophy) , data mining , pattern recognition (psychology) , selection (genetic algorithm) , artificial intelligence , type (biology) , biology , ecology , operations management , economics , image (mathematics)
This study reports the design and evaluation of structural key‐type fingerprints that consist of only 10–30 substructures isolated from randomly generated fragment populations of different classes of active compounds. Fragment frequency calculations guide fingerprint generation by identifying minimal sets of fragments that carry substantial compound class‐specific information. These compound class‐directed, extremely small structural fingerprints push the design of so‐called mini‐fingerprints to the limit and were the shortest bit string fingerprints reported at the time of publication. For applying these relative frequency‐based activity class characteristic substructure fingerprints (RelACCS‐FP), a bit density‐dependent similarity metric is introduced that makes it possible to adjust similarity coefficients for individual compound classes and to balance the recall of active compounds against database selection size. In similarity search trials, these small compound class‐directed fingerprints enrich active compounds in relatively small database selection sets and approach or exceed the performance of widely used structural fingerprints of much larger size and higher complexity.
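The general idea of frequency-guided key selection can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each compound is already represented as a set of fragment labels (the labels here are hypothetical), that fragments are ranked by the difference between their frequency in the active class and in a background set, and that a plain Tanimoto coefficient stands in for the bit density-dependent metric, whose exact form the abstract does not give. The function names `select_fragments`, `encode`, and `tanimoto` are chosen only for this illustration.

```python
from collections import Counter


def select_fragments(actives, background, k):
    """Return the k fragments most enriched in actives (assumed scoring scheme)."""
    # Count in how many compounds of each set a fragment occurs.
    act = Counter(f for compound in actives for f in set(compound))
    bg = Counter(f for compound in background for f in set(compound))

    def score(fragment):
        # Relative frequency in the active class minus relative frequency in background.
        return act[fragment] / len(actives) - bg[fragment] / len(background)

    # Sort by descending score; break ties by label so the result is deterministic.
    ranked = sorted(act, key=lambda f: (-score(f), f))
    return ranked[:k]


def encode(compound, keys):
    """Encode a compound as a bit tuple, one bit per selected fragment."""
    return tuple(1 if f in compound else 0 for f in keys)


def tanimoto(a, b):
    """Plain Tanimoto coefficient; a stand-in, not the paper's density-dependent metric."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 0.0


# Toy data with hypothetical fragment labels.
actives = [{"phenyl", "amide", "amine"}, {"phenyl", "amide"}, {"amide", "ether"}]
background = [{"amine", "ether"}, {"ether"}, {"phenyl", "ether"}, {"amine"}]

keys = select_fragments(actives, background, k=2)
reference = encode(actives[1], keys)
for compound in background + actives:
    print(sorted(compound), tanimoto(reference, encode(compound, keys)))
```

In a search setting, database compounds would be ranked by their similarity to one or more active reference compounds, and a class-specific similarity threshold would determine how many compounds are selected.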