Probabilistic Substructure Mining From Small‐Molecule Screens
Author(s) - Ranu Sayan, Calhoun Bradley T., Singh Ambuj K., Swamidass S. Joshua
Publication year - 2011
Publication title - Molecular Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.481
H-Index - 68
eISSN - 1868-1751
pISSN - 1868-1743
DOI - 10.1002/minf.201100058
Subject(s) - substructure, cheminformatics, probabilistic logic, diamondoid, small molecule, computer science, data mining, molecule, computational biology, chemistry, artificial intelligence, computational chemistry, engineering, biology, structural engineering, biochemistry, organic chemistry
Identifying the overrepresented substructures in a set of molecules with similar activity is a common task in chemical informatics. Existing substructure miners are deterministic: they require the activity of every mined molecule to be known with high confidence. In contrast, we introduce pGraphSig, a probabilistic structure miner that effectively mines structures from noisy data in which many molecules are labeled with their probability of being active. We benchmark pGraphSig on data from several small‐molecule high-throughput screens and find that it identifies overrepresented structures more effectively than a deterministic structure miner.
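To make the contrast between deterministic and probabilistic labels concrete, the following is a minimal sketch of how a substructure's overrepresentation among actives might be scored when each molecule carries a probability of being active rather than a hard label. It is not the pGraphSig algorithm, which the abstract does not describe; the function names `expected_enrichment` and `thresholded_enrichment`, the enrichment ratio itself, and the 0.5 cutoff are all assumptions made for illustration.

```python
from typing import Sequence


def expected_enrichment(has_substructure: Sequence[bool],
                        p_active: Sequence[float]) -> float:
    """Probability-weighted enrichment of a substructure among actives.

    Hypothetical score, not taken from the paper: the expected share of
    actives that contain the substructure, divided by the share of all
    molecules that contain it. Values above 1 suggest overrepresentation.
    """
    if len(has_substructure) != len(p_active):
        raise ValueError("inputs must have the same length")
    n = len(p_active)
    expected_actives = sum(p_active)   # expected number of active molecules
    n_with = sum(has_substructure)     # molecules containing the substructure
    if n == 0 or expected_actives == 0 or n_with == 0:
        return 0.0
    # Expected number of actives that also contain the substructure.
    expected_actives_with = sum(
        p for h, p in zip(has_substructure, p_active) if h
    )
    frac_of_actives = expected_actives_with / expected_actives
    frac_of_all = n_with / n
    return frac_of_actives / frac_of_all


def thresholded_enrichment(has_substructure: Sequence[bool],
                           p_active: Sequence[float],
                           cutoff: float = 0.5) -> float:
    """Deterministic baseline: force each probability to a hard 0/1 label."""
    hard = [1.0 if p >= cutoff else 0.0 for p in p_active]
    return expected_enrichment(has_substructure, hard)


# Toy data (invented): four molecules, two of which contain the substructure.
has_sub = [True, True, False, False]
p_act = [0.9, 0.4, 0.1, 0.2]

soft_score = expected_enrichment(has_sub, p_act)
hard_score = thresholded_enrichment(has_sub, p_act)
print(soft_score, hard_score)
```

In this toy input the hard-label baseline discards the partial evidence from the molecules with probabilities 0.4, 0.1, and 0.2, whereas the weighted score keeps every molecule's contribution. That is one way to read the abstract's point that deterministic miners need confident labels.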