Systematic Extraction of Analogue Series from Large Compound Collections Using a New Computational Compound–Core Relationship Method
Author(s) -
J. Jesús Naveja,
Martin Vogt,
Dagmar Stumpfe,
José L. Medina-Franco,
Jürgen Bajorath
Publication year - 2019
Publication title -
ACS Omega
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.779
H-Index - 40
ISSN - 2470-1343
DOI - 10.1021/acsomega.8b03390
Subject(s) - chembl , computer science , retrosynthetic analysis , series (stratigraphy) , chemistry , information retrieval , combinatorial chemistry , identification (biology) , drug discovery , stereochemistry , total synthesis , paleontology , biochemistry , botany , biology
Chemical optimization of organic compounds produces a series of analogues. In addition to considering an analogue series (AS) or multiple series on a case-by-case basis, which is often done in the practice of chemistry, the extraction of analogues from compound repositories is of high interest in organic and medicinal chemistry. In organic chemistry, ASs are a source of alternative synthetic routes and also aid in exploring relationships between compounds from different sources including synthetic vs. naturally occurring molecules. In medicinal chemistry, ASs are the major source of structure-activity relationship information and of hits or leads for drug development. ASs might be identified in different ways. For a given reference compound, a substructure search can be carried out using its scaffold. Alternatively, matched molecular pairs can be calculated to retrieve analogues from a compound repository. However, if no query compounds are used, the identification of ASs in databases is a difficult task. Herein, we introduce a computational approach to systematically identify ASs in collections of organic compounds. The approach involves compound decomposition on the basis of well-established retrosynthetic rules, organization of compound-core relationships, and identification of analogues sharing the same core. The method was applied on a large scale to extract ASs from the ChEMBL database, yielding more than 30 000 distinct series.
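The last step described in the abstract, identifying analogues that share the same core, can be illustrated with a minimal sketch. It assumes the retrosynthetic decomposition has already been done elsewhere (for example with a cheminformatics toolkit) and that each compound comes with a set of candidate cores. The function name `group_by_core`, the `min_size` parameter, and the toy identifiers (`cpd1`, `coreA`, and so on) are hypothetical and are not taken from the paper. The published method's organization of compound-core relationships is likely more involved than this simple inverted index.

```python
from collections import defaultdict


def group_by_core(compound_cores, min_size=2):
    """Invert a compound -> cores mapping into core -> compounds.

    Only cores shared by at least `min_size` compounds are kept, so each
    returned entry is a candidate analogue series (assumption: a series
    needs two or more compounds with a common core).
    """
    core_to_compounds = defaultdict(set)
    for compound, cores in compound_cores.items():
        for core in cores:
            core_to_compounds[core].add(compound)
    return {
        core: sorted(members)
        for core, members in core_to_compounds.items()
        if len(members) >= min_size
    }


# Hypothetical toy input: in practice the cores would be structures
# produced by fragmenting each compound at retrosynthetically relevant bonds.
compound_cores = {
    "cpd1": {"coreA", "coreB"},
    "cpd2": {"coreA"},
    "cpd3": {"coreA", "coreC"},
    "cpd4": {"coreC"},
    "cpd5": {"coreD"},
}

series = group_by_core(compound_cores)
for core, members in sorted(series.items()):
    print(core, members)
```

In this toy input, `cpd3` falls into two candidate series because it yields two cores that are each shared with other compounds. How such overlaps are resolved is a design choice that this sketch leaves open.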