The enumeration of chemical space
Author(s) -
Reymond Jean-Louis,
Ruddigkeit Lars,
Blum Lorenz,
van Deursen Ruud
Publication year - 2012
Publication title -
Wiley Interdisciplinary Reviews: Computational Molecular Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 5.126
H-Index - 81
eISSN - 1759-0884
pISSN - 1759-0876
DOI - 10.1002/wcms.1104
Subject(s) - cheminformatics, chemical space, enumeration, virtual screening, computer science, principal component analysis, field (mathematics), molecular descriptor, projection (relational algebra), space (punctuation), quantum chemical, protein data bank, molecule, theoretical computer science, chemistry, data mining, mathematics, computational chemistry, algorithm, quantitative structure–activity relationship, artificial intelligence, bioinformatics, drug discovery, combinatorics, machine learning, molecular dynamics, biology, protein structure, pure mathematics, operating system, biochemistry, organic chemistry
Abstract - In the field of medicinal chemistry, the chemical space describes the ensemble of all organic molecules to be considered when searching for new drugs (estimated >10^60 molecules), as well as the property spaces in which these molecules are placed for the sake of describing them. Molecules can be enumerated computationally by the millions, which was first undertaken in the field of computer‐aided structure elucidation. Scoring the enumerated virtual libraries by virtual screening has recently become an attractive strategy to prioritize compounds for synthesis and testing. Enumeration methods include combinatorial linking of fragments, genetic algorithms based on cycles of enumeration and selection by ligand‐based or target‐based scoring functions, and exhaustive enumeration from first principles. The chemical space of molecules following simple rules of chemical stability and synthetic feasibility has been enumerated up to 13 atoms of C, N, O, Cl, S, forming the GDB‐13 database with 977 million structures. The database has been organized in a 42‐dimensional chemical space using molecular quantum numbers (MQN) as descriptors, which can be visualized by projection in two dimensions by principal component analysis, and searched within seconds using a Web browser available at www.gdb.unibe.ch. © 2012 John Wiley & Sons, Ltd. This article is categorized under: Computer and Information Science > Chemoinformatics
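
Illustrative note: of the enumeration methods named in the abstract, combinatorial linking of fragments is the simplest to sketch. The toy example below is not the procedure used to build GDB‐13 (which the abstract describes as exhaustive enumeration under stability and feasibility rules). It only joins scaffold, linker, and cap fragments written as SMILES-like strings and keeps products up to a heavy-atom limit. The fragment lists, the function names `heavy_atom_count` and `enumerate_library`, and the crude atom counter are all assumptions made for illustration; no valence or stability checks are performed.

```python
from itertools import product

# Hypothetical toy fragments (SMILES-like strings); not taken from the article.
SCAFFOLDS = ["c1ccccc1", "C1CCCCC1"]   # benzene and cyclohexane rings
LINKERS = ["", "C", "CC"]              # none, methylene, ethylene
CAPS = ["N", "O", "Cl"]                # amine, hydroxyl, chloride


def heavy_atom_count(smiles: str) -> int:
    """Crude heavy-atom count: every letter is one atom, 'Cl' counts once.

    Assumption: fragments contain no bracket atoms or explicit hydrogens.
    """
    return sum(ch.isalpha() for ch in smiles.replace("Cl", "X"))


def enumerate_library(scaffolds, linkers, caps, max_heavy_atoms=13):
    """Join every scaffold/linker/cap combination and apply a size filter."""
    library = set()
    for scaffold, linker, cap in product(scaffolds, linkers, caps):
        candidate = scaffold + linker + cap
        if heavy_atom_count(candidate) <= max_heavy_atoms:
            library.add(candidate)
    return sorted(library)


if __name__ == "__main__":
    for smiles in enumerate_library(SCAFFOLDS, LINKERS, CAPS):
        print(smiles, heavy_atom_count(smiles))
```

The library size grows as the product of the fragment list lengths, which is why enumerated libraries of this kind are usually scored and filtered (for example by virtual screening, as the abstract notes) rather than inspected by hand.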