Practical Model Selection for Prospective Virtual Screening
Author(s) - Shengchao Liu, Moayad Alnammi, Spencer S. Ericksen, Andrew F. Voter, Gene E. Ananiev, James L. Keck, F. Michael Hoffmann, Scott A. Wildman, Anthony Gitter
Publication year - 2018
Publication title - Journal of Chemical Information and Modeling
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.24
H-Index - 160
eISSN - 1549-960X
pISSN - 1549-9596
DOI - 10.1021/acs.jcim.8b00363
Subject(s) - virtual screening, computer science, prioritization, random forest, workflow, machine learning, artificial neural network, selection (genetic algorithm), artificial intelligence, set (abstract data type), task (project management), high throughput screening, data mining, drug discovery, bioinformatics, database, biology, engineering, management science, programming language, systems engineering
Virtual (computational) high-throughput screening provides a strategy for prioritizing compounds for experimental screens, but the choice of virtual screening algorithm depends on the data set and evaluation strategy. We consider a wide range of ligand-based machine learning and docking-based approaches for virtual screening on two protein-protein interactions, PriA-SSB and RMI-FANCM, and present a strategy for choosing which algorithm is best for prospective compound prioritization. Our workflow identifies a random forest as the best algorithm for these targets over more sophisticated neural network-based models. The top 250 predictions from our selected random forest recover 37 of the 54 active compounds from a library of 22,434 new molecules assayed on PriA-SSB. We show that virtual screening methods that perform well on public data sets and synthetic benchmarks, like multi-task neural networks, may not always translate to prospective screening performance on a specific assay of interest.
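To put the reported prospective result in context, the figures quoted in the abstract (37 of 54 actives recovered in the top 250 predictions, from a library of 22,434 molecules) can be turned into a recall and an enrichment factor. The sketch below is illustrative only: the function name `screening_summary` is hypothetical, and the enrichment factor is a common virtual screening metric assumed here for illustration, not necessarily the metric the authors used for model selection.

```python
def screening_summary(hits_in_top, top_n, total_actives, library_size):
    """Summarize a ranked screen from four counts.

    All names here are hypothetical; the formulas are the standard
    definitions of recall, hit rate, and enrichment factor.
    """
    recall = hits_in_top / total_actives      # fraction of all actives recovered
    hit_rate = hits_in_top / top_n            # active fraction among selected compounds
    base_rate = total_actives / library_size  # active fraction expected from random picks
    enrichment = hit_rate / base_rate         # improvement over random selection
    return {
        "recall": recall,
        "hit_rate": hit_rate,
        "base_rate": base_rate,
        "enrichment_factor": enrichment,
    }


# Counts taken from the abstract for the PriA-SSB prospective screen.
summary = screening_summary(hits_in_top=37, top_n=250,
                            total_actives=54, library_size=22434)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions, the selected 250 compounds contain roughly 69% of the actives, and the hit rate among them is about 61 times the library-wide base rate.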