Protein Family-Specific Models Using Deep Neural Networks and Transfer Learning Improve Virtual Screening and Highlight the Need for More Data
Author(s) -
Fergus Imrie,
A.R. Bradley,
Mihaela van der Schaar,
Charlotte M. Deane
Publication year - 2018
Publication title -
journal of chemical information and modeling
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.24
H-Index - 160
eISSN - 1549-960X
pISSN - 1549-9596
DOI - 10.1021/acs.jcim.8b00350
Subject(s) - chembl , computer science , virtual screening , convolutional neural network , machine learning , artificial intelligence , transfer of learning , benchmark (surveying) , deep learning , set (abstract data type) , data set , test set , artificial neural network , data mining , drug discovery , bioinformatics , geodesy , geography , biology , programming language
Machine learning has shown enormous potential for computer-aided drug discovery. Here we show how modern convolutional neural networks (CNNs) can be applied to structure-based virtual screening. We have coupled our densely connected CNN (DenseNet) with a transfer learning approach which we use to produce an ensemble of protein family-specific models. We conduct an in-depth empirical study and provide the first guidelines on the minimum requirements for adopting a protein family-specific model. Our method also highlights the need for additional data, even in data-rich protein families. Our approach outperforms recent benchmarks on the DUD-E data set and an independent test set constructed from the ChEMBL database. Using a clustered cross-validation on DUD-E, we achieve an average AUC ROC of 0.92 and a 0.5% ROC enrichment factor of 79. This represents an improvement in early enrichment of over 75% compared to a recent machine learning benchmark. Our results demonstrate that the continued improvements in machine learning architecture for computer vision apply to structure-based virtual screening.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom