
Flexible Fitting of PROTAC Concentration–Response Curves with Changepoint Gaussian Processes
Author(s) -
Elizaveta Semenova,
Maria Luisa Guerriero,
Bairu Zhang,
A. Höck,
Philip Hopcroft,
Ganesh Kadamur,
Avid M. Afzal,
Stanley E. Lazic
Publication year - 2021
Publication title -
SLAS Discovery
Language(s) - English
Resource type - Journals
eISSN - 2472-5560
pISSN - 2472-5552
DOI - 10.1177/24725552211028142
Subject(s) - sigmoid function, Gaussian, computer science, reproducibility, ranking (information retrieval), algorithm, statistics, biological system, mathematics, data mining, chemistry, artificial intelligence, computational chemistry, artificial neural network, biology
A proteolysis-targeting chimera (PROTAC) is a new technology that marks proteins for degradation in a highly specific manner. During screening, PROTAC compounds are tested in concentration-response (CR) assays to determine their potency, and parameters such as the half-maximal degradation concentration (DC50) are estimated from the fitted CR curves. These parameters are used to rank compounds, with lower DC50 values indicating greater potency. However, PROTAC data often exhibit biphasic and polyphasic relationships, making standard sigmoidal CR models inappropriate. A common workaround is to manually omit points (the so-called masking step) so that standard models can be fitted to the reduced data sets. Because it is manual and subjective, masking is costly and nonreproducible. We therefore used a Bayesian changepoint Gaussian process model that can flexibly fit both nonsigmoidal and sigmoidal CR curves without user input. Parameters such as the DC50, the maximum effect (Dmax), and the point of departure (PoD) are estimated from the fitted curves. We then rank compounds based on one or more parameters and propagate the parameter uncertainty into the rankings, enabling us to state with confidence whether one compound is better than another. Hence, we used a flexible and automated procedure for PROTAC screening experiments. By minimizing subjective decisions, our approach reduces time and cost and ensures reproducibility of the compound-ranking procedure. The code and data are provided on GitHub (https://github.com/elizavetasemenova/gp_concentration_response).
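The idea of propagating parameter uncertainty into compound rankings can be illustrated with a minimal sketch. It is not the authors' implementation (see the GitHub repository for that). It assumes that posterior draws of log10(DC50) are already available for each compound, and it uses synthetic draws in their place. The function names `rank_probabilities` and `prob_more_potent` and all numbers are hypothetical, introduced only for illustration.

```python
import numpy as np


def rank_probabilities(draws):
    """Posterior probability of each rank for each compound.

    draws: array of shape (n_draws, n_compounds) holding posterior samples
    of log10(DC50); lower values mean greater potency.
    Returns P of shape (n_compounds, n_compounds), where P[i, r] is the
    fraction of draws in which compound i has rank r (0 = most potent).
    """
    draws = np.asarray(draws, dtype=float)
    n_compounds = draws.shape[1]
    # Applying argsort twice turns each row of values into within-row ranks.
    ranks = np.argsort(np.argsort(draws, axis=1), axis=1)
    probs = np.zeros((n_compounds, n_compounds))
    for r in range(n_compounds):
        probs[:, r] = (ranks == r).mean(axis=0)
    return probs


def prob_more_potent(draws, i, j):
    """Fraction of draws in which compound i has a lower DC50 than compound j."""
    draws = np.asarray(draws, dtype=float)
    return float(np.mean(draws[:, i] < draws[:, j]))


# Synthetic stand-ins for posterior draws (assumption: NOT data from the paper).
rng = np.random.default_rng(0)
draws = np.column_stack([
    rng.normal(-8.0, 0.2, 4000),  # compound 0: potent, well determined
    rng.normal(-7.5, 0.2, 4000),  # compound 1: less potent, well determined
    rng.normal(-7.4, 0.6, 4000),  # compound 2: similar to 1, but uncertain
])

probs = rank_probabilities(draws)
p_0_beats_1 = prob_more_potent(draws, 0, 1)
```

Each row of `probs` gives the rank distribution of one compound, so a compound with a wide posterior (compound 2 here) spreads its probability across several ranks instead of receiving a single fixed position. The same calculation applies to any other curve-derived parameter, such as Dmax or PoD, provided the comparison direction matches what "better" means for that parameter.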