
Timepoint Selection Strategy for In Vivo Proteome Dynamics from Heavy Water Metabolic Labeling and LC–MS
Author(s) -
Vugar R. Sadygov,
William Zhang,
Rovshan G. Sadygov
Publication year - 2020
Publication title -
Journal of Proteome Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.644
H-Index - 161
eISSN - 1535-3907
pISSN - 1535-3893
DOI - 10.1021/acs.jproteome.0c00023
Subject(s) - proteome, proteomics, computational biology, monoisotopic mass, sampling (signal processing), mean squared error, mass spectrometry, chemistry, biological system, biology, chromatography, statistics, computer science, mathematics, bioinformatics, biochemistry, filter (signal processing), computer vision, gene
Protein homeostasis (proteostasis) is essential for healthy cell functioning and is dysregulated in many diseases. Metabolic labeling with heavy water followed by liquid chromatography coupled online to mass spectrometry (LC-MS) is a powerful high-throughput technique for studying proteome dynamics in vivo. Longer labeling durations and dense timepoint sampling (TPS) of tissues provide accurate estimates of proteome dynamics. However, such experiments are expensive: they require animal housing and care, as well as labeling with stable isotopes, and the animals are often sacrificed at selected timepoints to collect tissues. It is therefore necessary to optimize TPS for a given number of sampling points, labeling duration, and target tissue. Currently, such techniques are missing in proteomics. Here, we report a formula-based stochastic simulation strategy for TPS in in vivo studies with heavy water metabolic labeling and LC-MS. We model the rate constant (lognormal), measurement error (Laplace), peptide length (gamma), relative abundance of the monoisotopic peak (beta regression), and the number of exchangeable hydrogens (gamma regression). The parameters of the distributions are determined from the corresponding empirical probability density functions of a large-scale murine heart proteome dataset. The models are used in simulations of the rate constant to minimize the root-mean-square error (RMSE). The RMSE for different TPSs shows structured patterns, which are analyzed to elucidate their common features.
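
The simulation idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a single-exponential label incorporation curve, 1 - exp(-k t), draws rate constants from a lognormal distribution, adds Laplace measurement error, refits k by a least-squares grid search, and reports the RMSE for a given timepoint schedule. The function name `simulate_rmse`, the distribution parameters (`mu`, `sigma`, `noise_scale`), the grid range, and the example schedules are all hypothetical placeholders, not values from the paper. The peptide-length, monoisotopic-abundance, and exchangeable-hydrogen models are omitted.

```python
import numpy as np


def simulate_rmse(timepoints, n_peptides=1000, mu=-2.0, sigma=1.0,
                  noise_scale=0.02, seed=0):
    """Monte Carlo RMSE of fitted rate constants for one timepoint schedule.

    All default parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    t = np.asarray(timepoints, dtype=float)

    # True rate constants (per day), drawn from a lognormal distribution.
    k_true = rng.lognormal(mean=mu, sigma=sigma, size=n_peptides)

    # Assumed single-exponential label incorporation, shape (peptides, T).
    clean = 1.0 - np.exp(-np.outer(k_true, t))

    # Additive Laplace-distributed measurement error.
    noisy = clean + rng.laplace(0.0, noise_scale, size=clean.shape)

    # Least-squares fit of k by grid search over log-spaced candidates.
    k_grid = np.logspace(-4, 1, 400)
    model = 1.0 - np.exp(-np.outer(k_grid, t))            # (grid, T)
    sse = ((noisy[:, None, :] - model[None, :, :]) ** 2).sum(axis=2)
    k_fit = k_grid[np.argmin(sse, axis=1)]

    # Root-mean-square error between fitted and true rate constants.
    return float(np.sqrt(np.mean((k_fit - k_true) ** 2)))


if __name__ == "__main__":
    # Hypothetical schedules (days) with the same number of sampling points.
    schedules = {
        "uniform": [5, 10, 15, 20, 25, 30],
        "early-dense": [1, 2, 4, 8, 16, 30],
    }
    for name, tps in schedules.items():
        print(name, simulate_rmse(tps))
```

Comparing the returned RMSE across candidate schedules with a fixed number of points and a fixed labeling duration is one simple way to rank them under these assumptions. The ranking depends on the assumed rate-constant and noise distributions.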