Interrogating Fractionation and Other Sources of Variability in Shotgun Proteomes Using Quality Metrics
Author(s) - Kriek Marina, Monyai Koena, Magcwebeba Tandeka U., Du Plessis Nelita, Stoychev Stoyan H., Tabb David L.
Publication year - 2020
Publication title - Proteomics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.26
H-Index - 167
eISSN - 1615-9861
pISSN - 1615-9853
DOI - 10.1002/pmic.201900382
Subject(s) - shotgun proteomics, shotgun, computer science, data mining, outlier, proteome, computational biology, proteomics, biology, bioinformatics, artificial intelligence, gene, biochemistry
The increasing amount of publicly available proteomics data creates opportunities for data scientists to investigate quality metrics in novel ways. QuaMeter IDFree is used to generate quality metrics from 665 RAW files and 97 WIFF files representing publicly available “shotgun” mass spectrometry datasets. These experiments are selected to represent Mycobacterium tuberculosis lysates, mouse myeloid-derived suppressor cells (MDSCs), and exosomes derived from human cell lines. Machine learning techniques are demonstrated to detect outliers within experiments, and it is shown that quality metrics may be used to distinguish sources of variability among these experiments. In particular, nested ANOVA performed on a principal component analysis of an SDS‐PAGE shotgun experiment shows that runs of fractions from the same gel region cluster together, whereas runs that share a technical replicate, close temporal proximity, or even a biological sample do not. This indicates that the individual fraction may have had a higher impact on the quality metrics than other factors. In addition, sample type, instrument type, mass analyzer, fragmentation technique, and digestion enzyme are identified as sources of variability. From a quality control perspective, the importance of study design, and in particular the run order, is illustrated in seeking ways to limit the impact of technical variability.
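The abstract does not say which machine learning techniques were used for outlier detection, so the following is only a minimal sketch of the general idea of flagging outlier runs from a table of per-run quality metrics. It swaps in a simple robust z-score (median and median absolute deviation) in place of the authors' method. The metric names (`ms2_count`, `median_peak_width`), the run names, the values, and the 3.5 threshold are all hypothetical and are not taken from QuaMeter IDFree output.

```python
from statistics import median


def robust_z_scores(values):
    """Return median/MAD-based z-scores for one metric across runs."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median: treat every run as unremarkable.
        return [0.0 for _ in values]
    # 1.4826 rescales the MAD so it is comparable to a standard deviation
    # when the data are approximately normal.
    return [(v - med) / (1.4826 * mad) for v in values]


def flag_outlier_runs(metrics_by_run, threshold=3.5):
    """Return sorted run names with any metric beyond the threshold."""
    runs = sorted(metrics_by_run)
    metric_names = sorted(metrics_by_run[runs[0]])
    flagged = set()
    for name in metric_names:
        scores = robust_z_scores([metrics_by_run[r][name] for r in runs])
        for run, z in zip(runs, scores):
            if abs(z) > threshold:
                flagged.add(run)
    return sorted(flagged)


# Hypothetical metrics for six runs; run_06 has far fewer MS2 scans.
example_runs = {
    "run_01": {"ms2_count": 1000, "median_peak_width": 20},
    "run_02": {"ms2_count": 1020, "median_peak_width": 21},
    "run_03": {"ms2_count": 990, "median_peak_width": 19},
    "run_04": {"ms2_count": 1010, "median_peak_width": 20},
    "run_05": {"ms2_count": 1005, "median_peak_width": 22},
    "run_06": {"ms2_count": 300, "median_peak_width": 21},
}

if __name__ == "__main__":
    print(flag_outlier_runs(example_runs))
```

A per-metric rule like this is easy to inspect but ignores correlations between metrics. A multivariate approach, such as the principal component analysis mentioned in the abstract, would address that at the cost of interpretability.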