Computational Approaches for the Analysis of Tandem Affinity and Proximity‐Based Purifications in R
Author(s) -
Velasquez Erick Francisco,
Garcia Yenni,
Gao Lucy,
Whitelegge Julian,
Torres Jorge
Publication year - 2019
Publication title -
The FASEB Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.709
H-Index - 277
eISSN - 1530-6860
pISSN - 0892-6638
DOI - 10.1096/fasebj.2019.33.1_supplement.473.8
Subject(s) - false positive paradox , computer science , pipeline (software) , exploit , interactome , false positives and false negatives , visualization , data mining , data science , true positive rate , information retrieval , machine learning , artificial intelligence , programming language , chemistry , biochemistry , computer security , gene
During the past decade, the use of mass spectrometry data to identify protein‐protein interactions and associations has become extremely popular, and it presents an exciting opportunity to exploit the protein interactome. However, these experiments often produce large amounts of proteomic data that can be hard to handle, delaying scientific discovery. It has become clear that cleaning the data of false positives is as important as verifying associations. To address this issue, scientists have developed databases such as the Contaminant Repository for Affinity Purification (CRAPome), which provide spectral count data for a collection of controls. However, the data can be hard to interpret without a more rigorous statistical analysis. This work describes an open-source proteomic pipeline that incorporates tools to identify false positives and visualize interactions/associations in a user-friendly manner. Taking advantage of R, a popular statistical programming language, and Shiny apps, we developed a program that aims to standardize the analysis of protein interaction/association experiments. The pipeline allows scientists to submit proteomic data and easily apply suggested significance tests to clean their experiments against their own controls and, if they wish, against the CRAPome. Because these experiments often involve a purification step, we have extended the platform to include predictive machine learning algorithms that highlight false positives based on intrinsic data describing a protein that might stick to the column. After the analysis is carried out, we couple the platform to protein interaction network visualization tools such as Cytoscape to create a comprehensive, user-friendly analysis pipeline.
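As an illustration of what cleaning an experiment against a control can look like, the sketch below applies a one-sided Fisher's exact test to spectral counts. It is a minimal sketch in Python, not the pipeline's own R implementation; the abstract does not name the significance tests it suggests, so the choice of test, the function names, the 0.05 threshold, and the protein labels are all assumptions made for illustration.

```python
from math import comb


def fisher_enrichment_p(bait_count, bait_total, ctrl_count, ctrl_total):
    """One-sided Fisher's exact test p-value (hypothetical helper).

    Tests whether a protein's spectral count is over-represented in the
    bait purification relative to the control, given the total spectra
    observed in each run.
    """
    k = bait_count + ctrl_count      # all spectra assigned to this protein
    n = bait_total + ctrl_total      # all spectra in both runs
    denom = comb(n, bait_total)
    upper = min(k, bait_total)
    # Sum hypergeometric probabilities for outcomes at least as extreme
    # as the observed bait count.
    tail = sum(
        comb(k, x) * comb(n - k, bait_total - x)
        for x in range(bait_count, upper + 1)
    )
    return tail / denom


def flag_candidates(counts, bait_total, ctrl_total, alpha=0.05):
    """Return proteins enriched over the control (alpha is an assumed cutoff)."""
    return sorted(
        name
        for name, (bait, ctrl) in counts.items()
        if fisher_enrichment_p(bait, bait_total, ctrl, ctrl_total) < alpha
    )


# Made-up spectral counts: protein -> (bait run, control run)
example_counts = {"PROT_A": (30, 2), "PROT_B": (5, 6)}
enriched = flag_candidates(example_counts, bait_total=1000, ctrl_total=1000)
```

Proteins that do not pass the cutoff are candidates for removal as background. A real analysis would also correct for multiple testing across all identified proteins, which this sketch omits.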
In the future, we plan to extend this platform by applying neural networks to further predict protein‐protein interactions, moving toward a fully automated pipeline for the analysis of protein‐interaction/association experiments.
Support or Funding Information
This work was made possible by USPHS National Research Service Award 5T32GM008496, the Whitcome Pre‐doctoral Training Program, and the UCLA Molecular Biology Institute.
This abstract is from the Experimental Biology 2019 Meeting. There is no full text article associated with this abstract published in The FASEB Journal.