Functional assignment of Structural Genomics proteins through computed chemical properties, graph representation of active sites, and biochemical validation
Author(s) -
Mills Caitlyn L,
Garg Rohan,
Lee Joslynn S,
Parasuram Ramya,
Tian Liang,
Suciu Alexandru,
Cooperman Gene,
Beuning Penny,
Ondrechen Mary Jo
Publication year - 2018
Publication title - The FASEB Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.709
H-Index - 277
eISSN - 1530-6860
pISSN - 0892-6638
DOI - 10.1096/fasebj.2018.32.1_supplement.lb94
Subject(s) - structural genomics, protein data bank (rcsb pdb), computational biology, protein structure, protein function prediction, graph, functional genomics, computer science, genomics, biology, protein function, biochemistry, genome, gene, theoretical computer science
There are currently over 14,300 Structural Genomics (SG) protein structures deposited in the PDB by protein structure initiatives. However, most of these SG proteins have unknown or putative function annotations. This accumulated structural information represents a tremendous contribution to structural biology and genomics. Still, accurate functional annotations for these SG proteins would add substantial value to this information. Our approach to functional annotation and validation predicts functional assignments through structure‐based computed chemical properties and local structure matching, followed by biochemical validation. This research focuses on four superfamilies: Crotonase, Ribulose Phosphate Binding Barrel, 6‐Hairpin Glycosidase, and Concanavalin A‐like Lectins and Glucanases. First, Partial Order Optimum Likelihood (POOL) is used to computationally predict the catalytically important residues in each protein structure. Next, Structurally Aligned Local Sites of Activity (SALSA) develops spatially‐localized consensus signatures for the proteins of known function in each functional family within each superfamily, based on POOL‐predicted residues and functionally characterized residues of importance. Then, the POOL‐predicted residues for each SG protein are compared to each consensus signature and scored to determine their degree of similarity at the local active site. Finally, we introduce a new, computationally faster method for sorting protein superfamilies and annotating protein function using local structure matching in graph representation: Graph Representation of Active Sites for Prediction of Function (GRASP‐Func). Sets of tetrahedra are generated through Delaunay triangulation for each protein structure using the alpha carbon atoms of each residue.
Then, proteins with matched tetrahedra are grouped together, and images are generated showing each protein as a node connected by edges to its neighbors with similar active sites. We compare SALSA and GRASP‐Func and show that both methods correctly sort the superfamilies into their respective functional families. Both methods also make similar functional predictions for the SG proteins, with GRASP‐Func running in far less time. Thus, GRASP‐Func enables large‐scale comparisons and functional assignments within and across superfamilies. We then test these predictions biochemically to confirm function. Biochemical data for the Crotonase Superfamily show that while the proteins have some promiscuous functionality, our methods predict the correct dominant function for each protein tested. The goal of this project is to provide a validated approach to functional annotation to enable applications from drug target identification to green chemistry and biofuel production.

Support or Funding Information

Support from NSF‐CHE‐1305655, NSF‐MCB‐1158176, NSF‐MCB‐1517290, PhRMA Foundation (Predoctoral Fellowship in Informatics awarded to CLM), NSF‐GRFP (JSL), MathWorks, Inc., and American Cancer Society Research Scholar Grant RSG‐12‐161‐01‐DMC (PJB).

This abstract is from the Experimental Biology 2018 Meeting. There is no full text article associated with this abstract published in The FASEB Journal.
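
The grouping step described in the abstract, in which proteins with matched tetrahedra become neighboring nodes in a graph, can be illustrated with a minimal sketch. This is not the GRASP‐Func implementation, which the abstract does not detail. It assumes the tetrahedra for each active site are already available, treats two tetrahedra as matched when their four vertices have the same residue types (a simplification chosen for illustration), and uses a hypothetical threshold `min_shared`. All function names, protein names, and residue data below are invented for the example.

```python
from itertools import combinations


def tetrahedron_key(residue_types):
    """Order-independent label: the sorted residue types of the four vertices."""
    return tuple(sorted(residue_types))


def build_similarity_graph(site_tetrahedra, min_shared=2):
    """Return {protein: set of neighbors}.

    Two proteins are joined by an edge when they share at least
    `min_shared` tetrahedron keys (hypothetical matching rule).
    """
    keys = {
        name: {tetrahedron_key(t) for t in tets}
        for name, tets in site_tetrahedra.items()
    }
    graph = {name: set() for name in keys}
    for a, b in combinations(keys, 2):
        if len(keys[a] & keys[b]) >= min_shared:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_groups(graph):
    """Group proteins into connected components using depth-first search."""
    seen = set()
    groups = []
    for start in graph:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        groups.append(sorted(component))
    return sorted(groups)


# Invented toy data: each tetrahedron is given by the residue types of its vertices.
site_tetrahedra = {
    "known_A1": [("HIS", "ASP", "SER", "GLY"), ("ASP", "GLU", "GLY", "ALA"),
                 ("SER", "GLY", "ALA", "LEU")],
    "known_A2": [("ASP", "HIS", "GLY", "SER"), ("GLU", "ASP", "ALA", "GLY"),
                 ("LYS", "GLY", "ALA", "LEU")],
    "sg_query": [("SER", "GLY", "HIS", "ASP"), ("GLY", "ALA", "ASP", "GLU"),
                 ("TRP", "TYR", "PHE", "GLY")],
    "known_B1": [("TRP", "TYR", "PHE", "GLY"), ("ARG", "LYS", "GLU", "ASP"),
                 ("CYS", "CYS", "HIS", "HIS")],
}

graph = build_similarity_graph(site_tetrahedra, min_shared=2)
groups = connected_groups(graph)
print(groups)
```

In this toy data the uncharacterized protein `sg_query` shares two tetrahedra with each family A member and only one with `known_B1`, so it is grouped with family A. A real comparison would presumably also account for tetrahedron geometry, not only residue composition.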