TASSER: An automated method for the prediction of protein tertiary structures in CASP6
Author(s) - Zhang Yang, Arakaki Adrian K., Skolnick Jeffrey
Publication year - 2005
Publication title - Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.20724
Subject(s) - threading (protein sequence) , casp , protein structure prediction , monte carlo method , template , protein tertiary structure , computer science , cluster analysis , protein structure , pattern recognition (psychology) , crystallography , algorithm , artificial intelligence , mathematics , statistics , chemistry , programming language , biochemistry
The recently developed TASSER (Threading/ASSembly/Refinement) method is applied to predict the tertiary structures of all CASP6 targets. TASSER is a hierarchical approach that consists of template identification by the threading program PROSPECTOR_3, followed by tertiary structure assembly via rearranging continuous template fragments. Assembly uses parallel hyperbolic Monte Carlo sampling guided by an optimized, reduced force field that includes knowledge‐based statistical potentials and spatial restraints extracted from threading alignments. Models are automatically selected from the Monte Carlo trajectories in the low‐temperature replicas using the clustering program SPICKER. For all 90 CASP targets/domains, PROSPECTOR_3 generates initial alignments with an average root‐mean‐square deviation (RMSD) to native of 8.4 Å with 79% coverage. After TASSER reassembly, the average RMSD decreases to 5.4 Å over the same aligned residues; the overall cumulative TM‐score increases from 39.44 to 52.53. Despite significant improvements over the PROSPECTOR_3 template alignment observed in all target categories, the overall quality of the final models is essentially dictated by the quality of threading templates: the average TM‐scores of TASSER models in the three categories are, respectively, 0.79 [comparative modeling (CM), 43 targets/domains], 0.47 [fold recognition (FR), 37 targets/domains], and 0.30 [new fold (NF), 10 targets/domains]. This highlights the need to develop novel (or improved) approaches to identify very distant targets as well as better NF algorithms. Proteins 2005;Suppl 7:91–98. © 2005 Wiley‐Liss, Inc.
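The two quality measures quoted in the abstract, RMSD and TM‐score, can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the model has already been superimposed on the native structure and that the per‐residue Cα distances (in Å) are given; the published TM‐score additionally searches for the superposition that maximizes the sum, which is omitted here. The function names, and the choice to reject very short targets rather than clamp the distance scale, are assumptions made for illustration.

```python
import math


def tm_distance_scale(target_length):
    """Length-dependent scale d0 (in Angstrom) from the TM-score definition."""
    # Assumption of this sketch: reject targets too short for a positive d0
    # instead of clamping the scale to a minimum value.
    if target_length < 19:
        raise ValueError("target too short for this sketch (need length >= 19)")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score_from_distances(distances, target_length):
    """TM-score for one fixed superposition.

    distances: Calpha distances (Angstrom) for the aligned residue pairs only.
    target_length: number of residues in the native (target) structure.
    Unaligned residues contribute nothing, so low coverage lowers the score.
    """
    d0 = tm_distance_scale(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


def rmsd_from_distances(distances):
    """Root-mean-square deviation over the aligned residue pairs."""
    if not distances:
        raise ValueError("no aligned residues")
    return math.sqrt(sum(d * d for d in distances) / len(distances))
```

Because the sum is normalized by the full target length rather than by the number of aligned residues, an alignment that covers only part of the target cannot reach a TM‐score of 1 even if every aligned residue is placed perfectly, whereas RMSD is computed over the aligned residues alone.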