Efficient identification of side‐chain patterns using a multidimensional index tree


Author(s) - Hamelryck Thomas

Publication year - 2003

Publication title - Proteins: Structure, Function, and Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.699

H-Index - 191

eISSN - 1097-0134

pISSN - 0887-3585

DOI - 10.1002/prot.10338

Subject(s) - identification (biology), similarity (geometry), tree (set theory), computational biology, pattern recognition (psychology), rectangle, function (biology), computer science, structural similarity, side chain, biological system, artificial intelligence, biology, mathematics, chemistry, image (mathematics), combinatorics, evolutionary biology, botany, geometry, organic chemistry, polymer

Convergent evolution often produces similar functional sites in nonhomologous proteins. Identifying these sites makes it possible to infer function from structure, to pinpoint the location of a functional site, to identify enzymes with similar enzymatic mechanisms, or to discover putative functional sites. This article presents a novel method that (a) queries a database of protein structures for occurrences of a given side‐chain pattern and (b) identifies interesting side‐chain patterns in a given structure. For efficiency, and to allow a robust statistical evaluation of the significance of a similarity, the method considers patterns of three residues (triads). Each triad is encoded as a high‐dimensional vector and stored in an SR (Sphere/Rectangle) tree, an efficient multidimensional index tree. Identifying similar triads can then be reformulated as identifying neighboring vectors. The method handles many features that otherwise complicate the identification of meaningful patterns: shifted backbone positions, conservative substitutions, various atom‐label ambiguities, and mirror‐imaged geometries. The combined treatment of these features leads to the identification of previously unidentified patterns. In particular, the identification of mirror‐imaged side‐chain patterns is unique to the method described here. Interesting triads in a given structure can be identified by extracting all triads and comparing them with a database of triads involved in ligand binding. The approach was tested by an all‐against‐all comparison of unique representatives of all SCOP superfamilies. New findings include mirror‐imaged metal‐binding and active sites, and a putative active site in bacterial luciferase. Proteins 2003;51:96–108. © 2003 Wiley‐Liss, Inc.
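The idea of turning triad comparison into a search for neighboring vectors can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the abstract does not specify the vector encoding, so the hypothetical `encode_triad` below simply uses six inter-residue distances (three between Cα atoms, three between side-chain centroids), and a linear scan stands in for the SR tree. Because distances are unchanged by reflection, this toy encoding matches mirror images automatically; how the published method treats them is not described in the abstract. Residue ordering, conservative substitutions, and atom-label ambiguities are ignored here.

```python
import math
from itertools import combinations


def encode_triad(residues):
    """Encode a triad as a 6-dimensional vector (hypothetical encoding).

    residues: three (ca_xyz, sidechain_centroid_xyz) tuples, in a fixed order.
    Returns three CA-CA distances followed by three centroid-centroid distances.
    """
    ca = [r[0] for r in residues]
    sc = [r[1] for r in residues]
    vec = [math.dist(a, b) for a, b in combinations(ca, 2)]
    vec += [math.dist(a, b) for a, b in combinations(sc, 2)]
    return tuple(vec)


def neighbors(query, database, radius):
    """Return (distance, name) pairs within `radius` of `query`, nearest first.

    A linear scan is used here; an index tree such as an SR tree would
    answer the same query without visiting every stored vector.
    """
    hits = [(math.dist(query, vec), name) for name, vec in database.items()]
    return sorted((d, n) for d, n in hits if d <= radius)


# Made-up coordinates (in angstroms) for illustration only.
triad_a = [((0, 0, 0), (1, 0, 0)), ((4, 0, 0), (4, 1, 0)), ((0, 3, 0), (0, 3, 1))]
# Mirror image of triad_a: every x coordinate negated.
triad_a_mirror = [((0, 0, 0), (-1, 0, 0)), ((-4, 0, 0), (-4, 1, 0)), ((0, 3, 0), (0, 3, 1))]
# A much larger, unrelated triad.
triad_b = [((0, 0, 0), (1, 0, 0)), ((8, 0, 0), (8, 1, 0)), ((0, 9, 0), (0, 9, 1))]

database = {
    "a_mirror": encode_triad(triad_a_mirror),
    "b": encode_triad(triad_b),
}
hits = neighbors(encode_triad(triad_a), database, radius=0.5)
print(hits)
```

In this sketch the query retrieves only the mirrored triad, since its distance vector coincides with that of the query, while the larger triad lies well outside the search radius. The radius of 0.5 is an arbitrary choice for the example, not a threshold taken from the article.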
