Strategies for the effective identification of remotely related sequences in multiple PSSM search approach
Author(s) - Gowri V.S., Tina K.G., Krishnadev O., Srinivasan N.
Publication year - 2007
Publication title - Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.21356
Subject(s) - false positive paradox , computer science , sequence (biology) , identification (biology) , matching (statistics) , sequence database , reference database , false positives and false negatives , data mining , artificial intelligence , pattern recognition (psychology) , mathematics , database , genetics , biology , statistics , botany , gene
Searches using position‐specific scoring matrices (PSSMs) are commonly used in remote homology detection procedures such as PSI‐BLAST and RPS‐BLAST. A PSSM is typically generated using one of the sequences of a family as the reference sequence. In PSI‐BLAST searches, the reference sequence is the same as the query. We have recently shown that searches against a database of multiple family‐profiles, in which each member of the family is used in turn as a reference sequence, are more effective than searches against the classical database of single family‐profiles. Despite a relatively better overall performance compared with common sequence‐profile matching procedures, searches against the multiple family‐profiles database still produce a few false positives and false negatives. Here we show that profile length and the divergence of the sequences used to construct a PSSM have a major influence on the performance of the multiple‐profile‐based search approach. We also find that a simple parameter, defined as the number of PSSMs of a family that are hit by a query divided by the total number of PSSMs in that family, can effectively distinguish true positives from false positives in the multiple‐profiles search approach. Proteins 2007. © 2007 Wiley‐Liss, Inc.
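
The parameter described in the abstract (PSSMs of a family hit by a query, divided by the total number of PSSMs in that family) can be illustrated with a minimal Python sketch. The function name `family_hit_fractions`, the input layout (pairs of family and PSSM identifiers), and the example identifiers are assumptions made for illustration only; the abstract does not specify a data format or a cutoff value, so none is applied here.

```python
from collections import defaultdict


def family_hit_fractions(hits, family_sizes):
    """Compute, per family, the fraction of its PSSMs hit by one query.

    hits: iterable of (family_id, pssm_id) pairs reported for the query
          (hypothetical layout, not taken from the paper).
    family_sizes: dict mapping family_id to the total number of PSSMs
                  built for that family.
    """
    hit_pssms = defaultdict(set)
    for family_id, pssm_id in hits:
        # A set ensures each PSSM is counted once, even if it is reported
        # several times for the same query.
        hit_pssms[family_id].add(pssm_id)

    return {
        family_id: len(pssms) / family_sizes[family_id]
        for family_id, pssms in hit_pssms.items()
    }


# Illustrative input: identifiers and counts are invented for the example.
example_hits = [
    ("famA", "p1"),
    ("famA", "p2"),
    ("famA", "p1"),  # duplicate report of the same PSSM
    ("famB", "p9"),
]
example_sizes = {"famA": 4, "famB": 10}
fractions = family_hit_fractions(example_hits, example_sizes)
```

In this invented example, the query hits two of the four PSSMs of `famA` and one of the ten PSSMs of `famB`. Under the abstract's reasoning, a family whose PSSMs are hit in a larger proportion is more likely to be a true positive; where to draw the line between the two is not stated in the abstract and is left open here.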
