PseqIP: A nonredundant and exhaustive protein sequence data bank generated from 4 major existing collections
Author(s) - Claverie Jean Michel, Bricault Laurence
Publication year - 1986
Publication title - Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.340010110
Subject(s) - ascii , data bank , protein data bank , computer science , sequence (biology) , protein sequencing , data file , sequence database , data mining , computation , sequence alignment , algorithm , sequence analysis , peptide sequence , protein structure , biology , database , genetics , programming language , telecommunications , biochemistry , gene
Four major protein sequence data collections (NBRF‐PIR, PSD‐Kyoto, PGtrans, and NEWAT) have been merged into a single nonredundant data bank called PseqIP. The data bank entries were automatically matched by a heuristic computer program relying on the fast computation of the number of tetrapeptides shared by two sequences. PseqIP 1.0 includes 6,068 different protein sequences for a total of 1,357,067 residues, representing most of the available sequence information to date. During the course of this work, we found about 600 occurrences of a protein sequence recorded with a one‐amino‐acid variation in at least two different data banks. A flat file (ASCII computer‐readable format) version of PseqIP 1.0, well‐suited for exhaustive homology searches and statistical sequence analysis, is available from our laboratory.
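
The abstract does not give the details of the matching heuristic, so the following is only a minimal sketch of the general idea of counting tetrapeptides shared by two sequences. It assumes that distinct overlapping 4-residue words are compared as sets; the function names and the `min_fraction` threshold are hypothetical and are not taken from the paper.

```python
def tetrapeptides(seq: str) -> set[str]:
    """Return the set of distinct overlapping 4-residue words in seq."""
    seq = seq.upper()
    # A sequence shorter than 4 residues yields an empty set.
    return {seq[i:i + 4] for i in range(len(seq) - 3)}


def shared_tetrapeptides(a: str, b: str) -> int:
    """Count the distinct tetrapeptides present in both sequences."""
    return len(tetrapeptides(a) & tetrapeptides(b))


def likely_same_protein(a: str, b: str, min_fraction: float = 0.9) -> bool:
    """Flag two entries as candidate duplicates.

    Assumption (not from the paper): two entries match when the shared
    tetrapeptides cover at least min_fraction of the smaller word set.
    """
    ta, tb = tetrapeptides(a), tetrapeptides(b)
    smaller = min(len(ta), len(tb))
    if smaller == 0:
        return False
    return len(ta & tb) / smaller >= min_fraction
```

A single residue substitution changes at most four overlapping tetrapeptides, so for sequences of realistic length a one-amino-acid variant still shares nearly all of its words with the original. That property is one plausible reason such a count can surface near-duplicate entries across data banks.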
