Implementation and application of a versatile clustering tool for tandem mass spectrometry data
Author(s) - Flikka Kristian, Meukens Jeroen, Helsens Kenny, Vandekerckhove Joël, Eidhammer Ingvar, Gevaert Kris, Martens Lennart
Publication year - 2007
Publication title - Proteomics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.26
H-Index - 167
eISSN - 1615-9861
pISSN - 1615-9853
DOI - 10.1002/pmic.200700160
Subject(s) - cluster analysis, redundancy (engineering), computer science, fragmentation (computing), mass spectrum, data mining, tandem mass spectrometry, mass spectrometry, software, algorithm, chemistry, artificial intelligence, chromatography, programming language, operating system
High‐throughput proteomics experiments typically generate large amounts of peptide fragmentation mass spectra during a single experiment. There is often a substantial amount of redundant fragmentation of the same precursors among these spectra, which is usually considered a nuisance. We here discuss the potential of clustering and merging redundant spectra to turn this redundancy into a useful property of the dataset. To this end, we have created the first general‐purpose, freely available open‐source software application for clustering and merging MS/MS spectra. The application also introduces a novel approach to calculating the similarity of fragmentation mass spectra that takes into account the increased precision of modern mass spectrometers, and we suggest a simple but effective improvement to single‐linkage clustering. The application and the novel algorithms are applied to several real‐life proteomic datasets and the results are discussed. An analysis of the influence of the different algorithms available and their parameters is given, as well as a number of important applications of the overall approach.
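As a rough illustration of the general idea of clustering redundant spectra, the sketch below groups spectra by single-linkage clustering over a pairwise similarity score. It is not the application's implementation: the similarity used here is a plain shared-peak fraction rather than the novel measure the abstract refers to, and the suggested improvement to single linkage is not reproduced. The function names, the representation of a spectrum as a sorted list of m/z values, and the `tol` and `threshold` defaults are all assumptions chosen for the example.

```python
from itertools import combinations


def shared_peak_similarity(a, b, tol=0.5):
    """Fraction of peaks matched within `tol` m/z units.

    `a` and `b` are sorted lists of fragment m/z values (hypothetical
    representation). The score is normalised by the larger spectrum.
    """
    i = j = matched = 0
    while i < len(a) and j < len(b):
        diff = a[i] - b[j]
        if abs(diff) <= tol:
            # Peaks agree within tolerance: count the match, advance both.
            matched += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1
        else:
            j += 1
    return matched / max(len(a), len(b), 1)


def single_linkage_clusters(spectra, threshold=0.6, tol=0.5):
    """Plain single-linkage clustering: two spectra end up in the same
    cluster if a chain of pairwise similarities >= threshold connects them.
    Implemented with a small union-find structure."""
    parent = list(range(len(spectra)))

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(len(spectra)), 2):
        if shared_peak_similarity(spectra[i], spectra[j], tol) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for idx in range(len(spectra)):
        clusters.setdefault(find(idx), []).append(idx)
    return sorted(clusters.values())


# Toy data (invented for illustration): spectra 0, 1 and 3 share most peaks.
spectra = [
    [100.0, 200.0, 300.0],
    [100.1, 200.2, 300.1],
    [150.0, 250.0, 350.0],
    [100.2, 200.1, 299.9, 400.0],
]
print(single_linkage_clusters(spectra))
```

A practical tool would typically also restrict comparisons to spectra with similar precursor masses and merge each cluster into a single representative spectrum; both steps are left out here to keep the sketch short.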