
Fast, axis-agnostic, dynamically summarized storage and retrieval for mass spectrometry data
Author(s) -
Kyle Handy,
Jeb Rosen,
André Gillan,
Rob Smith
Publication year - 2017
Publication title -
plos one
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.99
H-Index - 332
ISSN - 1932-6203
DOI - 10.1371/journal.pone.0188059
Subject(s) - computer science , automatic summarization , information retrieval , computer data storage , data retrieval , mass spectrometry , data mining , database , computer hardware , chemistry , chromatography
Mass spectrometry, a popular technique for elucidating the molecular contents of experimental samples, creates data sets comprised of millions of three-dimensional (m/z, retention time, intensity) data points that correspond to the types and quantities of analyzed molecules. Open and commercial MS data formats are arranged by retention time, creating latency when accessing data across multiple m/z. Existing MS storage and retrieval methods have been developed to overcome the limitations of retention time-based data formats, but do not provide certain features such as dynamic summarization and storage and retrieval of point meta-data (such as signal cluster membership), precluding efficient viewing applications and certain data-processing approaches. This manuscript describes MzTree, a spatial database designed to provide real-time storage and retrieval of dynamically summarized standard and augmented MS data with fast performance in both m/z and RT directions. Performance is reported on real data with comparisons against related published retrieval systems.