Analysis of clinical flow cytometric immunophenotyping data by clustering on statistical manifolds: Treating flow cytometry data as high‐dimensional objects
Author(s) - Finn William G., Carter Kevin M., Raich Raviv, Stoolman Lloyd M., Hero Alfred O.
Publication year - 2009
Publication title - Cytometry Part B: Clinical Cytometry
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.646
H-Index - 61
eISSN - 1552-4957
pISSN - 1552-4949
DOI - 10.1002/cyto.b.20435
Subject(s) - kullback–leibler divergence, cluster analysis, computer science, pattern recognition (psychology), context (archaeology), flow cytometry, divergence (linguistics), hierarchical clustering, information geometry, entropy (arrow of time), parametric statistics, artificial intelligence, mathematics, data mining, statistics, physics, biology, geometry, paleontology, linguistics, philosophy, genetics, scalar curvature, curvature, quantum mechanics
Background: Clinical flow cytometry typically involves the sequential interpretation of two‐dimensional histograms, usually culled from six or more cellular characteristics, following initial selection (gating) of cell populations based on a different subset of these characteristics. We examined the feasibility of instead treating gated n‐parameter clinical flow cytometry data as objects embedded in n‐dimensional space using principles of information geometry via a recently described method known as Fisher Information Non‐parametric Embedding (FINE).

Methods: After initial selection of relevant cell populations through an iterative gating strategy, we converted four‐color (six‐parameter) clinical flow cytometry datasets into six‐dimensional probability density functions, and calculated differences among these distributions using the Kullback‐Leibler divergence (a measurement of relative distributional entropy shown to be an appropriate approximation of Fisher information distance in certain types of statistical manifolds). Neighborhood maps based on Kullback‐Leibler divergences were projected onto two‐dimensional displays for comparison.

Results: These methods resulted in the effective unsupervised clustering of cases of acute lymphoblastic leukemia from cases of expansion of physiologic B‐cell precursors (hematogones) within a set of 54 patient samples.

Conclusions: The treatment of flow cytometry datasets as objects embedded in high‐dimensional space (as opposed to sequential two‐dimensional analyses) harbors the potential for use as a decision‐support tool in clinical practice or as a means for context‐based archiving and searching of clinical flow cytometry data based on high‐dimensional distribution patterns contained within stored list mode data. Additional studies will be needed to further test the effectiveness of this approach in clinical practice.

© 2008 Clinical Cytometry Society
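
The pipeline outlined in the Methods (density estimation per sample, pairwise Kullback‐Leibler divergences, projection to two dimensions) can be illustrated with a minimal sketch. This is not the authors' FINE implementation: the Gaussian kernel density estimate with a fixed bandwidth, the Monte Carlo divergence estimate, the square root of the symmetrized divergence as a dissimilarity, the use of classical multidimensional scaling for the projection, and the synthetic six‐parameter data are all assumptions made for illustration. All function names are hypothetical.

```python
import numpy as np

def log_kde(points, samples, bandwidth):
    """Log of a Gaussian kernel density estimate built on `samples`, evaluated at `points`."""
    n, d = samples.shape
    # Squared Euclidean distance between every evaluation point and every sample.
    sq = ((points[:, None, :] - samples[None, :, :]) ** 2).sum(axis=2)
    log_k = -0.5 * sq / bandwidth**2
    # Log-sum-exp over kernels for numerical stability.
    m = log_k.max(axis=1, keepdims=True)
    log_sum = m[:, 0] + np.log(np.exp(log_k - m).sum(axis=1))
    log_norm = np.log(n) + 0.5 * d * np.log(2 * np.pi * bandwidth**2)
    return log_sum - log_norm

def kl_divergence(p_samples, q_samples, bandwidth=1.0):
    """Monte Carlo estimate of KL(p || q), averaging the log density ratio over samples of p."""
    log_p = log_kde(p_samples, p_samples, bandwidth)
    log_q = log_kde(p_samples, q_samples, bandwidth)
    return float(np.mean(log_p - log_q))

def dissimilarity_matrix(datasets, bandwidth=1.0):
    """Pairwise dissimilarities: square root of the symmetrized KL divergence (an assumption)."""
    k = len(datasets)
    D = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            sym = (kl_divergence(datasets[i], datasets[j], bandwidth)
                   + kl_divergence(datasets[j], datasets[i], bandwidth))
            # Estimates can dip slightly below zero; clamp before the square root.
            D[i, j] = D[j, i] = np.sqrt(max(sym, 0.0))
    return D

def classical_mds(D, dims=2):
    """Classical multidimensional scaling of a dissimilarity matrix into `dims` dimensions."""
    k = D.shape[0]
    J = np.eye(k) - np.ones((k, k)) / k      # centering matrix
    B = -0.5 * J @ (D**2) @ J                # double-centered squared dissimilarities
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:dims]    # largest eigenvalues first
    return vecs[:, order] * np.sqrt(np.clip(vals[order], 0.0, None))

# Synthetic stand-in for gated six-parameter list mode data (not real patient data):
# three samples drawn around the origin and three drawn around a shifted mean.
rng = np.random.default_rng(0)
group_a = [rng.normal(0.0, 1.0, size=(200, 6)) for _ in range(3)]
group_b = [rng.normal(1.5, 1.0, size=(200, 6)) for _ in range(3)]
datasets = group_a + group_b

D = dissimilarity_matrix(datasets)
embedding = classical_mds(D, dims=2)   # one 2-D point per sample
```

Each row of `embedding` places one sample in the plane, so samples with similar six‐dimensional distributions land near each other. The fixed bandwidth and the choice of projection are the main tuning decisions in a sketch like this and would need to be revisited for real data.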