Use of cluster separation indices and the influence of outliers: application of two new separation indices, the modified silhouette index and the overlap coefficient to simulated data and mouse urine metabolomic profiles
Author(s) -
Dixon Sarah J.,
Heinrich Nina,
Holmboe Maria,
Schaefer Michele L.,
Reed Randall R.,
Trevejo Jose,
Brereton Richard G.
Publication year - 2009
Publication title -
Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.1189
Subject(s) - silhouette, outlier, robustness (evolution), pattern recognition (psychology), artificial intelligence, chemistry, computer science, chromatography, mathematics, biochemistry, gene
To quantify how well classes are separated, four indices are compared: the Davies-Bouldin index, the silhouette width, and two new approaches described in this paper, namely the modified silhouette width index, based on the proportion of objects with a positive silhouette width, and the Overlap Coefficient. Four groups of simulated datasets are described, each consisting of 15 datasets with varying degrees of overlap; the groups differ in the nature of their outliers. Three experimental datasets are also described, consisting of gas chromatography mass spectrometry profiles of extracts from mouse urine, obtained to study the effect of environmental (stress), physiological (diet) and developmental (age) factors on metabolic profiles. The paper discusses the robustness of each index to outliers and how well each index allows class separation to be assessed. The two new indices protect against the influence of outliers. Copyright © 2008 John Wiley & Sons, Ltd.
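
As a rough illustration of the idea behind the modified silhouette index, the sketch below computes the standard per-object silhouette width for labelled data and then takes the proportion of objects whose width is positive. The abstract only states that the modified index is "based on" this proportion, so reading it as the plain proportion is an assumption, as are the function names `silhouette_widths` and `modified_silhouette_index`, the use of Euclidean distance, and the toy data. The Overlap Coefficient is not defined in the abstract and is not sketched here.

```python
import numpy as np


def silhouette_widths(X, labels):
    """Per-object silhouette widths for objects X with known class labels.

    Uses Euclidean distance (an assumption; the abstract does not name a metric).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Full pairwise Euclidean distance matrix.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    classes = np.unique(labels)
    s = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own < 2 or len(classes) < 2:
            continue  # width left at 0 for singletons or a single class
        # a: mean distance to the other members of the object's own class
        # (D[i, i] is 0, so dividing by n_own - 1 excludes the object itself).
        a = D[i, own].sum() / (n_own - 1)
        # b: smallest mean distance to the members of any other class.
        b = min(D[i, labels == c].mean() for c in classes if c != labels[i])
        denom = max(a, b)
        s[i] = (b - a) / denom if denom > 0 else 0.0
    return s


def modified_silhouette_index(s):
    """Proportion of objects with a positive silhouette width (assumed form)."""
    return float(np.mean(np.asarray(s) > 0))


# Toy data: two tight classes, plus one class-0 object lying inside class 1.
X = [[0, 0], [0, 1], [10, 0], [10, 1], [10, 0.5]]
labels = [0, 0, 1, 1, 0]
widths = silhouette_widths(X, labels)
mean_width = float(widths.mean())                 # classic silhouette summary
positive_share = modified_silhouette_index(widths)
```

In this toy case the single misplaced object receives a strongly negative width, which lowers the mean silhouette width noticeably, whereas the proportion-based summary counts it as just one object out of five. That is one plausible reading of why a proportion-based index would be less sensitive to outliers.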
