Linear Discriminative Star Coordinates for Exploring Class and Cluster Separation of High Dimensional Data

Wang Yunhai, Li Jingting, Nie Feiping, Theisel Holger, Gong Minglun, Lehmann Dirk J.



Publication year - 2017

Publication title - Computer Graphics Forum

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.578

H-Index - 120

eISSN - 1467-8659

pISSN - 0167-7055

DOI - 10.1111/cgf.13197

Subject(s) - discriminative model, computer science, linear subspace, projection (relational algebra), cluster analysis, outlier, class (philosophy), artificial intelligence, pattern recognition (psychology), dimension (graph theory), set (abstract data type), data mining, benchmark (surveying), data set, feature vector, algorithm, mathematics, geometry, geodesy, pure mathematics, programming language, geography

One main task for domain experts analysing their nD data is to detect and interpret class/cluster separations and outliers. An important question is which features/dimensions separate classes best or allow a cluster‐based data classification. Common approaches rely on projections from nD to 2D, which come with several challenges: (1) The space of projections contains an infinite number of candidates. How can the right one be found? (2) Projection approaches suffer from distortions and misleading effects. How far can the projected class/cluster separation be trusted? (3) The projections involve the complete set of dimensions/features. How can irrelevant dimensions be identified? To address these challenges, we introduce a visual analytics concept for feature selection based on linear discriminative star coordinates (DSC), which generate optimal cluster‐separating views in a linear sense for both labeled and unlabeled data. This way, the user is able to explore how each dimension contributes to clustering. To support the exploration of relations between clusters and data dimensions, we provide a set of cluster‐aware interactions that allow the user to iterate through subspaces of both records and features in a guided manner. We demonstrate our feature selection approach for optimal cluster/class separation analysis with several experiments on real‐life benchmark high‐dimensional data sets.
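The abstract does not say how DSC computes its axis vectors, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes classical Fisher linear discriminant analysis as a stand-in for the "optimal in a linear sense" criterion on labeled data. In a star coordinates plot, every dimension is assigned a 2D axis vector and each record is drawn at the sum of its feature values times those vectors, which is a single matrix product. The function names `star_coordinates` and `lda_axes`, the regularisation parameter `reg`, and the synthetic data are all hypothetical and chosen for illustration.

```python
import numpy as np


def star_coordinates(X, axes):
    """Place each record at the sum of (feature value * that feature's 2D axis vector)."""
    return np.asarray(X, dtype=float) @ axes


def lda_axes(X, y, reg=1e-6):
    """Return a d x 2 matrix of axis vectors from Fisher LDA.

    Assumption: Fisher LDA is used here only as an illustrative stand-in;
    the abstract does not describe the actual DSC optimisation.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    d = X.shape[1]
    overall_mean = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - overall_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    Sw += reg * np.eye(d)  # small ridge term keeps Sw invertible
    # Eigenvectors of Sw^{-1} Sb with the largest eigenvalues separate classes best.
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    top = np.argsort(-evals.real)[:2]
    return evecs[:, top].real


# Synthetic, made-up data: three classes in 5D that differ only in dimensions 0 and 1.
rng = np.random.default_rng(0)
centers = np.zeros((3, 5))
centers[1, 0] = 6.0
centers[2, 1] = 6.0
X = np.vstack([rng.normal(c, 1.0, size=(100, 5)) for c in centers])
y = np.repeat(np.arange(3), 100)

A = lda_axes(X, y)             # one 2D axis vector per dimension (rows of A)
P = star_coordinates(X, A)     # 2D positions of all records
importance = np.linalg.norm(A, axis=1)  # axis length per dimension
print(np.round(importance, 3))
```

In this sketch the length of each axis vector gives a rough indication of how much a dimension contributes to the separating view, so short axes are candidates for irrelevant dimensions. The unlabeled case and the cluster-aware interactions mentioned in the abstract are not covered.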
