Manifold learning and unwrapping using density ridges
Author(s) -
Shaker
Publication year - 2016
Language(s) - English
Resource type - Dissertations/theses
DOI - 10.17760/d20260369
Subject(s) - nonlinear dimensionality reduction , curvilinear coordinates , manifold alignment , manifold (fluid mechanics) , statistical manifold , cluster analysis , dimensionality reduction , ridge , euclidean space , pseudo riemannian manifold , curse of dimensionality , computer science , mathematics , artificial intelligence , geography , geometry , mathematical analysis , information geometry , curvature , ricci curvature , cartography , mechanical engineering , scalar curvature , engineering
Manifold Learning and Unwrapping Using Density Ridges by Matineh Shaker Doctor of Philosophy in Electrical Engineering Northeastern University, September 2016 Dr. Deniz Erdogmus, Adviser Manifold learning is used for determining a coordinate system for high dimensional data on its intrinsic low-dimensional manifold, in order to (approximately) unwrap the manifold to an isomorphic Euclidean space, or to reduce data dimensionality. Defining a global or local manifold on data also provides a base for data classification, clustering, and visualization. Research on manifold learning within a density ridge estimation framework has shown great potential in recent work for both estimation and denoising of manifolds, building on the intuitive and well-defined notion of principal curves and surfaces. However, the problem of unwrapping or unfolding manifolds has received relatively little attention within the density ridge approach, despite being an integral part of manifold learning in general. We seek natural curvilinear coordinate systems along the manifold, coordinate frames that are geometrically meaningful based on the data distribution. In this dissertation, two approaches are presented towards this end. In the first approach, we define the ridges of the (at least twice continuously differentiable) probability density function explicitly based on its gradient and Hessian, and in practice, demonstrate results using a kernel density estimate approximation based on samples drawn fromthis density. Specifically, a given point is projected to the ridges of the probability density by solving differential equations that follow ascent trajectories whose tangent vectors are gradient vector projected to subspaces spanned by all eigenvectors of the Hessian except one (the one that will become the tangent vector of the ridge to which the point is projected). Once the original point is projected to each ridge by designating each eigenvector as the ”tangent” once, the xii ridge projections are used as initial conditions to gradient-ascent differential equations. The curvelength of these latter trajectories, combined with information regarding direction of arrival to the mode, are used to construct a curvilinear coordinate vector for the original point with respect to the mode it climbs to in this process. A natural consequence of this process is that each point gets a curvilinear coordinate relative to the mode that contains the original point in its attraction basin (according to gradient-ascent), hence mode-based clusters are naturally endowed with local charts with curvilinear coordinate systems. We provide experimental results with both synthetic and real data sets, such as MNIST handwritten digits and Frey Faces. Manifold traversing for images of digits gives an approximation on the underlying smooth manifold representing geometric variations like orientation and thickness of digits in the two leading unwrapped dimensions. Results on synthetic datasets, such as crescent and two-moon, showed that the proposed curvilinear coordinates capture the underlying manifold of a mode more successfully compared to benchmark data-oriented methods available in the literature, illustrating our assertion that the ridges of the distribution and the local maximum point at the intersection of these curves, form a natural geometric skeleton for the curvilinear structure of the data distribution in the vicinity of this mode. The second approach aims to avoid solving differential equations that involve eigendecomposition of Hessians along every point in the associated trajectories. Instead, we recognize that an alternative characterization of critical surfaces and curves (which include ridge-, cliffside-, and valley-paths) is possible using the rank of a matrix sequence that has i nits columns powers of the Hessian multiplying the gradient at a point. The so-called Omega matrix sequence, then as a set of zero level-sets for their respective determinants points to d-dimensional critical sets. Approximating the determinant with a polynomial expansion yields a polynomial curvilinear coordinate system based on the factorization of the matrix determinant. For an exponential density model with polynomial bases in the exponent, the Omega matrix entries and its determinant become polynomials, in which case, the manifold unwrapping problem is reduced to one of polynomial factorization. We provide unwrapping results based on known and estimated probability density functions in the exponential family with polynomial bases. Experiments on synthetic and real data sets show that this procedure generates similar results to the ridge tracing method, but with polynomial warping of coordinates, as expected.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom