Joint work with Carrie Grimes, Google
Isometry:
http://www.stat.stanford.edu/~donoho/Reports/2002/IM01-0916.pdf
Recently, the Isomap procedure was proposed in a high-profile article
in
Science
as a new way to recover a low-dimensional parametrization of
data lying on a low-dimensional submanifold in
high-dimensional space. The method assumes that the submanifold,
viewed as a Riemannian submanifold of the ambient
high-dimensional space, is isometric to a convex subset of
Euclidean space. This naturally raises the question:
what datasets can reasonably be modeled by this condition?
In this paper, we consider a special kind of
image data: families of images generated by articulation of one or
several objects in a scene -- for example, images
of a black disk on a white background with center placed at a range of
locations.
The collection of all images in such an articulation family,
as the parameters of the articulation vary, makes up an
articulation manifold, a submanifold of $L^2$. We study the properties of
such articulation manifolds,
in particular, their lack of differentiability when the images have edges.
Under
these conditions, we show that there exists a natural
renormalization of geodesic distance which yields
a well-defined metric. We exhibit a list of
articulation models where the corresponding manifold equipped with
this new metric is indeed isometric to a convex
subset of Euclidean space. Examples include translations of a symmetric
object,
rotations of a closed set, articulations of a horizon, and expressions of
a cartoon
face.
The theoretical predictions from our study are borne out by empirical
experiments with published Isomap code. We also note that in the case
where several components of the image articulate independently,
isometry may fail; for example, with several disks in an image avoiding
contact, the underlying Riemannian manifold is locally isometric to an
open, connected, but not convex subset of
Euclidean space. Such a situation matches the assumptions of our
recently-proposed Hessian Eigenmaps procedure,
but not the original Isomap procedure.
Local Isometry:
http://www-stat.stanford.edu/~donoho/Reports/2003/HessianEigenmaps.pdf
In a recent article in {\it Science},
Roweis and Saul considered
the problem of recovering the parametrization
of an embedded surface in high-dimensional space,
given scattered data on the surface. They
proposed a methodology, Local Linear
Embedding (LLE), to attack this problem,
based on eigenanalysis of a certain quadratic
form associated with the dataset.
We describe a fundamental problem
with the whole idea, which is essentially
that of deriving embeddings
from eigenproblems based on the laplacian.
We then consider a new variational
problem based on the use of the
(Frobenius norm of the) Hessian
rather than Laplacian.
Perhaps surprisingly, we see that
Hessian LLE is able to solve the same class
of problems as another methodology
introduced in the same issue
of {\it Science}, namely ISOMAP.
Thus we are able to establish a novel
connection between two previously
independent lines of research.
Moreover, while ISOMAP requires
convexity of the underlying parameter
space, Hessian LLE does not; hence
Hessian LLE significantly expands the range of manifolds
for which we can recover isometric parametrizations.
We show how to translate this theoretical
work into an empirical algorithm able to solve
significantly larger-scale problems
than can be solved by ISOMAP.