Manifold Learning for the Sciences

Marina Meila
University of Washington
Statistics

Manifolds model wide classes of physical systems. This talk is concerned with fitting non-parametric manifold models to large scientific data sets, motivated by the need to relieve the scientist of the responsibility for parameter selection and model validation by visual inspection. I will present components of a coherent framework that allows a user to semi-automatically select the neighborhood scale, compute an embedding of the data, estimate and correct its distortions, and find physical interpretations of the embedding coordinates.

The last task is solved by a novel method called ManifoldLasso, which selects, from a dictionary of functions, a subset that can *non-linearly* parametrize the manifold.

The entire pipeline is implemented in megaman, an open-source Python package for scalable manifold learning.
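
As a rough illustration of the embedding step alone, the sketch below computes a spectral embedding from a Gaussian kernel whose bandwidth plays the role of the neighborhood scale. It is written in plain NumPy and does not use or reproduce the megaman API; the function name `laplacian_eigenmap`, the parameter `eps`, the density renormalization, and the toy circle data are all assumptions made for illustration. It builds dense matrices, so it suits only small data sets.

```python
import numpy as np


def laplacian_eigenmap(X, eps, n_components=2):
    """Embed the rows of X with eigenvectors of a renormalized graph Laplacian.

    Hypothetical helper for illustration; eps is the kernel bandwidth
    (the neighborhood scale), chosen by hand here.
    """
    # Pairwise squared Euclidean distances (clipped to avoid tiny negatives).
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)

    # Gaussian kernel matrix.
    K = np.exp(-d2 / eps**2)

    # Divide by the outer product of degrees to reduce the influence
    # of non-uniform sampling density.
    q = K.sum(axis=1)
    K1 = K / np.outer(q, q)

    # Symmetric normalization: S has the same eigenvalues as the
    # random-walk matrix D^{-1} K1, but is symmetric, so eigh applies.
    d = K1.sum(axis=1)
    S = K1 / np.sqrt(np.outer(d, d))

    # eigh returns eigenvalues in ascending order; sort descending.
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]
    vecs = vecs[:, order]

    # Map back to eigenvectors of the random-walk matrix and drop the
    # first one, which is constant and carries no geometric information.
    psi = vecs / np.sqrt(d)[:, None]
    return psi[:, 1:n_components + 1]


# Toy data: a noisy circle embedded in three dimensions.
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 2.0 * np.pi, 200)
X = np.column_stack([np.cos(t), np.sin(t), 0.01 * rng.standard_normal(200)])

Y = laplacian_eigenmap(X, eps=0.5)
print(Y.shape)
```

In this sketch the bandwidth is fixed by hand. Choosing it semi-automatically, and estimating and correcting the distortion of the resulting coordinates, are the parts of the framework the talk addresses and are not shown here.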


Joint work with Dominique Perrault-Joncas, James McQueen, Jacob VanderPlas, Zhongyue Zhang, Grace Telford, Yu-chia Chen, Samson Koelle, and Hanyu Zhang.

Workshop III: Geometry of Big Data