In many data analysis applications the following scenario is commonplace: we are given a point set that is supposed to sample a hidden ground truth domain K in a metric space (say a surface in R^3), but the data have been corrupted with noise, so that some of the points lie far away from K, creating outliers and ambient noise. One of the main goals of denoising algorithms is to eliminate such noise so that the curated data lie within a bounded Hausdorff distance of the hidden domain. Common denoising techniques often require the user to set several parameters and/or to choose an appropriate noise model, and theoretical guarantees are not always provided. In this talk, I will describe our recent progress on the denoising problem. Our goal is to understand which parameters are truly necessary in a denoising process, and to lighten the burden of choosing parameters as much as possible while maintaining theoretical guarantees. First, I will show that there exists an algorithm requiring only a single parameter under a sampling condition that is no more restrictive than the prevailing models; under such sampling conditions, this parameter cannot be avoided. I will then describe a simple algorithm that avoids even this parameter, at the cost of a slight, but not unrealistic, strengthening of the sampling condition. I will also provide some empirical evidence that our algorithms are effective in practice.
This is joint work with Mickael Buchet, Tamal Dey and Jiayuan Wang.
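For readers who want a concrete picture of what a single-parameter denoising procedure can look like, here is a minimal illustrative sketch in Python (using NumPy) of a k-distance-based decluttering heuristic in a similar spirit. The specific rule used here (process points by increasing k-distance and keep a point only if it is not covered by the k-distance ball of an already-kept point) is an assumption made for illustration and is not necessarily the algorithm presented in the talk.

# Illustrative single-parameter decluttering sketch.
# NOTE: the keep/discard rule below is a hypothetical heuristic for
# illustration only, not the algorithm from the talk.
import numpy as np

def k_distance(points, k):
    # Distance from each point to its k-th nearest neighbor (brute force).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Column 0 of each sorted row is the distance to the point itself (0),
    # so column k is the distance to the k-th nearest neighbor.
    return np.sort(dists, axis=1)[:, k]

def declutter(points, k):
    # Greedy decluttering: visit points in increasing order of k-distance
    # and keep a point only if no already-kept point lies within its
    # k-distance ball.
    dk = k_distance(points, k)
    kept = []
    for i in np.argsort(dk):
        p = points[i]
        covered = any(np.linalg.norm(p - points[j]) <= dk[i] for j in kept)
        if not covered:
            kept.append(i)
    return points[kept]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = rng.uniform(0, 2 * np.pi, 200)
    circle = np.c_[np.cos(t), np.sin(t)] + 0.02 * rng.normal(size=(200, 2))
    outliers = rng.uniform(-2, 2, size=(30, 2))
    noisy = np.vstack([circle, outliers])
    print(len(declutter(noisy, k=8)), "points kept of", len(noisy))

On the sample input above, points densely sampling the circle have small k-distance and are visited first, while sparse outliers have large k-distance and tend to be covered (and hence discarded) by the time they are reached; the output is a decluttered subsample concentrated near the circle.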