Feature selection methods

Isabelle Guyon
Clopinet

Variable and feature selection have become the focus of much research in areas of
application for which datasets with tens or hundreds of thousands of variables are
available. These areas include text processing of internet documents, gene expression
array analysis, and combinatorial chemistry. The objective of variable selection is
threefold: improving the prediction performance of predictors, providing faster and
more cost-effective predictors, and providing a better understanding of the underlying
process that generated the data. This tutorial will cover a wide range of aspects of such
problems: providing a better definition of the objective function, feature construction,
feature ranking, multivariate feature selection, efficient search methods, and feature
validity assessment methods.
Most feature selection methods do not attempt to uncover causal relationships between
features and the target, and focus instead on making the best predictions. We will examine
situations in which the knowledge of causal relationships benefits feature selection.
Such benefits may include: explaining relevance in terms of causal mechanisms,
distinguishing between actual features and experimental artifacts, predicting the
consequences of actions performed by external agents, and making predictions in
non-stationary environments.
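
As an illustration of the feature ranking mentioned above, here is a minimal sketch that
scores each feature by the absolute Pearson correlation between that feature and the
target, then sorts the features by score. The function names (pearson, rank_features) and
the toy data are hypothetical and chosen only for this example; the correlation criterion
is one common choice for ranking, not necessarily the one used in the tutorial.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # A constant feature (or target) carries no linear information: score it 0.
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(X, y):
    """Return (order, scores): feature indices sorted by decreasing |correlation|."""
    n_features = len(X[0])
    scores = [
        abs(pearson([row[j] for row in X], y))
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data (made up for illustration): rows are samples, columns are features.
# Feature 0 tracks the target closely, feature 1 is constant, feature 2 is noise.
X = [
    [1.0, 5.0, 0.3],
    [2.0, 5.0, 0.9],
    [3.0, 5.0, 0.1],
    [4.0, 5.0, 0.7],
    [5.0, 5.0, 0.4],
]
y = [1.1, 1.9, 3.2, 3.9, 5.1]

order, scores = rank_features(X, y)
print("ranking:", order)
print("scores:", [round(s, 3) for s in scores])
```

A ranking of this kind scores each feature in isolation, so it can miss features that are
useful only in combination with others. That limitation is one reason multivariate feature
selection is treated as a separate topic.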

Back to Summer School: Mathematics in Brain Imaging