Copy number variations, a frequent and widespread feature of cancer genomes, can be studied in detail using microarray techniques. Initial studies have identified prognostic biomarkers and cancer genes. To exploit the discovery potential of multi-patient copy number data sets more fully, we developed a procedure that starts from a complex pattern of overlapping copy number events and delineates statistically important regions of recurrent copy number aberration in cancer. These regions, dubbed epicenters, permit a drastic compression of copy number data. We demonstrate that epicenters are enriched in genes and are likely to contain cancer genes in breast, lung and colon cancer. This is confirmed by direct observation of known, important cancer genes. I will discuss applications of epicenters, such as using them as predictors for subtyping cancers and as markers of progression.
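To give a feel for what "starting from a complex pattern of overlapping copy number events" can mean in practice, here is a minimal sketch that finds the sub-regions covered by the largest number of events on one chromosome. It is not the epicenter procedure described in the abstract and includes no statistical test. The function name `peak_overlap_regions`, the half-open `(start, end)` interval encoding, and the example coordinates are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical encoding: one half-open interval [start, end) per copy number
# event, in base-pair coordinates on a single chromosome.
Interval = Tuple[int, int]


def peak_overlap_regions(events: List[Interval]) -> Tuple[int, List[Interval]]:
    """Return the maximum overlap depth and the sub-regions that reach it.

    Illustrative sweep-line count only; it does not assess significance.
    """
    boundaries = []
    for start, end in events:
        if start >= end:
            raise ValueError(f"empty or inverted interval: {(start, end)}")
        boundaries.append((start, 1))   # an event opens here
        boundaries.append((end, -1))    # an event closes here

    # Tuples sort by position first; at equal positions -1 sorts before +1,
    # so intervals that merely touch are not counted as overlapping.
    boundaries.sort()

    depth = 0
    best_depth = 0
    best_regions: List[Interval] = []
    for i, (pos, delta) in enumerate(boundaries[:-1]):
        depth += delta
        next_pos = boundaries[i + 1][0]
        # Only evaluate once every boundary at this position is processed.
        if next_pos == pos or depth == 0:
            continue
        if depth > best_depth:
            best_depth = depth
            best_regions = [(pos, next_pos)]
        elif depth == best_depth:
            best_regions.append((pos, next_pos))
    return best_depth, best_regions


# Made-up events, e.g. one amplified segment from each of four patients.
example_events = [(100, 500), (300, 700), (400, 600), (900, 1000)]
print(peak_overlap_regions(example_events))  # -> (3, [(400, 500)])
```

In this toy input, three of the four events share the segment from 400 to 500, so that segment is reported as the peak. A real analysis would also need to handle gains and losses separately, multiple chromosomes, and a null model for deciding which peaks are statistically meaningful.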
Back to Workshop IV: Search and Knowledge Building for Biological Datasets