We explore applications made possible by scalable clustering algorithms. We compare deterministic and non-deterministic clusterers, then explore how they can be used to advantage in training support vector machines, in handling data sets too large to fit in memory, in fast approximate latent semantic analysis, and in streaming data applications.