The talk will describe how science is changing as a result of the vast amounts of data we are collecting, from gene sequencers to telescopes and supercomputers. This “Fourth Paradigm of Science”, predicted by Jim Gray, is moving at full speed and is transforming one scientific area after another. The talk will present examples of the similarities among these emerging challenges and of how the scientific community is realizing this vision. Scientists are increasingly limited by their ability to analyze the large amounts of complex data available. These data sets are generated not only by instruments but also by computational experiments; the largest numerical simulations now produce data on par with what instruments collect, crossing the petabyte threshold. Large synthetic data sets are becoming increasingly important as scientists compare their experiments to reference simulations. All disciplines need a new “instrument for data” that can deal not only with large data sets but with the cross product of large and diverse data sets. This transition raises several multi-faceted challenges, e.g. how to move, visualize, analyze, and in general interact with petabytes of data. The talk will outline how all of these will have to change as we move towards exascale computers.