Mass spectrometry is an invaluable tool for proteomic research because mass spectra are rich in information: a single sample can furnish thousands of mass measurements. The richness and complexity of mass spectral information have led to the use of a wide variety of data-mining tools. However, this abundance of information is accompanied by a curse of dimensionality: the likelihood of finding random correlations with the experimental hypothesis increases with the number of mass measurements. In addition, hidden correlations arising from differences in the isolation and purification of biological samples may be present in the mass spectra.
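The growth of chance correlations with the number of measurements can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the features are pure Gaussian noise standing in for mass measurements, the class labels are arbitrary, and the sample size, checkpoints, and function names are hypothetical choices rather than anything taken from the presentation.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def running_max_chance_correlation(n_samples, checkpoints, seed=0):
    """Track the largest |r| between class labels and pure-noise features.

    Every feature is random noise, so any correlation with the labels is
    spurious. The running maximum is recorded at each checkpoint count.
    """
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n_samples)]  # assumed: two balanced classes
    best = 0.0
    recorded = {}
    for count in range(1, max(checkpoints) + 1):
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        best = max(best, abs(pearson(labels, noise)))
        if count in checkpoints:
            recorded[count] = best
    return recorded


# Hypothetical setting: 20 samples, up to 1000 simulated mass measurements.
results = running_max_chance_correlation(20, checkpoints={10, 100, 1000})
for n_features in sorted(results):
    print(f"{n_features:5d} noise features: max |r| = {results[n_features]:.2f}")
```

Because the maximum is taken over an ever larger set of noise features, the strongest spurious correlation can only stay the same or grow as more measurements are added, which is the effect the curse of dimensionality describes.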
This presentation will focus on validating the analysis of complex samples when mass spectrometry is the measurement method. Principles of experimental design will be introduced, followed by the concept of maintaining a mechanism of inference. A mechanism of inference allows the scientist to elucidate the relationship between the experimental hypothesis (e.g., differentiating disease cells from controls) and the measurement (e.g., the difference in ion abundance of a biomarker protein) so that causal relationships can be established. Tools exist for probing complex models (e.g., neural networks) and proprietary software packages to determine their mechanism of inference. Because samples from biological procedures may be costly and difficult to obtain, bootstrapping methods will be discussed that furnish bounds on results that span sample-handling and instrumental variations.
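As one possible illustration of how bootstrapping furnishes bounds from a small sample set, the sketch below computes a percentile bootstrap interval for the difference in mean ion abundance between two groups. It is a minimal sketch, not the method used in the presentation: the abundance values are invented for illustration, the function name and parameters are hypothetical, and the percentile interval is only one of several bootstrap variants.

```python
import random
import statistics


def bootstrap_mean_difference(disease, control, n_resamples=2000,
                              alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(disease) - mean(control)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping group sizes fixed.
        d = rng.choices(disease, k=len(disease))
        c = rng.choices(control, k=len(control))
        diffs.append(statistics.fmean(d) - statistics.fmean(c))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Invented ion abundances (arbitrary units) for a hypothetical biomarker.
disease = [12.1, 13.4, 11.8, 14.0, 12.9, 13.7, 12.5, 13.1]
control = [10.2, 11.0, 9.8, 10.9, 10.4, 11.3, 10.1, 10.7]

lower, upper = bootstrap_mean_difference(disease, control)
print(f"95% bootstrap interval for the mean difference: [{lower:.2f}, {upper:.2f}]")
```

To make the bounds span sample-handling and instrumental variation, the resampled units would need to be independent preparations or runs rather than repeated scans of a single preparation; that choice depends on the experimental design and is not captured in this sketch.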