The theory of assimilation is now well established on the basis of Bayesian estimation. As such, assimilation requires the a priori
specification, either explicit or implicit, of the probability distributions of the errors affecting the various data to be assimilated
(i.e., the observations proper and the assimilating model). Most present assimilation algorithms can be described as particular applications of
Best Linear Unbiased Estimation, or BLUE, and as such require the specification of only the first- and second-order statistical moments of
the errors (i.e., expectations and covariances). Even nonlinear algorithms, such as ensemble sequential assimilation, most often use only
expectations and covariances of data errors.
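As a minimal illustration of the BLUE case, the sketch below assumes a single scalar quantity estimated from one background value and one direct observation, with unbiased and mutually uncorrelated errors. The function name `blue_scalar` and its arguments are hypothetical, introduced here only for illustration. It shows that only the two error variances are needed to compute the estimate.

```python
def blue_scalar(xb, var_b, y, var_o):
    """Scalar BLUE of a quantity from a background xb and an observation y.

    Assumption (not from the text): both errors are unbiased and mutually
    uncorrelated, with variances var_b (background) and var_o (observation).
    """
    gain = var_b / (var_b + var_o)   # weight given to the innovation
    innovation = y - xb              # observation minus background
    xa = xb + gain * innovation      # analysis, i.e. the BLUE estimate
    var_a = (1.0 - gain) * var_b     # variance of the analysis error
    return xa, var_a


# Illustrative values only: the observation is more accurate than the background.
xa, var_a = blue_scalar(xb=10.0, var_b=4.0, y=12.0, var_o=1.0)
```

With these illustrative values the estimate lies closer to the observation than to the background, and its error variance is smaller than either input variance.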
Objective assessment of the quality of an assimilation algorithm can only be performed by comparison against unbiased and statistically independent data.
However, the unbiasedness and the independence have to be assumed, and cannot be objectively verified.
Another type of validation can bear on the specification of the error statistics. Within assimilation itself, the only objective source of
information on the data error lies in the overdeterminacy of the data, i.e., in the innovation vector. The only possible objective check of whether the error statistics have been properly specified is therefore a comparison
of the a priori specified and the a posteriori observed statistics of the innovation. That can be done on the innovation itself, or equivalently on the difference between the assimilated fields and the data that have been used in the assimilation. However, the number of degrees of freedom required for defining the error statistics is larger than the number of
parameters required for actually performing the assimilation, with the consequence that the problem of estimating the error statistics from the innovation statistics is totally undetermined. Independent hypotheses,
which cannot be validated within the assimilation procedure itself, are always necessary. A number of diagnostics, which bear directly or
indirectly on the innovation, are presented and discussed. They include in particular the classical so-called χ²-test. The link of those diagnostics with the informative content of the various data is discussed, as well as a number of recent applications.
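The χ²-type check on the innovation, and its inability to separate the individual error statistics, can be sketched in the same scalar setting. This is a simulation under stated assumptions, not a description of any operational diagnostic: the errors are taken as Gaussian, unbiased and uncorrelated, the observation operator is the identity, and the function name `mean_normalized_innovation` is hypothetical. If the specified variances are consistent with the true ones, the squared innovation divided by the specified innovation variance should average to about 1.

```python
import random


def mean_normalized_innovation(var_b_true, var_o_true, var_b_spec, var_o_spec,
                               n=20000, seed=0):
    """Mean of d**2 / (var_b_spec + var_o_spec) over n simulated innovations.

    Assumption (not from the text): scalar state, identity observation
    operator, Gaussian unbiased uncorrelated errors, so that the expected
    innovation variance is var_b_true + var_o_true.
    """
    rng = random.Random(seed)
    spec = var_b_spec + var_o_spec           # a priori innovation variance
    total = 0.0
    for _ in range(n):
        e_b = rng.gauss(0.0, var_b_true ** 0.5)   # background error
        e_o = rng.gauss(0.0, var_o_true ** 0.5)   # observation error
        d = e_o - e_b                             # innovation
        total += d * d / spec
    return total / n


# Specified statistics equal to the true ones: the mean should be close to 1.
consistent = mean_normalized_innovation(4.0, 1.0, 4.0, 1.0)
# Variances swapped between background and observation: same sum, same result.
swapped = mean_normalized_innovation(4.0, 1.0, 1.0, 4.0)
# Innovation variance underestimated: the mean should be well above 1.
underestimated = mean_normalized_innovation(4.0, 1.0, 1.0, 1.0)
```

The swapped case passes the check exactly as well as the consistent one, because the innovation constrains only the sum of the two variances. Attributing that sum to background or observation error requires an independent hypothesis.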
In the linear case, an objective test of the optimality of an assimilation system can be based on the fact that optimality is equivalent to
decorrelation between the innovation and the estimation error. That can be checked against independent data. Relatively few studies have been carried out so far along that line; these are presented and discussed.
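The decorrelation criterion can be illustrated in the same scalar setting. The sketch below is a simulation under assumptions that are not in the text: Gaussian, unbiased, uncorrelated errors, an identity observation operator, and a hypothetical function name `innovation_error_covariance`. The estimation error is available here only because the data are simulated; in practice it would have to be evaluated against independent data.

```python
import random


def innovation_error_covariance(gain, var_b, var_o, n=20000, seed=1):
    """Sample covariance between the innovation and the analysis error.

    Assumption (not from the text): scalar state, identity observation
    operator, Gaussian unbiased uncorrelated errors. Both quantities have
    zero mean by construction, so the covariance is the mean of the product.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        e_b = rng.gauss(0.0, var_b ** 0.5)   # background error
        e_o = rng.gauss(0.0, var_o ** 0.5)   # observation error
        d = e_o - e_b                        # innovation
        e_a = e_b + gain * d                 # error of the linear estimate
        total += d * e_a
    return total / n


var_b, var_o = 4.0, 1.0
optimal_gain = var_b / (var_b + var_o)       # BLUE gain for this setting
cov_optimal = innovation_error_covariance(optimal_gain, var_b, var_o)
cov_suboptimal = innovation_error_covariance(0.5, var_b, var_o)
```

In this setting the expected covariance is gain * (var_b + var_o) - var_b, which vanishes only for the BLUE gain. A clearly nonzero sample covariance therefore signals a suboptimal weighting of the data.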