The CDF is a data processing method (Ceccherini et al., 2015) [RD1] that combines several independent measurements of an atmospheric vertical profile retrieved with the optimal estimation method (Rodgers, 2000) [RD7]. Suppose we have N independent, simultaneous measurements of the vertical profile of an atmospheric parameter, all referred to the same geolocation. Performing the retrieval on each of the N measurements, we obtain N vectors [katex]\hat{x}_i \; (i = 1, 2, \dots, N)[/katex] that provide independent estimates of the profile on a generic vertical grid. Each vector [katex]\hat{x}_i[/katex] is characterized by the covariance matrix (CM) of the noise error [katex]S_{ni}[/katex] and by the averaging kernel matrix (AKM) [katex]A_i[/katex].
The CDF solution is obtained by minimizing the following cost function:
[katex] c(\mathbf{x}) = \sum_{i=1}^{N} (\mathbf{a}_i - \mathbf{A}_i \mathbf{x})^{T} \mathbf{S}_{ni}^{-1} (\mathbf{a}_i - \mathbf{A}_i \mathbf{x}) + (\mathbf{x} - \mathbf{x}_a)^{T} \mathbf{S}_{a}^{-1} (\mathbf{x} - \mathbf{x}_a) [/katex]
where [katex]x_a[/katex] and [katex]S_a[/katex] are, respectively, the a priori profile and the a priori CM used to constrain the fused profile, [katex]x_{ai}[/katex] is the a priori profile used in the i-th retrieval, and
[katex]a_i = \hat{x}_i - x_{ai} + A_i x_{ai}[/katex]
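
As a minimal sketch of the two definitions above, the following NumPy code builds the vector [katex]a_i[/katex] and evaluates the cost function for a candidate profile. The function names `alpha_vector` and `cdf_cost` and the toy numbers are illustrative assumptions, not part of any reference implementation, and all CMs are assumed to be invertible here.

```python
import numpy as np

def alpha_vector(x_hat, x_ai, A):
    """a_i = x_hat_i - x_ai + A_i x_ai for a single retrieval."""
    return x_hat - x_ai + A @ x_ai

def cdf_cost(x, a_list, A_list, Sn_list, x_a, S_a):
    """Evaluate c(x): one weighted residual term per measurement plus the a priori term."""
    cost = 0.0
    for a, A, Sn in zip(a_list, A_list, Sn_list):
        r = a - A @ x
        cost += r @ np.linalg.solve(Sn, r)   # r^T Sn^-1 r without forming Sn^-1
    d = x - x_a
    cost += d @ np.linalg.solve(S_a, d)      # (x - x_a)^T S_a^-1 (x - x_a)
    return float(cost)

# Toy single-measurement case on a two-level grid (made-up numbers).
x_hat = np.array([1.0, 2.0])      # retrieved profile
x_ai = np.array([0.5, 0.5])       # a priori profile used by the retrieval
A = 0.5 * np.eye(2)               # averaging kernel matrix
a = alpha_vector(x_hat, x_ai, A)
cost = cdf_cost(np.zeros(2), [a], [A], [np.eye(2)], np.zeros(2), np.eye(2))
```

Using `np.linalg.solve` rather than an explicit inverse is a design choice of this sketch; it only works when [katex]S_{ni}[/katex] and [katex]S_a[/katex] are non-singular.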

CDF in perfect coincidence and on a common vertical grid

The original CDF solution (Ceccherini et al., 2015) is given by:
[katex]x_f = \left( \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i + S_a^{-1} \right)^{-1} \left( \sum_{i=1}^{N} A_i^T S_{ni}^{-1} a_i + S_a^{-1} x_a \right)[/katex]
and is characterized by the AKM [katex]A_f[/katex] and by the CMs of the noise error [katex]S_{nf}[/katex], of the smoothing error [katex]S_{sf}[/katex] and of the total error [katex]S_{f}[/katex], given by:
[katex]A_f = \left( \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i + S_a^{-1} \right)^{-1} \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i[/katex]
[katex]S_{nf} = \left( \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i + S_a^{-1} \right)^{-1} \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i \left( \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i + S_a^{-1} \right)^{-1}[/katex]
[katex]S_{sf} = \left( \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i + S_a^{-1} \right)^{-1} S_a^{-1} \left( \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i + S_a^{-1} \right)^{-1}[/katex]
[katex]S_f = S_{nf} + S_{sf} = \left( \sum_{i=1}^{N} A_i^T S_{ni}^{-1} A_i + S_a^{-1} \right)^{-1}[/katex]
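
The CDF(2015) equations above translate almost line by line into NumPy. The sketch below assumes that every [katex]S_{ni}[/katex] is invertible (the singular case needs a generalized inverse instead of `np.linalg.inv`); the function name `cdf_2015` and the toy inputs are hypothetical.

```python
import numpy as np

def cdf_2015(a_list, A_list, Sn_list, x_a, S_a):
    """Return x_f, A_f, S_nf, S_sf, S_f for the original CDF formulation."""
    Sa_inv = np.linalg.inv(S_a)
    info = np.zeros_like(Sa_inv)       # accumulates sum_i A_i^T Sn_i^-1 A_i
    rhs = Sa_inv @ x_a                 # accumulates sum_i A_i^T Sn_i^-1 a_i + S_a^-1 x_a
    for a, A, Sn in zip(a_list, A_list, Sn_list):
        W = A.T @ np.linalg.inv(Sn)    # assumes Sn is non-singular
        info += W @ A
        rhs += W @ a
    S_f = np.linalg.inv(info + Sa_inv) # total-error CM
    x_f = S_f @ rhs                    # fused profile
    A_f = S_f @ info                   # fused AKM
    S_nf = S_f @ info @ S_f            # noise-error CM
    S_sf = S_f @ Sa_inv @ S_f          # smoothing-error CM
    return x_f, A_f, S_nf, S_sf, S_f

# Two toy measurements on a two-level grid (made-up numbers).
A_list = [0.5 * np.eye(2), 0.5 * np.eye(2)]
Sn_list = [0.25 * np.eye(2), 0.25 * np.eye(2)]
a_list = [np.array([1.0, 2.0]), np.array([1.2, 1.8])]
x_a, S_a = np.zeros(2), np.eye(2)

x_f, A_f, S_nf, S_sf, S_f = cdf_2015(a_list, A_list, Sn_list, x_a, S_a)
```

A quick consistency check on any input is that `S_nf + S_sf` reproduces `S_f`, as stated by the last equation.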
The previous formulas contain the inverses of the noise-error CMs [katex]S_{ni}[/katex], which are often singular; in such cases the inverse of [katex]S_{ni}[/katex] has to be replaced with its generalized inverse (Kalman, 1976) [RD6]. Using the generalized inverse introduces an approximation in the solution and requires the choice of a threshold on the eigenvalues of [katex]S_{ni}[/katex]: eigenvalues smaller than this threshold have their inverses replaced with zeros. Too small a threshold produces significant numerical noise in the products, whereas too large a threshold causes a loss of useful information. To overcome the problems related to the inversion of [katex]S_{ni}[/katex], a new formulation of the CDF equations was presented in Ceccherini et al. (2022) [RD4]. This formulation can be derived from the Kalman filter method (Kalman, 1960 [RD5]; Ceccherini, 2022 [RD3]) and is equivalent to the original formulation of the CDF when the matrices [katex]S_{ni}[/katex] are non-singular. The equations of the new formulation do not contain the inverses of [katex]S_{ni}[/katex]; instead they contain the inverses of the total-error CMs [katex]S_{i}[/katex], which are never singular. To distinguish the two formulations of the CDF, we refer to the old formulation as CDF(2015) and to the new formulation as CDF(2022).
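
To make the role of the eigenvalue threshold in CDF(2015) concrete, the sketch below builds a generalized inverse of a symmetric CM by eigendecomposition, zeroing the inverse of every eigenvalue that does not exceed the threshold. The function name `generalized_inverse` and the use of an absolute (rather than relative) threshold are assumptions of this example.

```python
import numpy as np

def generalized_inverse(S, threshold):
    """Generalized inverse of a symmetric matrix S with eigenvalue truncation."""
    S = np.asarray(S, dtype=float)
    eigval, eigvec = np.linalg.eigh(S)        # eigh: S is assumed symmetric
    inv_eigval = np.zeros_like(eigval)
    keep = eigval > threshold                 # eigenvalues that carry information
    inv_eigval[keep] = 1.0 / eigval[keep]     # the others keep a zero inverse
    return (eigvec * inv_eigval) @ eigvec.T   # V diag(1/lambda) V^T

# A singular 2x2 CM with eigenvalues 0 and 4 (made-up numbers).
S = np.array([[2.0, 2.0],
              [2.0, 2.0]])
S_pinv = generalized_inverse(S, threshold=1e-10)
```

For comparison, NumPy's `np.linalg.pinv` performs a similar truncation but applies a relative cutoff to the singular values.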
The equations of CDF(2022) are:
[katex] x_f = \left( \sum_{i=1}^{N} S_i^{-1} A_i + S_a^{-1} \right)^{-1} \left( \sum_{i=1}^{N} S_i^{-1} a_i + S_a^{-1} x_a \right) [/katex]
[katex] A_f = \left( \sum_{i=1}^{N} S_i^{-1} A_i + S_a^{-1} \right)^{-1} \sum_{i=1}^{N} S_i^{-1} A_i [/katex]
[katex]S_{nf} = \left( \sum_{i=1}^{N} S_i^{-1} A_i + S_a^{-1} \right)^{-1} \sum_{i=1}^{N} S_i^{-1} A_i \left( \sum_{i=1}^{N} S_i^{-1} A_i + S_a^{-1} \right)^{-1}[/katex]
[katex]S_{sf} = \left( \sum_{i=1}^{N} S_i^{-1} A_i + S_a^{-1} \right)^{-1} S_a^{-1} \left( \sum_{i=1}^{N} S_i^{-1} A_i + S_a^{-1} \right)^{-1}[/katex]
[katex]S_f = S_{nf} + S_{sf} = \left( \sum_{i=1}^{N} S_i^{-1} A_i + S_a^{-1} \right)^{-1}[/katex]
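
The following sketch implements the CDF(2022) equations and checks numerically that they agree with CDF(2015) when the [katex]S_{ni}[/katex] are non-singular. The helper `toy_retrieval` simulates a noise-free linear optimal-estimation retrieval; the Jacobians `K`, the measurement-noise CM `S_y` and all numbers are hypothetical inputs introduced only for this illustration.

```python
import numpy as np

def cdf_2022(a_list, A_list, S_list, x_a, S_a):
    """Fuse N products using the total-error CMs S_i (no inverse of S_ni needed)."""
    Sa_inv = np.linalg.inv(S_a)
    info = np.zeros_like(Sa_inv)       # accumulates sum_i S_i^-1 A_i
    rhs = Sa_inv @ x_a                 # accumulates sum_i S_i^-1 a_i + S_a^-1 x_a
    for a, A, S in zip(a_list, A_list, S_list):
        S_inv = np.linalg.inv(S)
        info += S_inv @ A
        rhs += S_inv @ a
    S_f = np.linalg.inv(info + Sa_inv)
    return S_f @ rhs, S_f @ info, S_f @ info @ S_f, S_f @ Sa_inv @ S_f, S_f

def toy_retrieval(K, S_y, x_ai, S_ai, x_true):
    """Noise-free linear optimal-estimation retrieval (illustrative helper)."""
    fisher = K.T @ np.linalg.inv(S_y) @ K
    S_i = np.linalg.inv(fisher + np.linalg.inv(S_ai))   # total-error CM
    G = S_i @ K.T @ np.linalg.inv(S_y)                  # gain matrix
    A = G @ K                                           # AKM
    S_n = G @ S_y @ G.T                                 # noise-error CM
    x_hat = x_ai + A @ (x_true - x_ai)
    return x_hat, A, S_n, S_i

x_true = np.array([1.0, 2.0, 3.0])
x_ai, S_ai = np.full(3, 1.5), 4.0 * np.eye(3)
x_a, S_a = np.full(3, 2.0), 9.0 * np.eye(3)
jacobians = [np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]]),
             np.array([[1.0, 0.0, 0.0], [0.5, 1.0, 0.0], [0.2, 0.5, 1.0]])]

a_list, A_list, Sn_list, S_list = [], [], [], []
for K in jacobians:
    x_hat, A, S_n, S_i = toy_retrieval(K, 0.1 * np.eye(3), x_ai, S_ai, x_true)
    a_list.append(x_hat - x_ai + A @ x_ai)
    A_list.append(A)
    Sn_list.append(S_n)
    S_list.append(S_i)

x_f, A_f, S_nf, S_sf, S_f = cdf_2022(a_list, A_list, S_list, x_a, S_a)

# Reference CDF(2015) solution, valid here because every S_n is non-singular.
Sa_inv = np.linalg.inv(S_a)
info_2015 = sum(A.T @ np.linalg.inv(Sn) @ A for A, Sn in zip(A_list, Sn_list))
rhs_2015 = sum(A.T @ np.linalg.inv(Sn) @ a
               for a, A, Sn in zip(a_list, A_list, Sn_list)) + Sa_inv @ x_a
x_f_2015 = np.linalg.solve(info_2015 + Sa_inv, rhs_2015)
max_diff = np.max(np.abs(x_f - x_f_2015))   # expected to be at round-off level
```

In this linear setting the agreement follows from [katex]S_i^{-1} A_i = A_i^T S_{ni}^{-1} A_i[/katex], which holds for optimal-estimation products with non-singular [katex]S_{ni}[/katex].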

Application of the CDF to Multi-target retrievals (MTRs)

The CDF algorithm described in the previous subsections is limited to retrieval products of a single atmospheric parameter, but it can be extended to deal with MTR products, whose state vectors include several atmospheric parameters. This extension is described in Tirelli et al. (2021) [RD9] for the formulation CDF(2015). For simplicity, we consider the data fusion of two MTR products that are exactly co-located in space and time and referred to the same vertical grid. If the two retrieved state vectors contain the same parameters, the standard CDF formulas described in the previous subsections can be applied. If the two retrieved state vectors contain different parameters, but at least one parameter is common, then the inputs to the CDF have to be modified. As an example, consider two instruments: the state vector of the first contains the vertical profiles of the parameters P1 and P2, and the state vector of the second contains the vertical profiles of the parameters P1 and P3:
[katex] \hat{x}_1 = \begin{pmatrix} \hat{P}_{11} \\ \hat{P}_{21} \end{pmatrix} \quad \hat{x}_2 = \begin{pmatrix} \hat{P}_{12} \\ \hat{P}_{32} \end{pmatrix} [/katex]

The two vectors [katex]\hat{x}_1[/katex] and [katex]\hat{x}_2[/katex] are characterized by the AKMs [katex]A_1[/katex] and [katex]A_2[/katex] and by the noise CMs [katex]S_{n1}[/katex] and [katex]S_{n2}[/katex]. The structures of these matrices are the following:

[katex] A_1 = \begin{pmatrix} A_{11,1} & A_{12,1} \\ A_{21,1} & A_{22,1} \end{pmatrix} \qquad A_2 = \begin{pmatrix} A_{11,2} & A_{13,2} \\ A_{31,2} & A_{33,2} \end{pmatrix} [/katex]
[katex] S_{n1} = \begin{pmatrix} S_{11,n1} & S_{12,n1} \\ S_{21,n1} & S_{22,n1} \end{pmatrix} \qquad S_{n2} = \begin{pmatrix} S_{11,n2} & S_{13,n2} \\ S_{31,n2} & S_{33,n2} \end{pmatrix} [/katex]

where:

[katex] A_{rq,i} = \frac{\partial \hat{P}_{ri}}{\partial P_q} \qquad i = 1,2 \quad r = 1,2,3 \quad q = 1,2,3 [/katex]
[katex] S_{rq,ni} = \left\langle (\hat{P}_{ri} - \langle \hat{P}_{ri} \rangle) (\hat{P}_{qi} - \langle \hat{P}_{qi} \rangle)^T \right\rangle \qquad i = 1,2 \quad r = 1,2,3 \quad q = 1,2,3 [/katex]
To apply the CDF method, the state vectors are extended to the union of the parameters retrieved from the different measurements, and new AKMs and noise CMs are created by adding submatrices for the non-retrieved parameters; these submatrices are set to zero because no information is retrieved for those parameters. The new inputs to the CDF, which is performed with CDF(2015), are:
[katex] \hat{x}'_{1} = \begin{pmatrix} \hat{P}_{11} \\ \hat{P}_{21} \\ 0 \end{pmatrix} \qquad \hat{x}'_{2} = \begin{pmatrix} \hat{P}_{12} \\ 0 \\ \hat{P}_{32} \end{pmatrix} [/katex]
[katex] A'_{1} = \begin{pmatrix} A_{11,1} & A_{12,1} & 0 \\ A_{21,1} & A_{22,1} & 0 \\ 0 & 0 & 0 \end{pmatrix} \qquad A'_{2} = \begin{pmatrix} A_{11,2} & 0 & A_{13,2} \\ 0 & 0 & 0 \\ A_{31,2} & 0 & A_{33,2} \end{pmatrix} [/katex]
[katex] S'_{n1} = \begin{pmatrix} S_{11,n1} & S_{12,n1} & 0 \\ S_{21,n1} & S_{22,n1} & 0 \\ 0 & 0 & 0 \end{pmatrix} \qquad S'_{n2} = \begin{pmatrix} S_{11,n2} & 0 & S_{13,n2} \\ 0 & 0 & 0 \\ S_{31,n2} & 0 & S_{33,n2} \end{pmatrix} [/katex]
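
The zero-padding above can be sketched with NumPy index arrays. The helper `embed_product`, its arguments and the toy numbers are hypothetical; the sketch assumes that every parameter is a profile on the same `n_levels` grid and that the union state vector orders the parameters as P1, P2, P3.

```python
import numpy as np

def embed_product(x_hat, A, S_n, present, n_levels, n_params):
    """Embed one MTR product into the union state vector, padding with zeros.

    present: indices (0-based) of the parameters this product retrieves,
             in the order in which their blocks appear in x_hat, A and S_n.
    """
    idx = np.concatenate([np.arange(p * n_levels, (p + 1) * n_levels)
                          for p in present])
    size = n_levels * n_params
    x_p = np.zeros(size)
    A_p = np.zeros((size, size))
    S_p = np.zeros((size, size))
    x_p[idx] = x_hat
    A_p[np.ix_(idx, idx)] = A      # retrieved blocks; the rest stays zero
    S_p[np.ix_(idx, idx)] = S_n
    return x_p, A_p, S_p

# Second instrument of the example: retrieves P1 and P3 on a two-level grid.
x_hat_2 = np.array([1.0, 2.0, 3.0, 4.0])          # (P1, P3), made-up values
A_2 = np.arange(1.0, 17.0).reshape(4, 4)          # placeholder AKM entries
S_n2 = np.diag([0.1, 0.2, 0.3, 0.4])              # placeholder noise CM
x2_p, A2_p, Sn2_p = embed_product(x_hat_2, A_2, S_n2, present=(0, 2),
                                  n_levels=2, n_params=3)
rank = np.linalg.matrix_rank(Sn2_p)               # smaller than 6: singular
```

The padded noise CM has rank 4 on a 6-element state vector, which is why its inversion needs the generalized inverse.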

Since these new noise CMs contain rows and columns equal to zero, they are singular, and their inversion requires the generalized inverse. Using the new vectors and matrices (Eqs. (39)–(41)) as input to the CDF algorithm, we obtain a solution that contains both the common and the non-common parameters:

[katex] x_f = \begin{pmatrix} P_{1f} \\ P_{2f} \\ P_{3f} \end{pmatrix} \qquad A_f = \begin{pmatrix} A_{11,f} & A_{12,f} & A_{13,f} \\ A_{21,f} & A_{22,f} & A_{23,f} \\ A_{31,f} & A_{32,f} & A_{33,f} \end{pmatrix} \qquad S_{nf} = \begin{pmatrix} S_{11,nf} & S_{12,nf} & S_{13,nf} \\ S_{21,nf} & S_{22,nf} & S_{23,nf} \\ S_{31,nf} & S_{32,nf} & S_{33,nf} \end{pmatrix} [/katex]

The CDF improves the knowledge not only of the common parameter but also of the parameters that are observed by only one of the two instruments. The gain in information content for the non-common parameters is directly related to the level of correlation between the common parameter and the non-common ones.