Mathematical Analysis of Data Assimilation and Challenges in Numerical Analysis

Kota Takeda (Kyoto University), Takashi Sakajo (Kyoto University)

RIMS Workshop (open): "Developments in Numerical Analysis Contributing to Computational Science"

October 24 (Thu), 2024, Room 420, RIMS, Kyoto University

The first author is supported by the RIKEN Junior Research Associate Program and JST SPRING JPMJSP2110.

The second author is supported by JST MIRAI JPMJMI22G1.

Data Assimilation (DA)

Seamless integration of data into numerical models: Model × Data.

Model
  • Dynamical systems and their numerical simulations
  • ODE, PDE, SDE, ...
  • Numerical analysis, HPC, fluid mechanics, physics
  • e.g., the Navier-Stokes equations:
\partial_t \bm{u} + (\bm{u} \cdot \nabla) \bm{u} = - \frac{1}{\rho} \nabla p + \nu \triangle \bm{u} + \bm{f}

Data
  • Obtained from experiments or measurements
  • Statistics, data analysis, ...
  • Radiosonde/radar/satellite observations

DA methods: Bayesian / Variational / Control / Filtering / ...
→ Today: the Ensemble Kalman Filter (EnKF), a filtering method.

Setup

Model dynamics in \mathbb{R}^{N_x} (unknown):
\frac{d x^\dagger}{dt} = \mathcal{F}(x^\dagger) \quad (\mathcal{F}: \mathbb{R}^{N_x} \rightarrow \mathbb{R}^{N_x}).

Observation in \mathbb{R}^{N_y}:
y_j = H x^\dagger_j + \xi_j \quad (H \in \mathbb{R}^{N_y \times N_x}, \ \xi_j \in \mathbb{R}^{N_y} \text{ noise}),
where x^\dagger_j = x^\dagger(t_j), t_j = jh, and h > 0 is the observation interval.

Aim

Given the observations Y_j = \{y_i \mid i \le j\}, construct an estimate x_j of the unknown true state x^\dagger_j.

Quality assessment:
\mathbb{E}\left[|x_j - x^\dagger_j|^2\right] \quad (\text{i.e. } \mathbb{P}^{x_j} \approx \mathbb{P}^{x^\dagger_j}({}\cdot{} \mid Y_j)).
[Diagram: the unknown true state x^\dagger_{j-1} \rightarrow x^\dagger_j evolves in \mathbb{R}^{N_x} by the model (unknown); the observations y_{j-1}, y_j \in \mathbb{R}^{N_y} are given.]
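To fix ideas, a minimal sketch of this setup in Python (NumPy); the model right-hand side `rhs`, the sub-stepping, and the forward-Euler integrator are illustrative placeholders, not choices made in the talk:

```python
import numpy as np

def generate_observations(x0, rhs, H, r, h, J, n_sub=10, rng=None):
    """Integrate the truth x^dagger and emit y_j = H x_j^dagger + xi_j
    at times t_j = j h, with xi_j ~ N(0, r^2 I)."""
    rng = rng or np.random.default_rng(0)
    x, ys = np.asarray(x0, dtype=float), []
    for _ in range(J):
        for _ in range(n_sub):            # integrate the model over one obs. interval h
            x = x + (h / n_sub) * rhs(x)  # forward Euler, simplest possible choice
        ys.append(H @ x + r * rng.standard_normal(H.shape[0]))
    return ys
```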

Sequential Data Assimilation

\mathbb{P}^{x_{j-1}} \xrightarrow{\text{(I) Prediction}} \mathbb{P}^{\hat{x}_j} \xrightarrow{\text{(II) Analysis}} \mathbb{P}^{x_j}

(I) Prediction: by the model.
(II) Analysis: by Bayes' rule, using the new observation y_j.

The exact distribution can be constructed by these two steps at each time step:
\mathbb{P}^{x_0} \rightarrow \mathbb{P}^{\hat{x}_1} \rightarrow \mathbb{P}^{x_1} \rightarrow \cdots
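The two-step cycle above is algorithm-agnostic; a minimal Python sketch, where the propagation map `predict` and the Bayesian update `analysis` are hypothetical placeholders to be instantiated by a concrete filter (KF, EnKF, ...):

```python
def da_cycle(x0, observations, predict, analysis):
    """Sequential DA: (I) prediction, then (II) analysis, at each obs. time t_j."""
    x, estimates = x0, []
    for y in observations:
        x_hat = predict(x)       # (I) Prediction: push the estimate through the model
        x = analysis(x_hat, y)   # (II) Analysis: condition on the new observation y_j
        estimates.append(x)
    return estimates
```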

Data Assimilation Algorithms

We focus only on approximate Gaussian algorithms.

Kalman filter (KF) [Kalman 1960]
  • Assumes a linear model F and Gaussian noises.
  • All distributions are Gaussian: \mathbb{P}^{x_j} = \mathcal{N}(m_j, C_j).
  • Update of the mean and covariance (m_j, C_j).

Ensemble Kalman filter (EnKF)
  • Non-linear extension of the KF by a Monte Carlo method.
  • Empirical approximation: \mathbb{P}^{x_j} \approx \frac{1}{m} \sum_{k=1}^m \delta_{x_j^{(k)}}.
  • Update of the ensemble X_j = [x_j^{(1)},\dots,x_j^{(m)}].
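For the KF, the update of (m_j, C_j) is explicit. A minimal NumPy sketch of one cycle, assuming a discrete linear model x_j = F x_{j-1} + noise with model-noise covariance Q (this discrete-time form and Q are illustrative assumptions; the slides keep the model abstract):

```python
import numpy as np

def kf_step(m, C, y, F, H, Q, R):
    """One KF cycle: all distributions remain Gaussian N(m_j, C_j)."""
    # (I) Prediction: propagate mean and covariance through the linear model.
    m_hat = F @ m
    C_hat = F @ C @ F.T + Q
    # (II) Analysis: Kalman gain, then Gaussian conditioning on y_j.
    K = C_hat @ H.T @ np.linalg.inv(H @ C_hat @ H.T + R)
    m_new = m_hat + K @ (y - H @ m_hat)
    C_new = (np.eye(len(m)) - K @ H) @ C_hat
    return m_new, C_new
```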

Data Assimilation Algorithms

Kalman filter (KF): assumes linear F and Gaussian noises,
m_{j-1}, C_{j-1} \xrightarrow{\text{predict}} \widehat{m}_j, \widehat{C}_j \xrightarrow{\text{analysis (Bayes' rule, } y_j)} m_j, C_j.

Ensemble Kalman filter (EnKF): non-linear extension by a Monte Carlo method; for nonlinear \mathcal{F}, the ensemble approximates the KF statistics,
X_{j-1} \xrightarrow{\text{predict}} \widehat{X}_j \xrightarrow{\text{analysis } (y_j)} X_j, \quad X_j = [x_j^{(k)}]_{k=1}^m, \ \widehat{X}_j = [\widehat{x}_j^{(k)}]_{k=1}^m,
with X_j \approx (m_j, C_j) and \widehat{X}_j \approx (\widehat{m}_j, \widehat{C}_j).

  • Two major implementations:
    • Perturbed Observation (PO): stochastic
    • Ensemble Transform Kalman Filter (ETKF): deterministic

EnKF (illustration)

PO (stochastic EnKF)

Perturbed Observation (Burgers+1998): stochastic!

(I) X_{j-1} \rightarrow \widehat{X}_j: evolve each member by the model dynamics \Psi,
\widehat{x}^{(k)}_j = \Psi(x^{(k)}_{j-1}), \quad k = 1, \dots, m.

(II) \widehat{X}_j, y_j \rightarrow X_j: replicate the observations,
y^{(k)}_j = y_j + \xi_j^{(k)}, \quad \xi_j^{(k)} \sim N(0, R), \quad k = 1, \dots, m.
Then, update each member,
x_j^{(k)} = (I - K_j H) \widehat{x}_j^{(k)} + K_j y_j^{(k)}, \quad k = 1, \dots, m,
where K_j = \widehat{C}_j H^\top (H \widehat{C}_j H^\top + R)^{-1}, \ \widehat{C}_j = \operatorname{Cov}(\widehat{X}_j). \quad (\bigstar)

(Burgers+1998) G. Burgers, P. J. van Leeuwen, and G. Evensen (1998), Analysis scheme in the ensemble Kalman filter, Mon. Weather Rev., 126, pp. 1719–1724.
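Step (II) of the PO method translates line by line into NumPy; a sketch following (\bigstar), with the shape convention (N_x, m) for the ensemble as an illustrative assumption:

```python
import numpy as np

def po_analysis(X_hat, y, H, R, rng=None):
    """PO analysis step: replicate observations, then update each member."""
    rng = rng or np.random.default_rng(0)
    N_x, m = X_hat.shape
    C_hat = np.cov(X_hat)                                   # \widehat{C}_j = Cov(\widehat{X}_j)
    K = C_hat @ H.T @ np.linalg.inv(H @ C_hat @ H.T + R)    # Kalman gain (star)
    X = np.empty_like(X_hat)
    for k in range(m):
        y_k = y + rng.multivariate_normal(np.zeros(len(y)), R)  # y_j^(k) = y_j + xi_j^(k)
        X[:, k] = (np.eye(N_x) - K @ H) @ X_hat[:, k] + K @ y_k
    return X
```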

ETKF (deterministic EnKF)

Ensemble Transform Kalman Filter (Bishop+2001): deterministic!

(I) X_{j-1} \rightarrow \widehat{X}_j: evolve each member by the model dynamics \Psi,
\widehat{x}^{(k)}_j = \Psi(x^{(k)}_{j-1}), \quad k = 1, \dots, m.

(II) \widehat{X}_j, y_j \rightarrow X_j: decompose \widehat{X}_j = \overline{\widehat{x}}_j + d\widehat{X}_j. Then, update the mean and the anomalies separately,
\overline{x}_j = (I - K_j H)\overline{\widehat{x}}_j + K_j y_j, \quad dX_j = d\widehat{X}_j T_j, \quad (\bigstar)
with the transform matrix T_j \in \mathbb{R}^{m \times m} s.t.
\frac{1}{m-1} d\widehat{X}_j T_j (d\widehat{X}_j T_j)^* = (I - K_j H) \widehat{C}_j.
An explicit representation of T_j exists!

(Bishop+2001) C. H. Bishop, B. J. Etherton, and S. J. Majumdar (2001), Adaptive sampling with the ensemble transform Kalman filter. Part I: Theoretical aspects, Mon. Weather Rev., 129, pp. 420–436.
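The transform admits several explicit representations; a NumPy sketch using the symmetric square root T_j = (I_m + (m-1)^{-1} S^\top R^{-1} S)^{-1/2} with S = H\,d\widehat{X}_j, which satisfies the constraint above (this particular choice is an assumption, not necessarily the one used in the reference):

```python
import numpy as np

def etkf_analysis(X_hat, y, H, R):
    """ETKF analysis step: update the mean and the anomalies separately."""
    N_x, m = X_hat.shape
    x_bar = X_hat.mean(axis=1)                   # ensemble mean
    dX = X_hat - x_bar[:, None]                  # anomalies d\widehat{X}_j
    C_hat = dX @ dX.T / (m - 1)                  # \widehat{C}_j
    K = C_hat @ H.T @ np.linalg.inv(H @ C_hat @ H.T + R)   # Kalman gain
    # Symmetric square-root transform: T = (I_m + S^T R^{-1} S / (m-1))^{-1/2}.
    S = H @ dX
    A = np.eye(m) + S.T @ np.linalg.solve(R, S) / (m - 1)
    w, V = np.linalg.eigh(A)                     # A is symmetric positive-definite
    T = V @ np.diag(w ** -0.5) @ V.T
    x_bar_new = x_bar + K @ (y - H @ x_bar)      # deterministic mean update
    return x_bar_new[:, None] + dX @ T           # dX_j = d\widehat{X}_j T_j
```

One can check numerically that (dX @ T) @ (dX @ T).T / (m - 1) reproduces (I - K H) C_hat, i.e., the defining constraint on T_j.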

Inflation (numerical technique)

Issue: underestimation of the covariance through the DA cycle → poor state estimation.
→ Idea: inflate the covariance before step (II):
(I) Prediction → inflation → (II) Analysis → ...

Two basic methods of inflation:
  • Multiplicative (\alpha \ge 1): rank is not improved.
\text{PO}: \widehat{C}_j \rightarrow \alpha^2\widehat{C}_j, \quad \text{ETKF}: d\widehat{X}_j \rightarrow \alpha d\widehat{X}_j,
\text{where } d\widehat{X}_j = [\widehat{x}^{(k)}_j - \overline{\widehat{x}}_j]_{k=1}^m, \ \overline{\widehat{x}}_j = \frac{1}{m} \sum_{k=1}^m \widehat{x}^{(k)}_j.
  • Additive (\alpha \ge 0), only for PO: rank is improved.
\widehat{C}_j \rightarrow \widehat{C}_j + \alpha^2 I_{N_x}.
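Both rules are one-liners in code; a sketch, with alpha the inflation parameter:

```python
import numpy as np

def inflate_multiplicative(dX_hat, alpha):
    """Multiplicative (alpha >= 1): scale anomalies, i.e. C_hat -> alpha^2 C_hat.
    The rank of the covariance is unchanged."""
    return alpha * dX_hat

def inflate_additive(C_hat, alpha):
    """Additive (alpha >= 0), PO only: C_hat -> C_hat + alpha^2 I.
    The covariance becomes full-rank (rank is improved)."""
    return C_hat + alpha**2 * np.eye(C_hat.shape[0])
```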

Literature review

How does the EnKF approximate the KF?
  • Consistency (Mandel+2011, Kwiatkowski+2015)
  • Sampling errors (Al-Ghattas+2024)

How does the EnKF approximate the true state?
  • Error analysis (full observation, H = I_{N_x})
    • PO + add. inflation (Kelly+2014)
    • ETKF + multi. inflation (T.+2024) → current
  • Error analysis (partial observation) → future

(Mandel+2011) J. Mandel, L. Cobb, and J. D. Beezley (2011), On the convergence of the ensemble Kalman filter, Appl. Math., 56, pp. 533–541.
(Kwiatkowski+2015) E. Kwiatkowski and J. Mandel (2015), Convergence of the square root ensemble Kalman filter in the large ensemble limit, SIAM/ASA J. Uncertain. Quantif., 3, pp. 1–17.
(Al-Ghattas+2024) O. Al-Ghattas and D. Sanz-Alonso (2024), Non-asymptotic analysis of ensemble Kalman updates: Effective dimension and localization, Information and Inference: A Journal of the IMA, 13.
(Kelly+2014) D. T. B. Kelly, K. J. H. Law, and A. M. Stuart (2014), Well-posedness and accuracy of the ensemble Kalman filter in discrete and continuous time, Nonlinearity, 27, pp. 2579–2603.

Mathematical analysis of EnKF

Assumptions for analysis: reasonable assumptions on the model, and a strong assumption on the observation.

Assumption (obs-1) "full observation" (strong!):
N_y = N_x, \quad H = I_{N_x}, \quad \xi \sim \mathcal{N}(0, r^2 I_{N_x}).

Assumption (model-1) "bounded":
\forall x_0 \in \mathbb{R}^{N_x}, \ \dot{x}(t) = F(x(t)) \text{ has a unique solution, and}
\exists \rho > 0 \text{ s.t. } \forall x_0 \in B(\rho), \forall t > 0, \ x(t; x_0) \in B(\rho),
\text{where } B(\rho) = \{x \in \mathbb{R}^{N_x} \mid |x| \le \rho \}.

Assumption (model-2) "one-sided Lipschitz":
\exists \beta > 0 \text{ s.t. } \forall x \in B(\rho), \forall x' \in \mathbb{R}^{N_x}, \ \langle F(x) - F(x'), x - x' \rangle \le \beta | x - x'|^2.

Assumption (model-2') "Lipschitz":
\exists \beta > 0 \text{ s.t. } \forall x, x' \in B(\rho), \ |F(x) - F(x')| \le \beta | x - x'|.

These model assumptions are reasonable: they hold for dissipative dynamical systems.

Dissipative dynamical systems

Lorenz 63 and 96 equations, widely used in geoscience.

Example (Lorenz 96)
\frac{dx^i}{dt} = \underbrace{(x^{i+1} - x^{i-2}) x^{i-1}}_{\text{non-linear, conserving}} \underbrace{{}- x^i}_{\text{linear, dissipating}} + \underbrace{f}_{\text{forcing}} \quad (i = 1, \dots, 40),
where \bm{x} = (x^i)_{i=1}^{N_x} \in \mathbb{R}^{N_x}, f \in \mathbb{R}, and the indices are cyclic.
Assumptions (model-1, 2, 2') hold.
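A sketch of the Lorenz 96 right-hand side with periodic indexing, plus a time stepper; the classical forcing value f = 8 and the RK4 integrator are illustrative choices not fixed on the slide:

```python
import numpy as np

def lorenz96_rhs(x, f=8.0):
    """dx^i/dt = (x^{i+1} - x^{i-2}) x^{i-1} - x^i + f, indices mod N_x."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + f

def rk4_step(x, h, rhs=lorenz96_rhs):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = rhs(x)
    k2 = rhs(x + 0.5 * h * k1)
    k3 = rhs(x + 0.5 * h * k2)
    k4 = rhs(x + h * k3)
    return x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
```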

These systems can be generalized to
\frac{dx}{dt} = \underbrace{B(x, x)}_{\text{bilinear operator}} + \underbrace{A x}_{\text{linear operator}} + f.

The Lorenz 63 equation can be written in the same form, and so can the incompressible 2D Navier-Stokes equations (2D-NSE) in an infinite-dimensional setting.
→ We can discuss ODEs and PDEs in the unified framework of dissipative dynamical systems.

[Figure: the orbits of the Lorenz 63 equation.]

Error analysis of PO (prev.)

Consider the PO method with additive inflation \alpha > 0,
\widehat{C}_j \rightarrow \widehat{C}_j + \alpha^2 I_{N_x} \quad \text{(a strong effect: it improves the rank)},
and let e^{(k)}_j = x^{(k)}_j - x^\dagger_j for k = 1, \dots, m.

Theorem 1 (Kelly+2014)
Let Assumptions (obs-1, model-1, 2) hold and \alpha \gg 1.
Then, \exists \theta = \theta(\beta, h, r, \alpha) \in (0, 1) s.t.
\limsup_{j \rightarrow \infty} \mathbb{E} [|e^{(k)}_j|^2] \le \frac{2\min\{m, N_x\}}{1-\theta}r^2 \quad \text{for } k = 1, \dots, m.

The bound holds uniformly in time and is proportional to r^2, the variance of the observation noise.

(Kelly+2014) D. Kelly et al. (2014), Nonlinearity, 27(10), pp. 2579–2603.

Error analysis of ETKF (our)

Consider the ETKF with multiplicative inflation \alpha \ge 1,
and let e_j = \overline{x}_j - x^\dagger_j, where \overline{x}_j = \frac{1}{m} \sum_{k=1}^m x^{(k)}_j.

Theorem 2 (T.+2024)
Let Assumptions (obs-1, model-1, 2') hold.
Suppose m > N_x, \operatorname{rank}(C_0) = N_x, \alpha \gg 1, and
x_j^{(k)} \in B(\rho) \text{ for } k = 1, \dots, m \text{ and } j \in \mathbb{N}.
Then, \exists \theta = \theta(\beta, \rho, h, r, m, \alpha) < 1 s.t.
\limsup_{j \rightarrow \infty} \mathbb{E}[|e_j|^2] \le \frac{N_x r^2}{1-\theta} + \left(\frac{1 - \Theta}{1 - \theta} - 1\right)D,
where \Theta = \Theta(\beta, \rho, h, r, m, \alpha) and D = D(\beta, \rho) > 0.

(T.+2024) K. Takeda and T. Sakajo (accepted 2024), SIAM/ASA Journal on Uncertainty Quantification.

Sketch: Error analysis of ETKF (our)

Strategy: estimate the changes of the error through the DA cycle,
e_{j-1} \xrightarrow{\text{(I)}} \widehat{e}_j \xrightarrow{\text{(II)}} e_j \quad (\widehat{e}_j = \overline{\widehat{x}}_j - x^\dagger_j),
where the error is expanded by the dynamics in (I), then contracted and perturbed by observation noise in (II).

(I) Applying Gronwall's lemma to the model dynamics,
| \widehat{e}_j | \le e^{\beta h} |e_{j-1}| + \text{(additional term)} \quad \text{(① prediction error)}.

(II) Subtracting x^\dagger_j from the Kalman update relation (\bigstar),
\overline{x}_j = (I - K_j H)\overline{\widehat{x}}_j + K_j y_j,
yields
e_j = \underbrace{(I - K_j H) \widehat{e}_j}_{\text{② contraction}} + \underbrace{K_j (y_j - H x^\dagger_j)}_{\text{③ obs. noise effect}},
and ② and ③ are estimated by the operator norm.

Sketch: Error analysis of ETKF (our)

Key relations in (II). Using Woodbury's lemma,
I - K_j H = (I + \widehat{C}_j H^\top R^{-1} H)^{-1}.

② Contraction (\| \cdot \|_{op}: the operator norm): by the assumption H = I (obs-1),
\|I - K_j H\|_{op} = \|(I + \widehat{C}_j H^\top R^{-1} H)^{-1}\|_{op} = \|(I + \alpha^2 r^{-2}\widehat{C}_j)^{-1}\|_{op},
which yields the contraction (see the next slide).

③ Obs. noise:
\|K_j\|_{op} = \|(I + \alpha^2 r^{-2}\widehat{C}_j)^{-1} \alpha^2 r^{-2}\widehat{C}_j\|_{op} \le 1
\quad (\forall \alpha \ge 0, r > 0, \ \widehat{C}_j \text{: positive-definite}).
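Both relations are easy to confirm numerically under (obs-1); in the check below, H = I and R = r^2 I, the dimension and the covariance are arbitrary test values, and the inflation factor \alpha^2 is folded into C_hat:

```python
import numpy as np

rng = np.random.default_rng(0)
N, r = 5, 0.3
A = rng.standard_normal((N, N))
C_hat = A @ A.T + 1e-3 * np.eye(N)      # a positive-definite covariance
H, R = np.eye(N), r**2 * np.eye(N)
K = C_hat @ H.T @ np.linalg.inv(H @ C_hat @ H.T + R)
lhs = np.eye(N) - K @ H
rhs = np.linalg.inv(np.eye(N) + C_hat @ H.T @ np.linalg.inv(R) @ H)
print(np.allclose(lhs, rhs))            # True: the Woodbury identity
print(np.linalg.norm(K, 2) <= 1)        # True: ||K_j||_op <= 1
```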

Sketch: Error analysis of ETKF (our)

Key result:
\|(I + \alpha^2 r^{-2}\widehat{C}_j)^{-1}\|_{op} = \frac{1}{1 + \alpha^2 r^{-2} \lambda_{min}(\widehat{C}_j)} < e^{-\beta h}.
\because (1) we establish a lower bound on the eigenvalues of \widehat{C}_j by \alpha \gg 1;
(2) once \widehat{C}_j is positive-definite, the bound follows by taking \alpha \gg 1.

Summary 1

Two approximate Gaussian filters (EnKF) are assessed in terms of the state estimation error.

PO with additive inflation (prev.): suppose \alpha \gg 1; then
\limsup_{j \rightarrow \infty} \mathbb{E}[|e^{(k)}_j|^2] \le \frac{2\min\{m, N_x\}}{1-\theta}r^2 = O(r^2) \quad \text{for } k = 1, \dots, m.

ETKF with multiplicative inflation (our): suppose m > N_x and \alpha \gg 1; then
\limsup_{j \rightarrow \infty} \mathbb{E}[|e_j|^2] \le \frac{N_x r^2}{1-\theta} + \left(\frac{1 - \Theta}{1 - \theta} - 1\right)D,
which is O(r^2) due to the first term, with an O(r^4) contribution due to the second term.

Both bounds are stated in the accurate observation limit r \rightarrow 0.

Further/Future

This study extends in three directions:

1. Partial observation: N_y < N_x, H \in \mathbb{R}^{N_y \times N_x}.
2. Small ensemble: m \le N_x.
3. Infinite dimension: state space \mathcal{X}: Hilbert space, observation space \mathcal{Y} \subset \mathcal{X}.

Further/Future: 1. Partial observation & 2. Small ensemble

Issue 1 (rank deficiency)
If m \le N_x, or N_y < N_x with H \in \mathbb{R}^{N_y \times N_x}, R \in \mathbb{R}^{N_y \times N_y}, then
\|(I + \widehat{C}_j H^\top R^{-1} H)^{-1}\|_{op} \ge 1,
so the error contraction ② in step (II) is lost.

Further/Future: partial observation

Issue 2 (non-symmetry)
Let N_y < N_x, H \in \mathbb{R}^{N_y \times N_x}, R \in \mathbb{R}^{N_y \times N_y}. If \widehat{C}_j H^\top R^{-1} H is symmetric, then the bound ③ on the observation noise,
\|(I + \widehat{C}_j H^\top R^{-1} H)^{-1} \widehat{C}_j H^\top R^{-1} H\|_{op} \le 1,
holds. If it is non-symmetric, this bound fails in general.

Example
A = \left[ \begin{array}{cc} a & 0 \\ b & 0 \end{array} \right] \Rightarrow \|(I + A)^{-1} A\|_{op} = \frac{\sqrt{a^2 + b^2}}{1+a} > 1 \quad (a > 0, \ b^2 > 1 + 2a).
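The counterexample is easy to confirm numerically; a and b below are sample values with b^2 > 1 + 2a:

```python
import numpy as np

a, b = 1.0, 2.0                              # b^2 = 4 > 3 = 1 + 2a
A = np.array([[a, 0.0], [b, 0.0]])
M = np.linalg.inv(np.eye(2) + A) @ A
print(np.linalg.norm(M, 2))                  # ~1.118 > 1
print(np.hypot(a, b) / (1 + a))              # sqrt(a^2 + b^2)/(1 + a): same value
```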

Further/Future: 3. Infinite dimension

State space \mathcal{X}: Hilbert space; observation space \mathcal{Y} \subset \mathcal{X}: Hilbert space (e.g., the 2D-NSE with a Sobolev space), with full observations.

  • Theorem (Kelly+2014) for PO + add. infl. still holds (∵ additive inflation improves the rank).
  • Theorem (T.+2024) for ETKF + multi. infl. doesn't hold (due to Issue 1: rank deficiency).

Further result for partial obs.

The projected PO + additive inflation with partial observations for the Lorenz 96 model. Two key ingredients: the projected additive inflation,
\widehat{C}_j \rightarrow H(\widehat{C}_j + \alpha^2 I_{N_x})H^*,
and the weighted norm \|\cdot\| in the theorem below.

Theorem 3 (T.)
Let N_x = 3N for N \in \mathbb{N}, H = [\phi_1, \phi_2, 0, \phi_4, \phi_5, 0, \phi_7, \dots, 0] \in \mathbb{R}^{N_x \times N_x},
where (\phi_i)_{i=1}^{N_x} is the standard basis of \mathbb{R}^{N_x}, and let \|\cdot\| = \sqrt{|\cdot|^2 + |H\cdot|^2}.
Let x_j^{(k)} \in B(\rho) for k = 1, \dots, m and j \in \mathbb{N}.
For h \ll 1 and \alpha \gg 1, \exists \theta = \theta(\beta, h, r, \alpha) \in (0, 1) s.t.
\limsup_{j \rightarrow \infty} \mathbb{E}[\| e^{(k)}_j \|^2] \le \frac{4\min\{m, N_x\}}{1-\theta}r^2 \quad \text{for } k = 1, \dots, m.

Further/Future

This study leads to problems on numerical analysis and other directions:

1. Partial observation: N_y < N_x, H \in \mathbb{R}^{N_y \times N_x} (Issues 1 and 2).
2. Small ensemble: m \le N_x (Issue 1).
3. Infinite dimension: state space \mathcal{X}: Hilbert space, observation space \mathcal{Y} \subset \mathcal{X} (Issue 1).

Problems on numerical analysis

  1. Discretization/Model errors
  2. Dimension-reduction for dissipative dynamics
  3. Numerics for Uncertainty Quantification (UQ)
  4. Imbalance problems

1. Discretization/Model errors

Motivation
- Including discretization errors in the error analysis.
- Considering more general model errors and ML models.
- Error analysis with ML/surrogate models.

Cf.: S. Reich & C. Cotter (2015), Probabilistic Forecasting and Bayesian Data Assimilation, Cambridge University Press.

Question
- Can we introduce the errors as random noises in the model, \widetilde{\Psi}_t(x) = \Psi_t(x) + \xi_t?
- Can minor modifications of the existing error analysis work?

2. Dimension-reduction for dissipative dynamics

Motivation
- Defining an "essential dimension" N_{ess} \le N_x and relaxing the condition m > N_x to m > N_{ess}.
- N_{ess} should be independent of the discretization.

Cf.: C. González-Tokman & B. R. Hunt (2013), Ensemble data assimilation for hyperbolic systems, Physica D: Nonlinear Phenomena, 243(1), pp. 128–142.

Question
- Is it related to dissipativity or the dimension of the attractor?
- Does the dimension depend on time and state? (hetero-chaos)
- Is it related to the SVD and low-rank approximation?

3. Numerics for UQ

Motivation
- Evaluating the physical parameters that appear in the theory.
- Computing the Lyapunov spectrum of atmospheric models.
- Estimating unstable modes (e.g., singular vectors of the Jacobian).

Cf.: T. J. Bridges & S. Reich (2001), Computing Lyapunov exponents on a Stiefel manifold, Physica D, 156, pp. 219–238.
F. Ginelli et al. (2007), Characterizing dynamics with covariant Lyapunov vectors, Phys. Rev. Lett., 99(13), 130601.
Y. Saiki & M. U. Kobayashi (2010), Numerical identification of nonhyperbolicity of the Lorenz system through Lyapunov vectors, JSIAM Letters, 2, pp. 107–110.

Question
- Can we approximate them using only ensemble simulations?
- Can we estimate their approximation errors?

4. Imbalance problems

Motivation
- Assimilating observations can break the balance relations inherited from the dynamics (e.g., the geostrophic balance and gravity waves) → numerical divergence.
- Explaining the mechanism (sparsity or noise of the observations?).
- Only ad hoc remedies exist (initialization, IAU).

Cf.: E. Kalnay (2003), Atmospheric Modeling, Data Assimilation and Predictability, Cambridge University Press.
Hastermann et al. (2021), Balanced data assimilation for highly oscillatory mechanical systems, Communications in Applied Mathematics and Computational Science, 16(1), pp. 119–154.

Question
- Do we need to define a space of balanced states?
- Do we need to show that the analysis state belongs to that space?
- Can we utilize structure-preserving schemes?

4. Imbalance problems

Initialization (methods to stabilize the model) → discards information, so we prefer to avoid it.

(1) Nonlinear Normal Mode Initialization (NNMI): C. E. Leith (1980), J. Atmos. Sci., 37, pp. 958–968.
(2) Digital filter initialization (DFI): insert a filter into the model.

Other numerical techniques:
(3) Implicit schemes in the vertical direction only.
(4) Split-explicit: use different time steps dt for different terms (e.g., a finer dt only for sound waves).

These are rarely used in recent DA:
- In 4DVar, a cost-function term derived from the digital filter can be included.
- In the EnKF, modes that break the model should not appear (subspace property).
- In the EnKF, localization may cause problems (shown theoretically: J. D. Kepert (2009), Q. J. R. Meteorol. Soc., 135, pp. 1157–1176).

Summary 2

Mathematical analysis of EnKF: future plans
1. Partial obs. (rank deficiency, non-symmetry)
2. Small ens. (rank deficiency)
3. Infinite dim. (rank deficiency)

DA × Numerical analysis:
  • Model errors
  • Dimension-reduction
  • Numerical UQ
  • Imbalance

Other DA methods: Particle Filter, 4DVar.