Flexible uncertainty quantification in medical imaging

12th CVPR Workshop on Medical Computer Vision
2026

Jeremias Sulam

50 years ago ...

first CT scan

ELECTRIC & MUSICAL INDUSTRIES

imaging

diagnostics

complete hardware & software description

human expert diagnosis and recommendations

imaging was "simple"

... 50 years forward

Data

Compute & Hardware

Sensors & Connectivity

Research & Engineering

... 50 years forward

data-driven imaging

automatic analysis and recommendations

societal implications

Problems in trustworthy biomedical imaging

inverse problems

uncertainty quantification

robustness

generalization

demographic fairness

hardware & protocol optimization

model-agnostic interpretability

policy & regulation

monitoring & auditing

Uncertainty Quantification in Inverse Problems

[Diagram labels: Measurements, Denoiser (in a box), Reconstruction]

$\hat{x} = f_\theta(y)$ (point predictor), with $\hat{x}_j$ the estimate at $\text{pixel}_j$

What is the uncertainty in the guess $\hat x_j$ ?

$y = Ax + \epsilon,~~\epsilon \sim \mathcal{N}(0, \sigma^2\mathbb{I})$
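As a minimal sketch of this measurement model, the snippet below simulates $y = Ax + \epsilon$ with NumPy. The sizes `d` and `m`, the random forward operator `A`, and the noise level are illustrative assumptions, not values from the talk.

```python
import numpy as np

def simulate_measurements(x, A, sigma, rng):
    """Return y = A x + eps with eps ~ N(0, sigma^2 I)."""
    eps = sigma * rng.standard_normal(A.shape[0])
    return A @ x + eps

rng = np.random.default_rng(0)
d, m = 16, 8                                   # hypothetical signal / measurement sizes
x = rng.uniform(0.0, 1.0, size=d)              # ground truth with pixels in [0, 1]
A = rng.standard_normal((m, d)) / np.sqrt(d)   # toy forward operator (assumption)
y = simulate_measurements(x, A, sigma=0.05, rng=rng)
```

In a real CT problem `A` would be the projection operator of the scanner; a random matrix is used here only to keep the example self-contained.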

How do we report uncertainty rigorously?

[Diagram labels: Measurements, Sampling, Denoiser (in a box)]

$\hat{X} = F(y) \sim \mathcal{P}_y$ (predictive distribution over $\hat{x}_j$ at $\text{pixel}_j$)

Mathematical tractability vs Complexity

in a box

simpler models

more assumptions

any model

no assumptions

Denoiser

Linear models

Linear networks

Shallow ReLU Networks

Just ask GPT

Conformal guarantees

Bayesian

MC Dropout

Uncertainty through Prediction Sets

[Figure: per-pixel interval $[l(y)_j, u(y)_j]$ on the axis from $0$ to $1$]

How do we construct them?

pixel-wise mean $\pm$ standard deviation
Quantile regression
MC-dropout (Gal & Ghahramani, 2016)
any other heuristics...

$C: y \mapsto C(y) \subseteq [0,1]^d$

$C(y)_j = [l(y)_j,u(y)_j]$
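Two of the heuristics listed above can be sketched from a stack of reconstructions sampled for the same measurement $y$ (for instance from a stochastic sampler or MC-dropout passes). The function names are hypothetical. The second function uses empirical quantiles of the samples, which is a stand-in for, and not the same as, a trained quantile regression model.

```python
import numpy as np

def mean_std_intervals(samples):
    """Pixel-wise mean +/- standard deviation, clipped to [0, 1].

    samples: array of shape (num_samples, ...) for one measurement y.
    """
    mu = samples.mean(axis=0)
    sd = samples.std(axis=0)
    return np.clip(mu - sd, 0.0, 1.0), np.clip(mu + sd, 0.0, 1.0)

def quantile_intervals(samples, alpha=0.1):
    """Pixel-wise empirical (alpha/2, 1 - alpha/2) quantiles of the samples."""
    lower = np.quantile(samples, alpha / 2, axis=0)
    upper = np.quantile(samples, 1 - alpha / 2, axis=0)
    return lower, upper

rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 1.0, size=(100, 8, 8))  # toy stand-in for sampled reconstructions
l_ms, u_ms = mean_std_intervals(samples)
l_q, u_q = quantile_intervals(samples, alpha=0.1)
```

Neither construction carries a coverage guarantee on its own, which is what the calibration step is for.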

Conformal Risk Control (CRC)

$\ell(C(y),x) = \frac{1}{d} \sum_{j\in[d]} \mathbf{1}\!\left\{x_j \notin C(y)_j\right\}$
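For concreteness, this loss is the fraction of pixels whose ground truth falls outside its interval. A minimal sketch follows; the function name and the toy numbers are illustrative.

```python
import numpy as np

def miscoverage_loss(x, lower, upper):
    """Fraction of pixels j with x_j outside [lower_j, upper_j]."""
    outside = (x < lower) | (x > upper)
    return float(outside.mean())

x = np.array([0.2, 0.5, 0.9, 0.4])
lower = np.array([0.1, 0.6, 0.8, 0.0])
upper = np.array([0.3, 0.7, 1.0, 0.3])
loss = miscoverage_loss(x, lower, upper)  # pixels 2 and 4 are outside, so loss = 0.5
```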

[Figure: the interval $[l(y)_j, u(y)_j]$ on the axis from $0$ to $1$, together with the ground truth $x_j$]

$C(y)$ controls risk at level $\epsilon$ if

$\mathbb{E}\!\left[\ell\!\bigl(C(Y), X\bigr)\right] \le \epsilon$

"On average, no more than a fraction $\epsilon$ of the pixels fall outside the sets"

Conformal Risk Control (CRC)

[Figure: the interval $C(y)_j$ widened by $\lambda$ on each side]

$C(y)_j = [l_j(y),u_j(y)] \;\longrightarrow\; C_{\lambda}(y)_j = [l_j(y)-{\color{green}{\lambda}},\;u_j(y)+{\color{green}\lambda}]$

Lemma [Angelopoulos et al., 2024]

Given a calibration set $S_\text{cal}=\{(X^i,Y^i)\}_{i=1}^n$, let $\ell_\text{cal}(\lambda) = \frac1n\sum_{i=1}^n \ell(C_\lambda(Y^i),X^i)$ and

$\hat{\lambda} = \text{ smallest } \lambda ~~\text{so that}~~ \frac{n}{n+1} \ell_\text{cal}(\lambda) + \frac1{n+1}\leq \epsilon$

Then, $\mathbb{E}\bigl[\ell(C_{\hat{\lambda}}(Y),X)\bigr] \le \epsilon$ .
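The calibration rule can be sketched as a search over a grid of candidate $\lambda$ values, which works because widening the intervals can only lower the loss. The grid, the toy calibration data, and the function names below are assumptions for illustration; this is not the authors' implementation.

```python
import numpy as np

def cal_loss(lam, X, L, U):
    """Empirical loss: fraction of pixels outside [L - lam, U + lam],
    averaged over the n calibration samples (rows of X)."""
    outside = (X < L - lam) | (X > U + lam)
    return outside.mean()

def calibrate_lambda(X, L, U, eps, grid):
    """Smallest lambda on the grid with n/(n+1) * loss + 1/(n+1) <= eps."""
    n = X.shape[0]
    for lam in np.sort(grid):
        if n / (n + 1) * cal_loss(lam, X, L, U) + 1 / (n + 1) <= eps:
            return float(lam)
    raise ValueError("no lambda on the grid satisfies the bound; eps may be below 1/(n+1)")

# Toy calibration set: n images with d pixels each and heuristic intervals.
rng = np.random.default_rng(0)
n, d = 200, 32
X = rng.uniform(0.2, 0.8, size=(n, d))            # ground truth
center = X + 0.1 * rng.standard_normal((n, d))    # noisy point estimates
L, U = center - 0.05, center + 0.05               # uncalibrated intervals
grid = np.linspace(0.0, 1.0, 201)
lam_hat = calibrate_lambda(X, L, U, eps=0.1, grid=grid)
```

Note that the bound cannot be met when $\epsilon < 1/(n+1)$, since the additive term alone already exceeds $\epsilon$; the sketch raises an error in that case.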

High Dimensional Risk Control

$C_{\lambda}(y)_j = [l_j(y)-{\color{green}{\lambda}},\;u_j(y)+{\color{green}\lambda}]$

Observation 1: A single $\lambda$ for all $d$ dimensions is suboptimal

High-dimensional alternative: $\boldsymbol{\lambda} = (\lambda_1, \lambda_2, \dots, \lambda_d) \in \mathbb R^d$

$C_{\boldsymbol{\lambda}}(y)_j = [l_j(y)-{\color{green}{\lambda_j}},\;u_j(y)+{\color{green}\lambda_j}]$

Goal: minimize the mean interval length

$\min_{\boldsymbol{\lambda} \in \mathbb{R}^d} \sum_{j \in [d]} \lambda_j \quad \text{s.t.} \quad \mathbb{E}\!\left[\ell(C_{\boldsymbol{\lambda}}(Y),X)\right] \leq \epsilon$

[Teneggi et al, 2023]

Semantic Risk Control

$C_{\boldsymbol{\lambda}}(y)_j = [l_j(y)-{\color{green}{\lambda_j}},\;u_j(y)+{\color{green}\lambda_j}]$

Observation 2: High-dimensional data is heterogeneous

Let $\boldsymbol{\lambda}$ vary according to content/semantics (e.g. per organ via a segmentation model)

Segmentation model $s : \mathcal Y \to [K]^d$

Semantic uncertainty $\boldsymbol{\lambda}_{\text{sem}} = (\lambda_1,\dots,\lambda_K) \in \mathbb R^K$

$C_{\boldsymbol{\lambda}_{\text{sem}}}(y)_j = [l_j(y)-{\color{green}{\lambda_{s(y)_j}}},\;u_j(y)+{\color{green}\lambda_{s(y)_j}}]$
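The semantic intervals amount to looking up one $\lambda_k$ per pixel from its segmentation label. A minimal sketch follows, assuming labels are stored as integers `0..K-1` (the slides index organs as `1..K`); the function name and the numbers are illustrative.

```python
import numpy as np

def semantic_intervals(lower, upper, seg, lam_sem):
    """Widen each pixel's interval by the lambda of its segmentation label.

    seg: integer labels in {0, ..., K-1}, same shape as lower and upper.
    lam_sem: array of shape (K,), one lambda per semantic class.
    """
    lam_pixel = lam_sem[seg]  # lam_pixel[j] = lam_sem[seg[j]]
    return lower - lam_pixel, upper + lam_pixel

lower = np.full(4, 0.4)
upper = np.full(4, 0.6)
seg = np.array([0, 0, 1, 2])          # toy labels for d = 4 pixels, K = 3 classes
lam_sem = np.array([0.0, 0.1, 0.2])   # hypothetical per-class lambdas
lo, hi = semantic_intervals(lower, upper, seg, lam_sem)
# lo = [0.4, 0.4, 0.3, 0.2], hi = [0.6, 0.6, 0.7, 0.8]
```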

Semantic Risk Control

1. Find an anchor $\tilde{\lambda}_\text{sem}$ :

$\tilde{\lambda}_{\mathrm{sem}} = \underset{\lambda_{\mathrm{sem}} \in \mathbb{R}^{K}}{\arg\min} \; \sum_{k \in [K]} s_k \lambda_k \quad \text{s.t.} \quad \hat{\ell}^{\gamma}_{\mathrm{opt}} \!\left(\lambda_{\mathrm{sem}}\right) \le \epsilon$

$s_k$ : expected size of organ $k$

$\hat{\ell}^\gamma$ : convex upper bound to $\ell(\lambda)$

2. Calibrate

$\hat{\lambda}_{\mathrm{sem}} = \tilde{\lambda}_{\mathrm{sem}} + \omega^\star \mathbf{1}_K,$ $\omega^\star = \inf \left\{ \omega \ge 0 : R_{\mathrm{cal}}^{+} \!\left( \tilde{\lambda}_{\mathrm{sem}} + \omega \mathbf{1}_K \right) \le \epsilon \right\}.$
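The calibration step (step 2) can be sketched as a one-dimensional search over the offset $\omega$, given any anchor. Two assumptions are made here and are not confirmed by the slides: $R_{\mathrm{cal}}^{+}$ is taken to be the same corrected empirical risk as in the CRC lemma, $\frac{n}{n+1}\ell_\text{cal} + \frac{1}{n+1}$, and the anchor is taken to come from data disjoint from the calibration set. The anchor values, toy data, and function names are hypothetical, and the convex anchor problem of step 1 is not implemented.

```python
import numpy as np

def risk_upper_bound(lam_sem, X, L, U, S):
    """Assumed form of R_cal^+: n/(n+1) * empirical loss + 1/(n+1)."""
    n = X.shape[0]
    lam_pixel = lam_sem[S]  # per-pixel lambda from labels, shape (n, d)
    outside = (X < L - lam_pixel) | (X > U + lam_pixel)
    return n / (n + 1) * outside.mean() + 1 / (n + 1)

def calibrate_offset(lam_anchor, X, L, U, S, eps, omegas):
    """Smallest omega >= 0 on the grid such that anchor + omega * 1_K meets the bound."""
    for omega in np.sort(omegas):
        if risk_upper_bound(lam_anchor + omega, X, L, U, S) <= eps:
            return lam_anchor + omega, float(omega)
    raise ValueError("no omega on the grid satisfies the bound")

# Toy calibration data with K = 3 classes of different noise levels.
rng = np.random.default_rng(1)
n, d, K = 100, 64, 3
S = rng.integers(0, K, size=(n, d))                  # labels in {0, ..., K-1}
X = rng.uniform(0.2, 0.8, size=(n, d))               # ground truth
noise_scale = np.array([0.02, 0.05, 0.10])[S]
center = X + noise_scale * rng.standard_normal((n, d))
L, U = center - 0.01, center + 0.01                  # uncalibrated intervals
lam_anchor = np.array([0.0, 0.02, 0.05])             # hypothetical anchor from step 1
lam_hat, omega_star = calibrate_offset(
    lam_anchor, X, L, U, S, eps=0.1, omegas=np.linspace(0.0, 1.0, 201)
)
```

Because the same offset is added to every class, the calibration reduces to the scalar search used in standard CRC while preserving the relative widths chosen by the anchor.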

Semantic Risk Control

Guarantee

For any segmentation model $s(Y)\in[K]^d$, any $\epsilon > 0$ and exchangeable and independent calibration samples,

$\mathbb{E}\!\left[ \ell\!\left(C_{\hat{\lambda}_{\mathrm{sem}}}(Y),X\right) \right] \le \epsilon .$

Experiments

CT reconstruction on TotalSegmentator (Wasserthal et al. 2023)
Quantile Regression (U-Net) for heuristic $l(y)$ and $u(y)$
Segmentation via SuPreM (Li, Yuille, and Zhou 2024): spleen, kidneys, gallbladder, liver, stomach, aorta, inferior vena cava (IVC), pancreas

Semantic risk control

risk controlled uniformly for every organ

Recap

Conformal prediction allows for flexible UQ, with minimal assumptions
In high-dimensional settings, K-CRC allows for optimizing mean interval lengths
When samples are heterogeneous, semantic CRC allows for input-specific semantic calibration

Acknowledgements

Jacopo Teneggi

sem-CRC https://github.com/Sulam-Group/semantic_uq

K-CRC https://github.com/Sulam-Group/k-rcps

Funding: NSF CAREER Award CCF 2239787 and NIH R01CA287422

Teneggi, J., Tivnan, M., Stayman, J. W., & Sulam, J. 
How to trust your diffusion model: A convex optimization approach to conformal risk control. 
ICML 2023

Teneggi, J., Stayman, J. W., & Sulam, J. 
Conformal risk control for semantic uncertainty quantification in computed tomography. 
MICCAI 2025
