AMAT: Medial Axis Transform for Natural Images

Stavros Tsogkas

Sven Dickinson

Outline

Medial axis transform for binary shapes
Previous work on medial point detection
Extending MAT to natural images
Results and future work

Medial Axis Transform (MAT)

MAT

A transformation for extracting new descriptors of shape, H. Blum, Models for the perception of speech and visual form, 1967

Early 2D approaches

Brady and Asada (1986)

Rom and Medioni (1993)

Zhu and Yuille (1996)

Siddiqi, Shokoufandeh, Dickinson, and Zucker (1998)

Bai, Latecki, and Liu (2007)

Macrini, Dickinson, Fleet, and Siddiqi (2011)

Shape matching and recognition

Recognition of shapes by editing shock graphs, Sebastian et al., ICCV 2001

Branches correspond to object parts

Shape simplification

Q-MAT: Computing medial axis transform by quadratic error minimization, Li et al., Transactions on Graphics, 2015

QMAT demo

Shape manipulation

Medial-axis-driven shape deformation with volume preservation, Lan et al., The Visual Computer, 2017

Why not 2D colour images then?

Lack of generalized definition.
Hard to obtain annotations for learning.
Foreground/background axes?
Symmetry at multiple scales.

MAT for natural images is not obvious

Superpixels as deformable maximal disks

Multiscale Symmetric Part Detection and Grouping, A. Levinshtein, C. Sminchisescu, and S. Dickinson, ICCV 2009

Medial point detection

Image from BSDS300

Ground-truth segmentation

Ground-truth skeleton

A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics, D. Martin, C. Fowlkes, J. Malik, ICCV 2001

Ideas from boundary detection

Learning to detect natural image boundaries using local brightness, color, and texture cues, D. Martin, C. Fowlkes, J. Malik, TPAMI 2004

From edges to symmetry axes

Learning-based symmetry detection for natural images, S. Tsogkas, I. Kokkinos, ECCV 2012.

Multiple scales and orientations

Figure labels: Orientation, Scale, Symmetry probability, NMS

Object skeleton detection

Only foreground objects considered.
Images are iconic (centered objects).

Object skeleton extraction in natural images by fusing scale-associated deep side outputs, Shen et al., CVPR 2016

The good...

No need for closed boundaries or object masks.
Tackle challenging cases (e.g. curved contours).

The bad...

No scale information.
Isolated responses.
Grouping is a challenging problem.

MAT should be invertible

MAT^{-1}

Maximally inscribed disks reveal scale

Figure labels: disk center \( (x_i,y_i) \), disk radius \( r_i \)

Generative definition of medial disks

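Disks \( D^I_{\mathbf{p}_i,r_i} \) and \( D^I_{\mathbf{p}_j,r_j} \): image patches centered at \( \mathbf{p}_i \), \( \mathbf{p}_j \) with radii \( r_i \), \( r_j \).

f: summarizes patch (encoding)
g: reconstructs patch (decoding)

\( D^I_{\mathbf{p}_i,r_i} \xrightarrow{f} [\bar{R}_i,\bar{G}_i,\bar{B}_i] \xrightarrow{g} \tilde{D}^I_{\mathbf{p}_i,r_i} \)

\( D^I_{\mathbf{p}_j,r_j} \xrightarrow{f} [\bar{R}_j,\bar{G}_j,\bar{B}_j] \xrightarrow{g} \tilde{D}^I_{\mathbf{p}_j,r_j} \)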
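A minimal sketch of one possible encoding/decoding pair, assuming f is the mean RGB color of the disk and g paints that color back over the disk. The helper names (`disk_pixels`, `encode`, `decode`) and the toy image are hypothetical, and a disk of radius 1 is taken to be a single pixel.

```python
from statistics import fmean


def disk_pixels(center, radius, height, width):
    """Return (row, col) coordinates of the disk, clipped to the image.

    Assumption: a pixel belongs to the disk when its distance from the
    center is strictly smaller than the radius, so radius 1 is one pixel.
    """
    cy, cx = center
    return [
        (y, x)
        for y in range(max(0, cy - radius), min(height, cy + radius + 1))
        for x in range(max(0, cx - radius), min(width, cx + radius + 1))
        if (y - cy) ** 2 + (x - cx) ** 2 < radius ** 2
    ]


def encode(image, center, radius):
    """f: summarize the disk-shaped patch by its mean RGB color."""
    coords = disk_pixels(center, radius, len(image), len(image[0]))
    return tuple(fmean(image[y][x][c] for y, x in coords) for c in range(3))


def decode(encoding, center, radius, height, width):
    """g: reconstruct the patch by giving every disk pixel the encoded color."""
    return {yx: encoding for yx in disk_pixels(center, radius, height, width)}


# Toy 5x5 image: left three columns red, right two columns blue.
RED, BLUE = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
image = [[RED if x < 3 else BLUE for x in range(5)] for _ in range(5)]

f_i = encode(image, center=(2, 1), radius=2)  # disk lies inside the red region
patch = decode(f_i, (2, 1), 2, 5, 5)          # reconstruction of the same disk
```

Any other summary (a histogram, a learned code) could take the place of the mean color, as long as g can turn it back into a patch.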

Maximal disks have low reconstruction error

\( \| \tilde{D}^I_{\mathbf{p}_i,r_i} - D^I_{\mathbf{p}_i,r_i} \|^2 \approx 0 \)

\( \| \tilde{D}^I_{\mathbf{p}_j,r_j} - D^I_{\mathbf{p}_j,r_j} \|^2 \gg 0 \)
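To make the contrast concrete, here is a small sketch under simplifying assumptions: a grayscale toy image, the mean value as encoding, and the sum of squared differences as error. The function names are illustrative only.

```python
def disk_pixels(center, radius, height, width):
    """Coordinates with distance from the center strictly below the radius."""
    cy, cx = center
    return [
        (y, x)
        for y in range(max(0, cy - radius), min(height, cy + radius + 1))
        for x in range(max(0, cx - radius), min(width, cx + radius + 1))
        if (y - cy) ** 2 + (x - cx) ** 2 < radius ** 2
    ]


def reconstruction_error(image, center, radius):
    """Squared error between a disk patch and its mean-value reconstruction."""
    coords = disk_pixels(center, radius, len(image), len(image[0]))
    values = [image[y][x] for y, x in coords]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


# 7x7 grayscale image: dark left part (0.0), bright right part (1.0).
image = [[0.0 if x < 4 else 1.0 for x in range(7)] for _ in range(7)]

e_inside = reconstruction_error(image, (3, 1), 2)  # disk inside the dark region
e_across = reconstruction_error(image, (3, 3), 2)  # disk straddling the edge
```

The disk that stays inside a uniform region reconstructs perfectly, while the one crossing the boundary does not.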

Redundancies in image statistics

Superpixel: Locally uniform appearance

SLIC superpixels, Achanta et al., TPAMI 2012

AppearanceMAT definition

1. \( \tilde{D}^I_{\mathbf{p},r} = g\circ f\circ D^I_{\mathbf{p},r},\ \forall\, \mathbf{p},r \)

2. \( e_{\mathbf{p},r} = \| \tilde{D}^I_{\mathbf{p},r} - D^I_{\mathbf{p},r} \|^2 \ge 0 \)

3. \( \mathrm{MAT}=\{ (\mathbf{p}_1,r_1,\mathbf{f}_1),\ldots,(\mathbf{p}_m,r_m,\mathbf{f}_m) \}:\ \min_{\mathbf{p},r} \sum_{i=1}^m e_{\mathbf{p}_i,r_i} \)

4. \( I=\bigcup_{i=1}^m D^I_{\mathbf{p}_i,r_i} \)

A for "appearance"

A trivial solution

Select pixels as medial points (disks of radius 1).

Perfect reconstruction quality!

Not very useful in practice...

Goal: balance between sparsity and reconstruction

Dense representation: low reconstruction error

Sparse representation: high reconstruction error

Favor the selection of larger disks...

Increasing \( w \)

Add a regularization term to the disk cost: \( c_{\mathbf{p},r} = e_{\mathbf{p},r} + \frac{w}{r} \).

...as long as they do not incur a high reconstruction error
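The effect of the weight can be illustrated with a short sketch. The candidate radii and their errors below are made-up numbers for disks sharing one center; only the cost formula (error plus w over r) comes from the slide.

```python
def disk_cost(error, radius, w):
    """Regularized cost: reconstruction error plus a penalty for small disks."""
    return error + w / radius


def best_radius(candidates, w):
    """Pick the radius with the lowest regularized cost."""
    return min(candidates, key=lambda r: disk_cost(candidates[r], r, w))


# Hypothetical reconstruction errors for disks of growing radius at one point:
# small disks fit perfectly, the largest one crosses a boundary.
candidates = {1: 0.0, 2: 0.0, 4: 0.05, 8: 3.0}

for w in (0.0, 0.1, 1.0, 100.0):
    print(f"w = {w:>5}: selected radius = {best_radius(candidates, w)}")
```

With these numbers, w = 0 keeps the single-pixel disk, moderate weights move to radius 2 and then 4, and a very large weight accepts the radius-8 disk despite its high error.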

Sparsity-quality trade-off

Increasing w

Select all pixels as medial points (disks of radius 1).
- Perfect reconstruction quality.
- Not useful in practice.

Add a regularization term to the disk cost: \( c_{\mathbf{p},r} = e_{\mathbf{p},r} + \frac{w}{r} \).

AMAT is a weighted geometric set cover problem

Set we want to cover: the 2D image.
Covering elements (range): disks of radii {1,...,R}.
Set costs: \( c_{\mathbf{p},r} \).

WGSC is NP-hard!

PTAS exist

Heuristic cost function

Greedy algorithm

Compute all costs \( c_{\mathbf{p},r} \).
While image has not been completely covered:
- Select disk \( D_{\mathbf{p^*},r^*} \) with lowest cost.
- Add point \( (\mathbf{p^*},r^*,\mathbf{f}_{\mathbf{p^*},r^*}) \) to the solution.
- Mark disk pixels as covered.
- Update costs \( c_{\mathbf{p},r} \).

Approximation algorithms, Vijay V. Vazirani
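The steps above can be sketched on a toy grayscale image. The slide does not say how costs are updated, so this sketch assumes the textbook greedy rule for weighted set cover: divide each disk's cost by the number of still-uncovered pixels it would cover. Mean-value encoding, boundary clipping of disks, and all names are assumptions as well.

```python
def disk_pixels(center, radius, height, width):
    """Set of pixels with distance from the center strictly below the radius."""
    cy, cx = center
    return {
        (y, x)
        for y in range(max(0, cy - radius), min(height, cy + radius + 1))
        for x in range(max(0, cx - radius), min(width, cx + radius + 1))
        if (y - cy) ** 2 + (x - cx) ** 2 < radius ** 2
    }


def greedy_amat(image, max_radius, w):
    """Greedily cover the image with disks; returns [(center, radius, encoding)]."""
    height, width = len(image), len(image[0])

    # Compute encoding and cost c = e + w / r for every candidate disk.
    disks = {}
    for cy in range(height):
        for cx in range(width):
            for r in range(1, max_radius + 1):
                pixels = disk_pixels((cy, cx), r, height, width)
                values = [image[y][x] for y, x in pixels]
                mean = sum(values) / len(values)
                error = sum((v - mean) ** 2 for v in values)
                disks[(cy, cx, r)] = (pixels, mean, error + w / r)

    uncovered = {(y, x) for y in range(height) for x in range(width)}
    solution = []

    def cost_per_new_pixel(key):
        # Assumed update rule: cost divided by newly covered pixels.
        pixels, _, cost = disks[key]
        newly_covered = len(pixels & uncovered)
        return cost / newly_covered if newly_covered else float("inf")

    while uncovered:
        best = min(disks, key=cost_per_new_pixel)     # select the cheapest disk
        pixels, mean, _ = disks[best]
        solution.append(((best[0], best[1]), best[2], mean))  # add the point
        uncovered -= pixels                           # mark pixels as covered
    return solution


# 6x6 image with a dark left half and a bright right half.
image = [[0.0 if x < 3 else 1.0 for x in range(6)] for _ in range(6)]
solution = greedy_amat(image, max_radius=3, w=1.0)
print(len(solution), "disks cover", 6 * 6, "pixels")
```

Recomputing every cost in each iteration is quadratic and only acceptable for a toy example; a practical version would update only the disks that overlap the one just selected.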

AMAT Demo

Texture makes the problem harder

Image smoothing via L0-gradient minimization, Xu et al., SIGGRAPH 2011

Grouping points together...

space proximity
smooth scale variation
color similarity

~~color similarity~~
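One simple way to realize such grouping is a union-find pass over the medial points. This sketch links two points when they are close in space and their radii differ only slightly; it does not use color. The thresholds `max_dist` and `max_scale_jump` are hypothetical, and this is not necessarily the grouping used for the results shown.

```python
def group_medial_points(points, max_dist=1.5, max_scale_jump=1):
    """Group medial points ((row, col), radius) into connected components.

    Two points are linked when their centers are at most max_dist apart
    (space proximity) and their radii differ by at most max_scale_jump
    (smooth scale variation). Returns one group label per point.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, ((yi, xi), ri) in enumerate(points):
        for j in range(i + 1, len(points)):
            (yj, xj), rj = points[j]
            close = (yi - yj) ** 2 + (xi - xj) ** 2 <= max_dist ** 2
            smooth = abs(ri - rj) <= max_scale_jump
            if close and smooth:
                parent[find(i)] = find(j)

    return [find(i) for i in range(len(points))]


# Three neighbors with slowly varying radius, one neighbor with a scale jump,
# and one isolated point.
points = [((0, 0), 3), ((0, 1), 3), ((0, 2), 4), ((0, 3), 8), ((5, 5), 2)]
labels = group_medial_points(points)
print(len(set(labels)), "groups")
```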

Figure panels: Input, AMAT, Groups (color coded)

...opens up possibilities

Thinning

Segmentation

Object proposals
and more...

BMAX500 annotations

Image from BSDS500

Ground-truth segmentation

BMAX500

SYMMAX300

Extract skeletons of all segments in the ground-truth

Qualitative results

Figure panels: Input, AMAT, Groups, Ground-truth

Quantitative results (BMAX500)

Medial point detection	Precision	Recall	F-measure
MIL	0.49	0.55	0.52
AMAT	0.52	0.63	0.57
Human	0.89	0.66	0.77

Reconstruction	MSE	PSNR (dB)	SSIM	Compression
MIL	0.0258	16.6	0.53	20x
GT-seg	0.0149	18.87	0.64	9x
GT-skel	0.0114	20.19	0.67	14x
AMAT	0.0058	22.74	0.74	11x
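For reference, the standard definitions behind two of these columns can be written down in a few lines. This is a sketch of the textbook formulas for a single precision/recall pair and a single MSE value with intensities in [0, 1]; dataset-level numbers that are aggregated per image need not equal what these formulas give on the aggregated inputs.

```python
import math


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def psnr(mse, peak=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10 * math.log10(peak ** 2 / mse)


print(round(f_measure(0.52, 0.63), 2))  # AMAT row of the detection table
print(round(f_measure(0.49, 0.55), 2))  # MIL row of the detection table
```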

Reconstruction results

Figure panels: Input, MIL, GT-seg, GT-skel, AMAT

More reconstruction results

Figure panels: Input, MIL, GT-seg, GT-skel, AMAT

Summary

Generalization of MAT for natural images.
Beyond medial point detection:
- Scale + appearance information.
- Group points into connected skeletal components.
Completely unsupervised.
Balance between compactness and reconstruction.
Uses only ~10% of the input image's points.

Applications

Painterly rendering

Interactive segmentation

Constrained image editing

Limitations and future work

Better texture reconstruction.
More powerful encoding and decoding functions.
Parameterize relative roles of shape and appearance.
- Flexible point grouping.
- Segmentations at different granularities.
Parallelize greedy algorithm.

AMAT

Slides for the paper AMAT: Medial Axis Transform for Natural Images

Stavros Tsogkas

Research Scientist at the Samsung AI Center in Toronto. Research associate at the University of Toronto.

tsogkas.github.io
