AMAT: Medial Axis Transform for Natural Images

Stavros Tsogkas

Sven Dickinson

Outline

  1. Medial axis transform for binary shapes

  2. Previous work on medial point detection

  3. Extending MAT to natural images

  4. Results and future work

Medial Axis Transform (MAT)

MAT

A transformation for extracting new descriptors of shape, H. Blum, Models for the perception of speech and visual form, 1967

Early 2D approaches

Brady and Asada (1986)

Rom and Medioni (1993)

Zhu and Yuille (1996)

Siddiqi, Shokoufandeh, Dickinson, and Zucker (1998)

Bai, Latecki, and Liu (2007)

Macrini, Dickinson, Fleet, and Siddiqi (2011)

Shape matching and recognition

Recognition of shapes by editing shock graphs, Sebastian et al., ICCV 2001

Branches correspond to object parts

Shape simplification

Q-MAT: Computing medial axis transform by quadratic error minimization, Li et al., Transactions on Graphics, 2015

Q-MAT demo

Shape manipulation

Medial-axis-driven shape deformation with volume preservation, Lan et al., The Visual Computer, 2017

Why not 2D colour images then?

  • Lack of generalized definition.

  • Hard to obtain annotations for learning.

  • Foreground/background axes?

  • Symmetry at multiple scales.

MAT for natural images is not obvious

Superpixels as deformable maximal disks

Multiscale Symmetric Part Detection and Grouping, A. Levinshtein, C. Sminchisescu, and S. Dickinson, ICCV, 2009

Medial point detection

Image from BSDS300

Ground-truth segmentation

Ground-truth skeleton

A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics, D. Martin, C. Fowlkes, J. Malik, ICCV 2001.

Ideas from boundary detection

Learning to detect natural image boundaries using local brightness, color, and texture cues, D. Martin, C. Fowlkes, J. Malik, TPAMI 2004

From edges to symmetry axes

Learning-based symmetry detection for natural images, S. Tsogkas, I. Kokkinos, ECCV 2012.

Multiple scales and orientations

Orientation

Scale

Symmetry probability

NMS

Object skeleton detection

  • Only foreground objects considered.

  • Images are iconic (centered objects).

Object skeleton extraction in natural images by fusing scale-associated deep side outputs, Shen et al., CVPR 2016

The good...

  • No need for closed boundaries or object masks.

  • Tackle challenging cases (e.g. curved contours).

The bad...

  • No scale information.

  • Isolated responses.

  • Grouping is a challenging problem.

MAT should be invertible

MAT^{-1}

Maximally inscribed disks reveal scale

Disk center \( (x_i,y_i) \), radius \( r_i \)

Generative definition of medial disks

  • \( f \): summarizes patch (encoding)
  • \( g \): reconstructs patch (decoding)

\( D^I_{\mathbf{p}_i,r_i} \xrightarrow{f} [\bar{R}_i,\bar{G}_i,\bar{B}_i] \xrightarrow{g} \tilde{D}^I_{\mathbf{p}_i,r_i} \)

\( D^I_{\mathbf{p}_j,r_j} \xrightarrow{f} [\bar{R}_j,\bar{G}_j,\bar{B}_j] \xrightarrow{g} \tilde{D}^I_{\mathbf{p}_j,r_j} \)

Maximal disks have low reconstruction error

\( e(D^I_{\mathbf{p}_i,r_i}, \tilde{D}^I_{\mathbf{p}_i,r_i}) \approx 0 \)

\( e(D^I_{\mathbf{p}_j,r_j}, \tilde{D}^I_{\mathbf{p}_j,r_j}) \gg 0 \)
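
As a rough illustration of the encode/decode view, the sketch below uses the mean colour of a disk as the encoding \( f \) and a constant-colour fill as the decoding \( g \), then measures the squared reconstruction error. The helper names (`disk_mask`, `encode`, `decode`, `reconstruction_error`) and the toy two-colour image are assumptions made for this example; they are not taken from the slides.

```python
import numpy as np

def disk_mask(shape, center, radius):
    """Boolean mask of pixels within `radius` of `center` given as (row, col)."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    return (rows - center[0]) ** 2 + (cols - center[1]) ** 2 <= radius ** 2

def encode(image, mask):
    """f: summarize the disk patch by its mean colour [R, G, B]."""
    return image[mask].mean(axis=0)

def decode(code, mask):
    """g: reconstruct the patch by giving every disk pixel the encoded colour."""
    return np.tile(code, (int(mask.sum()), 1))

def reconstruction_error(image, center, radius):
    """e: squared L2 distance between the disk patch and its reconstruction."""
    mask = disk_mask(image.shape[:2], center, radius)
    patch = image[mask]                          # shape (n_pixels, 3)
    recon = decode(encode(image, mask), mask)    # shape (n_pixels, 3)
    return float(np.sum((patch - recon) ** 2))

# Toy image (assumption): left half red, right half blue.
image = np.zeros((40, 40, 3))
image[:, :20] = [1.0, 0.0, 0.0]
image[:, 20:] = [0.0, 0.0, 1.0]

e_inside = reconstruction_error(image, center=(20, 10), radius=8)  # disk inside the red half
e_across = reconstruction_error(image, center=(20, 20), radius=8)  # disk straddling the boundary
```

On this toy image `e_inside` is zero because the disk lies entirely in the red half, while `e_across` is large because the disk mixes both colours.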

Redundancies in image statistics

Superpixel: Locally uniform appearance

SLIC superpixels, Achanta et al., TPAMI 2012

AppearanceMAT definition

  1. \( \tilde{D}^I_{\mathbf{p},r} = g\circ f\circ D^I_{\mathbf{p},r},\, \forall \mathbf{p},r \)

  2. \( e_{\mathbf{p},r} = || \tilde{D}^I_{\mathbf{p},r} - D^I_{\mathbf{p},r} ||^2 \ge 0 \)

  3. \( AMAT=\{ (\mathbf{p}_1,r_1,\mathbf{f}_1),\ldots,(\mathbf{p}_m,r_m,\mathbf{f}_m) \}:\min_{\mathbf{p},r} \sum_{i=1}^m e_{\mathbf{p}_i,r_i} \)

  4. \( I=\bigcup_{i=1}^m D^I_{\mathbf{p}_i,r_i} \)

A for "appearance"

A trivial solution

Select pixels as medial points (disks of radius 1).

Perfect reconstruction quality!

Not very useful in practice...

Goal: balance between sparsity and reconstruction

  • Dense representation: low reconstruction error

  • Sparse representation: high reconstruction error

Favor the selection of larger disks...

Increasing \( w \)

Add regularization term to disk cost: \( c_{\mathbf{p},r} = e_{\mathbf{p},r} + w(\frac{1}{r}) \).

...as long as they do not incur a high reconstruction error

Sparsity-quality trade-off

Increasing \( w \)

  • Select all pixels as medial points (disks of radius 1).

    • Perfect reconstruction quality.

    • Not useful practically.

  • Add regularization term to disk cost: \( c_{\mathbf{p},r} = e_{\mathbf{p},r} + w(\frac{1}{r}) \).
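
To make the effect of \( w \) concrete, here is a minimal sketch of the regularized cost \( c_{\mathbf{p},r} = e_{\mathbf{p},r} + w(\frac{1}{r}) \) applied to three candidate disks at the same center. The error values in `candidates` are invented for illustration, and `disk_cost` and `best_radius` are hypothetical helper names.

```python
def disk_cost(error, radius, w):
    """Regularized disk cost: reconstruction error plus w / r."""
    return error + w / radius

def best_radius(candidates, w):
    """Return the radius whose disk has the lowest regularized cost."""
    return min(candidates, key=lambda r: disk_cost(candidates[r], r, w))

# Hypothetical reconstruction errors for disks of growing radius at one point:
# the radius-8 disk starts to leak across a region boundary.
candidates = {1: 0.0, 4: 0.01, 8: 0.5}

radius_no_reg = best_radius(candidates, w=0.0)     # no regularization
radius_small_w = best_radius(candidates, w=0.1)    # mild preference for large disks
radius_large_w = best_radius(candidates, w=10.0)   # strong preference for large disks
```

With these made-up numbers, \( w = 0 \) picks the radius-1 disk (the trivial solution), \( w = 0.1 \) picks radius 4, and \( w = 10 \) picks radius 8 despite its higher reconstruction error.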

AMAT is a weighted geometric set cover problem

WGSC is NP-hard!

PTAS exist

  • Set we want to cover: the 2D image

  • Covering elements (range): disks of radii {1,...,R}

  • Set costs: \( c_{\mathbf{p},r} \)

Heuristic cost function

Greedy algorithm

  1. Compute all costs \( c_{\mathbf{p},r} \).

  2. While image has not been completely covered:

    • Select disk \( D_{\mathbf{p^*},r^*} \) with lowest cost.    

    • Add point \( (\mathbf{p^*},r^*,\mathbf{f}_{\mathbf{p^*},r^*}) \) to the solution.

    • Mark disk pixels as covered.

    • Update costs \( c_{\mathbf{p},r} \).

 

Approximation algorithms, Vijay V. Vazirani
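
The greedy steps above can be sketched as follows. This is a small, unoptimized illustration under several assumptions: the encoding is the mean colour, a disk of radius \( r \) contains pixels at distance strictly less than \( r \) (so radius 1 is a single pixel), only disks that fit inside the image are considered, and the "update costs" step uses the textbook greedy set-cover rule of dividing each disk's cost by the number of still-uncovered pixels it would cover. The slides do not spell out the update rule, and all function names are hypothetical.

```python
import numpy as np

def disk_mask(shape, center, radius):
    """Pixels at distance < radius from center, so radius 1 is a single pixel."""
    rows, cols = np.ogrid[:shape[0], :shape[1]]
    return (rows - center[0]) ** 2 + (cols - center[1]) ** 2 < radius ** 2

def candidate_disks(image, radii, w):
    """Step 1: compute the cost c = e + w / r of every disk that fits in the image."""
    h, wd = image.shape[:2]
    disks = []
    for r in radii:
        for y in range(r - 1, h - r + 1):
            for x in range(r - 1, wd - r + 1):
                mask = disk_mask((h, wd), (y, x), r)
                code = image[mask].mean(axis=0)                    # mean-colour encoding
                error = float(np.sum((image[mask] - code) ** 2))   # squared error
                disks.append({"center": (y, x), "radius": r, "mask": mask,
                              "code": code, "cost": error + w / r})
    return disks

def greedy_amat(image, radii=(1, 2, 4), w=0.01):
    """Step 2: repeatedly pick the disk with the lowest cost per newly covered pixel."""
    disks = candidate_disks(image, radii, w)
    covered = np.zeros(image.shape[:2], dtype=bool)
    solution = []
    while not covered.all():
        best, best_score = None, np.inf
        for d in disks:
            new_pixels = np.count_nonzero(d["mask"] & ~covered)
            if new_pixels == 0:
                continue                       # disk adds nothing to the cover
            score = d["cost"] / new_pixels     # cost update (assumed rule)
            if score < best_score:
                best, best_score = d, score
        solution.append((best["center"], best["radius"], best["code"]))
        covered |= best["mask"]                # mark disk pixels as covered
    return solution

def reconstruct(solution, shape):
    """Paint every selected disk with its encoded colour."""
    out = np.zeros(shape)
    for center, radius, code in solution:
        out[disk_mask(shape[:2], center, radius)] = code
    return out

# Toy input (assumption): 16x16 image, left half red, right half blue.
image = np.zeros((16, 16, 3))
image[:, :8] = [1.0, 0.0, 0.0]
image[:, 8:] = [0.0, 0.0, 1.0]
solution = greedy_amat(image)
recon = reconstruct(solution, image.shape)
```

Because radius-1 disks cover every pixel, the loop always terminates. On the toy image the selected disks never straddle the colour boundary, so the reconstruction matches the input while using far fewer disks than pixels.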

AMAT Demo

Texture makes the problem harder

Image smoothing via L0-gradient minimization, Xu et al., SIGGRAPH Asia 2011

Grouping points together...

  • space proximity
  • smooth scale variation
  • color similarity
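
One simple way to combine the three cues is to link two medial points whenever all three tests pass and then take connected components. The sketch below does this with a union-find structure. The thresholds (`max_dist`, `max_scale_ratio`, `max_color_diff`) and the function name are assumptions chosen for illustration, not the grouping rule used by the authors.

```python
import numpy as np

def group_medial_points(points, max_dist=2.0, max_scale_ratio=1.5, max_color_diff=0.1):
    """Label medial points (p, r, f) so that linked points share a label.

    Two points are linked when they are spatially close, their radii vary
    smoothly, and their encoded colours are similar.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            (pi, ri, fi), (pj, rj, fj) = points[i], points[j]
            close = np.hypot(pi[0] - pj[0], pi[1] - pj[1]) <= max_dist
            smooth = max(ri, rj) / min(ri, rj) <= max_scale_ratio
            similar = np.linalg.norm(np.subtract(fi, fj)) <= max_color_diff
            if close and smooth and similar:
                parent[find(i)] = find(j)      # merge the two groups

    return [find(i) for i in range(n)]

# Three red medial points along a line, followed by a neighbouring blue one.
points = [
    ((5, 5), 3, (1.0, 0.0, 0.0)),
    ((5, 6), 3, (1.0, 0.0, 0.0)),
    ((5, 7), 4, (1.0, 0.0, 0.0)),
    ((5, 8), 4, (0.0, 0.0, 1.0)),
]
labels = group_medial_points(points)
```

Here the three red points end up in one group and the blue point in another, even though it is spatially adjacent and has the same radius as its neighbour.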

Input

AMAT

Groups

(color coded)

...opens up possibilities

Thinning

Segmentation

  • Object proposals

  • and more...

BMAX500 annotations

Image from BSDS500

Ground-truth segmentation

BMAX500

SYMMAX300

Extract skeletons of all segments in the ground-truth

Qualitative results

Input

AMAT

Groups

Ground-truth

Quantitative results (BMAX500)

Medial point detection

Method   Precision   Recall   F-measure
MIL      0.49        0.55     0.52
AMAT     0.52        0.63     0.57
Human    0.89        0.66     0.77

Reconstruction

Method    MSE      PSNR (dB)   SSIM   Compression
MIL       0.0258   16.6        0.53   20x
GT-seg    0.0149   18.87       0.64   9x
GT-skel   0.0114   20.19       0.67   14x
AMAT      0.0058   22.74       0.74   11x
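
For readers unfamiliar with the metrics, the sketch below shows the standard definitions of F-measure (harmonic mean of precision and recall) and PSNR for intensities in [0, 1]. It is a generic illustration, not the evaluation code behind the table. Values recomputed from the rounded table entries can differ slightly from the reported ones, for instance because dataset-level scores are often averaged per image.

```python
import math

def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def psnr(mse, peak=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10 * math.log10(peak ** 2 / mse)

f_amat = f_measure(0.52, 0.63)   # AMAT row of the detection table
f_mil = f_measure(0.49, 0.55)    # MIL row of the detection table
psnr_example = psnr(0.01)        # an MSE of 0.01 on a [0, 1] scale
```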

Reconstruction results

Input

MIL

GT-seg

GT-skel

AMAT

More reconstruction results

Input

MIL

GT-seg

GT-skel

AMAT

Summary

  • Generalization of MAT for natural images.

  • Beyond medial point detection:

    • Scale + appearance information.

    • Group points into connected skeletal components.

  • Completely unsupervised.

  • Balance between compactness and reconstruction.

  • Uses only ~10% of the points of the input image.

Applications

Painterly rendering

Interactive segmentation

Constrained image editing

Limitations and future work

  • Better texture reconstruction.

  • More powerful encoding and decoding functions.

  • Parameterize relative roles of shape and appearance.

    • Flexible point grouping.

    • Segmentations at different granularities.

  • Parallelize greedy algorithm.

Links:

Detecting symmetry in the wild

2D symmetry

3D symmetry

Skeletons - medial axes

Workshop in conjunction with:

Slides for the paper AMAT: Medial Axis Transform for Natural Images