X-Ray image segmentation

STFC/Durham University CDT in Data Intensive Science.

Carolina Cuesta-Lázaro

Arnau Quera-Bofarull

(Joseph Bullock)

Placement at IBEX Innovations Ltd.

Who are we?

2 months PhD placement at IBEX innovations, as part of the CDT program

 

Carolina / Arnau

Cosmology

Joe

Particle Physics

Detect bone and soft-tissue on X-Ray images

Detect

collimator

 

Segment

 

Open beam

Bone

Soft-tissue

Previous approaches

  • Challenging problem due to varying brightness throughout the image.
  • Usually done by detecting edges and shapes.
  • High accuracy requires tuning of hyper parameters per image and body-part (not automated).
  • Not well defined boundaries.

 Kazeminia, S., Karimi, et al (2015)

 

Are there better features hidden in the data?

Extracts features from a high dimensional feature space, once trained on a particular dataset.

DEEP LEARNING

Deep Learning and image segmentation

http://cs229.stanford.edu/proj2017/final-reports/5241462.pdf

Švihlík et al (2014)

http://cs.swansea.ac.uk/~csadeline/projects_astro.html

Badrinarayanan et al (2015)

Challenges for segmentation with deep learning

Semantics (what)  

MaxPooling

Translation Invariance

Location (where)

First attempt to segment

Receptive field grows very slowly !

Not tractable to learn long range correlations

Need pooling, but how to unpool?

Simple upsampling

1 0
2 3
1 1 0 0
1 1 0 0
2 2 3 3
2 2 3 3

Upsample

1 0
2 3
1 0 0 0
0.6 0.5 0 0
0.3 1 2 3
1.5 2 1.4 0.3

Use same indices that were pooled

Other layers

...

2 1
0.5 1.4
2 0 0 1
0 0 0 0
0 0 0 1.4
0 0.5 0 0

Transpose convolution

1 2
1 0
1 0 2
2 1 1
1 0 0

Learned filter

1 1 0 0
0 0 0 0
0 0 0 0
0 0 0 0

Stride 2 : Filter moves 2 pixels in the output for every pixel in the input

Feature map

1 1+4 2 2
0 0+2 0 0
0 0 0 0
0 0 0 0

Sum where overlap!

SegNet

  • First Encoder - Decoder architecture for segmentation.
  • Undo MaxPooling.
  • The network has more than 15 Million free parameters.

Badrinarayanan et al (2015)

U-Net

Skip connections to combine

information at different scales

Deconvolutions to

upsample

Up-weight the loss function

at the boundaries

Trained on 30 images for cell segmentation !

Ronneberger et al (2015)

Skipping connections

Increases convexity and smoothness of loss function

DeepLab

Pyramid of

Dilated Convolutions

Fast

Chen et al (2018)

Dilated Convolutions

Chen et al (2018)

The Deep Learning Lego

Upsampling

Duplicate

Indices unpooling

Deconvolution

Interpolated

Preserve detail

Encoder - Decoder

Dilated Convolutions

Pyramid of convolutions

Architecture

Skip Connections

Asymmetric

Image credit : www.grabcad.com

The Road to XNet

Coursera

Dataset

Small (compared to typical DL datasets) and unbalanced.

Dataset

How can we do Deep Learning with ~150 images?

  1. Transfer Learning: Pre-trained network on different dataset.
    Problems:
    1. Not trivial to choose which layers to fine tune.
    2. Trained networks deal with very different data.
  2. Data Augmentation: Add artificial data generated through transformations of the original dataset.
    Problems:
    1. Can add unwanted artifacts.

Data Augmentation

We generate new images by applying combinations of:

  1. Translations
  2. Rotations
  3. Shear transformations
  4. Cropping
  5. Elastic transformations

Splitting the dataset

We typically divide our dataset into three subsets:

  1. Training: 70% from categories with more than one sample.
    Data augmentation -> Equal sample size for all categories.
    Final size ~ 7000 images.
  2. Validation: 15%  from categories with more than one sample. Used to stop training and hyperparameter tuning.
  3. Test: 15%  including categories with only one sample. Final network performance is evaluated in this set.

Network Architecture

First attempts focused on a very simplified SegNet model.

Underfitting

Network Architecture

Going deeper has limits ( limited image size, GPU memory bottleneck, overfitting).

Overfitting

XNet

  1. Typical encoder-decoder architecture.
  2. W-shape for two feature extraction stages. Avoids resolution problems.
  3. Skip connections across levels.
  4. L2 regularisation at each convolutional layer.

Preventing overfitting

L2 regularisation

$$ L := - \sum_{i=1}^3 \color{royalblue}y_i \color{cadetblue}\log (f(x)_i) \color{black}+ \color{orange} \lambda \color{black}\sum_j \color{Maroon}\theta_j^2$$

Loss function = Cross-entropy + Regularisation

Regularisation hyperparameter $$\lambda = 5 \cdot 10^{-4}$$

Network parameters

Final model: XNet

Testing XNet

Danger!  K-fold cross-validation needed.

Testing XNet

Metrics

Metrics F1-Score AUC Accuracy Confidence
Weighted Average 0.92 0.98 92% 97%

Calibration?

TP Rate

FP Rate

True Label

Predicted Label

Calibration

Confidence vs Accuracy

  • Moden CNNs often ill-calibrated (20% gap) (see Guo et al. (2017)).
  • Reliable probability estimates important for post-processing.
  • Network is a bit overconfident in the bone region, but overall acceptable (5% gap).

$$\text{Confidence}(X) = \frac{1}{| X|} \sum_{i \in X}p_i$$

Good calibration: Confidence ~ Accuracy

% of samples

Probability

Post-processing

Reducing soft tissue false positives.

The network outputs 3 probability maps.

 

Soft tissue probability map

Apply threshold to probability map -> impose confidence level

Probability

Comparison with other methods

  • Smooth connected boundaries.
  • Better generalisation to different body parts (we do not have any frontal view of a foot in our dataset).
  • More robust to noise.
  • Well defined metrics to benchmark against.

 Kazeminia, S., Karimi, et al (2015)

 

  • The development process takes a long time due to hyperparameter tunning (50% of our internship time).
  • We used ~1000 GPU hours.
    • 3x 4GB GPUS
    • 1x 8GB GPU
    • AWS 12GB GPU

Development process

Cryptocurrency Times

  • Promising ML applications to medical imaging.
  • Possible to train DL models with limited hardware and resources.
  • Paper out arXiv:1812.00548v1, and presented at the SPIE Medical Imaging conference in San Diego. Best student paper awarded.
  • Academia-industry collaboration exciting and rewarding for students.

Conclusions

X-Ray image segmentation

By arnauqb

X-Ray image segmentation

  • 766