Dstl Safe Passage: Detecting and Classifying Vehicles in Aerial Imagery

Vladimir Iglovikov

Physics, PhD

Kaggle Master

Historical overview

December 2016 - March 2017

Kaggle: Dstl Satellite Imagery Feature Detection

Roman Solovyov, Artur Kuzin 2nd place ($30,000)

Vladimir Iglovikov, Sergey Mushinskiy 3rd place ($20,000)

blog posts (rus, eng)
meetup talks (rus, eng)
paper (next week)

Organizers spent $465,000 and got state-of-the-art solutions that they cannot use.

Historical overview

March 2017

Press release: Dstl’s Kaggle competition has been a great success
Dstl pays BAE Systems to build its own Kaggle-like platform (https://www.datasciencechallenge.org) and launches two competitions (Computer Vision and Natural Language Processing)
The problems are pretty good, but the competition rules are discriminatory (everyone can participate, but only a limited set of people can claim prize money)
We got a verbal and written promise from the organizers that the rules would be changed.

Problem Statement

RGB satellite images
2000x2000
5cm / pixel
600 train
600 test
9 classes

Problem Statement: class distribution

Figure by Vladislav Kassym

Problem Statement

train: 600 images
test: 600 images
2000x2000
5 cm / pixel

One quarter of one image

Evaluation Metric

Jaccard = \frac {TP} {TP + FN + FP}

Class	Radius
motorcycle	12 pixels (60 cm)
cars	30 pixels (150 cm)
van	40 pixels (200 cm)
bus	45 pixels (225 cm)
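
As a rough illustration of how the metric and the radius table could fit together, here is a minimal sketch. It assumes (the slides do not spell this out) that a predicted centre counts as a true positive when it lies within the class radius of a not-yet-matched ground-truth centre, with greedy one-to-one matching. The names `RADIUS_PX`, `match_counts` and `jaccard` are hypothetical, not the organizers' scoring code.

```python
import math

# Radii from the table above, in pixels (5 cm / pixel).
RADIUS_PX = {"motorcycle": 12, "car": 30, "van": 40, "bus": 45}


def match_counts(pred, truth, radius):
    """Greedily match predicted centres to true centres (assumed rule).

    Each prediction claims the nearest unmatched ground-truth centre that
    lies within `radius` pixels. Returns (TP, FN, FP).
    """
    unmatched = list(truth)
    tp = 0
    for px, py in pred:
        best, best_d = None, radius
        for idx, (tx, ty) in enumerate(unmatched):
            d = math.hypot(px - tx, py - ty)
            if d <= best_d:
                best, best_d = idx, d
        if best is not None:
            unmatched.pop(best)  # each true centre can be matched only once
            tp += 1
    return tp, len(truth) - tp, len(pred) - tp


def jaccard(tp, fn, fp):
    """Jaccard = TP / (TP + FN + FP); returns 1.0 for the empty case (assumed convention)."""
    denom = tp + fn + fp
    return tp / denom if denom else 1.0


# Two true cars, one good prediction and one stray prediction.
truth = [(100, 100), (200, 200)]
pred = [(105, 100), (400, 400)]
tp, fn, fp = match_counts(pred, truth, RADIUS_PX["car"])
score = jaccard(tp, fn, fp)
```

In this toy case one prediction matches, one car is missed and one prediction is spurious, so the score is 1/3.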

Motivation

Why participate?

Very clean balanced dataset.
Experience with object detection.
Good amount of data. (Not too much, not too little.)
No data leaks.
Codebase will be reused in:
- Kaggle: Cervix
- Kaggle: Seals
- ImageNet 2017

Why not participate?

No way to claim prize money.
No community.
Unknown platform. (Hard to sell results.)

Step 1: bounding boxes

Before

After

~ 10 hours

What network architecture to use?

Speed/accuracy trade-offs for modern convolutional object detectors

arXiv:1611.10012

What network architecture to use?

Faster RCNN

Slow to train
Slow to predict
Accurate in general
Accurate on small objects

SSD

Fast to train
Fast to predict
Less accurate in general
Pretty bad with small objects

=>

Winner for this task: Faster RCNN

Faster RCNN

What framework to use?

Keras + TensorFlow

Existing Faster RCNN implementation
Familiar code base
Good documentation
Slow
Pain to parallelize

MXNet

Existing Faster RCNN implementation
Unfamiliar code base
OK documentation
Fast
Zero pain with parallelization

=>
Winner for this task: MXNet

Solution

Train

Faster RCNN + VGG16 base
random crops 1000x1000
D4 group augmentation

8 samples/sec
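
For readers unfamiliar with the term, the D4 group is the set of 8 symmetries of a square: 4 rotations, each with or without a mirror flip. A minimal NumPy sketch of generating the 8 views of an image follows. The function name `d4_transforms` is hypothetical, and this is not the code used in the solution. In a detection setting the bounding boxes would have to be transformed the same way, which is not shown here.

```python
import numpy as np


def d4_transforms(image):
    """Return the 8 D4 views of an image: 4 rotations x optional left-right flip."""
    views = []
    for flip in (False, True):
        base = np.fliplr(image) if flip else image
        for k in range(4):
            # np.rot90 rotates counterclockwise by k * 90 degrees
            # in the plane of the first two axes (height, width).
            views.append(np.rot90(base, k))
    return views


# Small asymmetric "image" so that all 8 views are distinct.
image = np.arange(9).reshape(3, 3)
views = d4_transforms(image)
```

During training one such view can be picked at random per crop. At test time predictions from all 8 views can be mapped back to the original frame and merged.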

Test

overlapping tiles
D4 group augmentation
Non-Maximum Suppression

20 samples/sec
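
The test-time steps above can be sketched roughly as follows: cut each 2000x2000 image into overlapping tiles, run the detector on every tile, shift the boxes back to image coordinates, and remove duplicates with Non-Maximum Suppression. The tile size, overlap and IoU threshold below are assumptions for illustration (the slides do not give them), and `tile_origins`, `iou` and `nms` are hypothetical helper names rather than the MXNet implementation.

```python
def tile_origins(size, tile, overlap):
    """Start offsets of overlapping tiles along one axis, covering [0, size)."""
    stride = tile - overlap
    starts = list(range(0, max(size - tile, 0) + 1, stride))
    if starts[-1] + tile < size:
        starts.append(size - tile)  # last tile is aligned to the image border
    return starts


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping a kept one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Assumed settings: 1000-pixel tiles with a 200-pixel overlap.
origins = tile_origins(size=2000, tile=1000, overlap=200)

# Two detections of the same car from neighbouring tiles, plus a separate one.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

The overlap is there so that an object cut by one tile border appears whole in a neighbouring tile. NMS then collapses the duplicate detections that the overlap produces.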

Code - example from MXNet repository

Sources of mistakes: closely packed objects

Sources of mistakes: trains look like buses

Sources of mistakes: debris as cars

Main source of mistakes: misclassification

gray car in the shade <=> black car

gray car in the sun <=> white car

blue car in the shade <=> black car

white hatchback <=> white van

hatchback <=> sedan

=>

inconsistent labeling

low predictive power

Summary

Centers of cars => bounding boxes (manually)
Faster RCNN + VGG16, MXNet
D4 group train and test time augmentation

Hardware

Intel i7
32 GB RAM
2 x Titan X (Pascal)

Many thanks to:

Sergey Mushinskiy
Vladislav Kassym
Sergey Belousov
