Using Deep Learning for Satellite Imagery Feature Detection Challenge

Vladimir & Sergey. 3rd out of 419 teams.

$20,000 prize

Sr. Data Scientist at TrueAccord
PhD in Physics at UC Davis
San Francisco, USA
Kaggle Master

Deep Learning Engineer
BS in Computer Science
Angarsk. Siberia. Russia

Industry => interpretability, scalability, size, throughput
Academia => novelty
Competitions => accuracy

Problem description

Input

Satellite images in the following bands:

RGB + P (450-690 nm) 0.31 m / pixel
M band (400-1040 nm) 1.24 m / pixel
A band (1195-2365 nm) 7.5 m / pixel

Output

Classes each pixel belongs to

Buildings
Misc. man-made structures
Roads
Track
Trees
Crops
Waterways
Standing water
Vehicle (Large)
Vehicle (Small)

Train set: 25 images
Test set: 32 images
- 6 Public
- 26 Private
Each image 1km x 1km

Bands

RGB

M-band

A-band

Problem description

Evaluation metric
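
The slide shows the metric only as an image. Assuming it is the Jaccard index (intersection over union) averaged over the classes, which is how this competition was scored as far as we know, a minimal NumPy sketch on binary pixel masks could look like this. The function names and the `eps` guard are illustrative, not taken from the slides.

```python
import numpy as np


def jaccard(y_true, y_pred, eps=1e-12):
    """Jaccard index (intersection over union) of two binary masks."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    intersection = np.logical_and(y_true, y_pred).sum()
    union = np.logical_or(y_true, y_pred).sum()
    # eps only avoids 0/0 when both masks are empty (assumption: score is 0 then)
    return intersection / (union + eps)


def mean_jaccard(true_masks, pred_masks):
    """Average of per-class Jaccard scores; inputs have shape (n_classes, H, W)."""
    return float(np.mean([jaccard(t, p) for t, p in zip(true_masks, pred_masks)]))
```

Because the score is averaged over classes, a rare class such as vehicles weighs as much as a frequent one such as crops.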

Issues with the data

Small amount of train data (25 km^2)
Data is very diverse (towns, jungles, farms)
Mistakes in labeling
Big class imbalance
Different distributions in train / public test / private test

Issues with the data

The same car

Cars + trash cans marked as cars

Water classes: sometimes simple solutions work better

NDWI = \frac {Green - NIR} {Green + NIR}
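
A minimal sketch of the NDWI formula above, assuming `green` and `nir` are same-shaped arrays taken from the multispectral image. The threshold of 0.0 and the `eps` term are hypothetical choices for illustration; the slides do not state the values that were used.

```python
import numpy as np


def ndwi(green, nir, eps=1e-6):
    """Normalized difference water index, computed per pixel."""
    green = np.asarray(green, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    # eps avoids division by zero where both bands are 0
    return (green - nir) / (green + nir + eps)


def water_mask(green, nir, threshold=0.0):
    """Binary water mask: pixels whose NDWI exceeds the (hypothetical) threshold."""
    return ndwi(green, nir) > threshold
```

Water absorbs near-infrared light strongly, so water pixels tend to have high NDWI and a plain threshold can separate them without any training.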


Indices

NDWI: Normalized difference water index
EVI: Enhanced vegetation index
SAVI: Soil-Adjusted Vegetation Index
CCCI: Canopy Chlorophyll Content Index
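
The slides list the indices without formulas. The sketch below uses the commonly quoted definitions with their usual default constants (EVI: 2.5, 6, 7.5, 1; SAVI: L = 0.5; CCCI as the ratio of NDRE to NDVI). These constants, and the mapping of image channels to `nir`, `red`, `blue` and `red_edge`, are assumptions. Inputs are expected to be reflectance-like floats.

```python
import numpy as np

EPS = 1e-6  # illustrative guard against division by zero


def evi(nir, red, blue):
    """Enhanced vegetation index with the usual default coefficients."""
    nir, red, blue = (np.asarray(x, dtype=np.float64) for x in (nir, red, blue))
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def savi(nir, red, soil_factor=0.5):
    """Soil-adjusted vegetation index; soil_factor is the L term."""
    nir, red = (np.asarray(x, dtype=np.float64) for x in (nir, red))
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def ccci(nir, red, red_edge):
    """Canopy chlorophyll content index, taken here as NDRE / NDVI."""
    nir, red, red_edge = (np.asarray(x, dtype=np.float64) for x in (nir, red, red_edge))
    ndre = (nir - red_edge) / (nir + red_edge + EPS)
    ndvi = (nir - red) / (nir + red + EPS)
    # Replace near-zero NDVI values so the ratio stays finite
    ndvi_safe = np.where(np.abs(ndvi) < EPS, EPS, ndvi)
    return ndre / ndvi_safe
```

Each index yields one extra channel per pixel, which can be stacked with the raw bands as network input.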

Neural network pipeline

Buildings
Misc. man-made structures
Roads

Track
Trees
Crops

Network architecture

Local boundary effects

Problem

Prediction quality decreases at the edges

Solution

Added Cropping2D layer
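
In Keras, `Cropping2D` placed at the end of the network discards the border of the output, so the loss and the final prediction use only the centre of each tile, where every pixel has full context. The NumPy sketch below reproduces the same centre crop to show the effect on shapes. The 112-pixel tile and 16-pixel margin are illustrative values, not taken from the slides.

```python
import numpy as np


def center_crop(pred, margin):
    """Drop `margin` pixels from every side of an (H, W, C) prediction."""
    if margin == 0:
        return pred
    return pred[margin:-margin, margin:-margin]


# Illustrative sizes: the network sees a 112 x 112 tile, only the central 80 x 80 is kept
tile_prediction = np.zeros((112, 112, 1))
central = center_crop(tile_prediction, margin=16)
```

The cost is that tiles must overlap by twice the margin at prediction time so that the kept centres still cover the whole image.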

Global boundary effects

Problem

We need an integer number of tiles

Solution

Zero padding

Problem

Zero padding creates artifacts

Solution

Reflection padding
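
A minimal sketch of padding an image up to an integer number of tiles, using `np.pad`. With `mode="reflect"` the border is mirrored instead of filled with zeros, so the network never sees an artificial black edge. The tile size is a hypothetical parameter, and padding only the bottom and right sides is a simplification made here.

```python
import numpy as np


def pad_to_tiles(image, tile, mode="reflect"):
    """Pad the bottom/right of an (H, W[, C]) image so H and W are multiples of `tile`."""
    h, w = image.shape[:2]
    pad_h = (-h) % tile  # rows missing to reach the next multiple of tile
    pad_w = (-w) % tile
    pad_width = [(0, pad_h), (0, pad_w)] + [(0, 0)] * (image.ndim - 2)
    return np.pad(image, pad_width, mode=mode)


def split_into_tiles(image, tile):
    """Cut a padded image into non-overlapping tile x tile pieces, row by row."""
    h, w = image.shape[:2]
    return [image[r:r + tile, c:c + tile]
            for r in range(0, h, tile)
            for c in range(0, w, tile)]
```

Passing `mode="constant"` gives the zero padding variant for comparison. After prediction, the padded rows and columns are simply cut off again.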

Test time augmentation
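
The slides do not say which transformations were used. Assuming the usual choice for overhead imagery, horizontal flips combined with 90-degree rotations, test time augmentation can be sketched as follows: predict on each transformed copy, undo the transformation on the prediction, and average. `predict` is a hypothetical callable that maps an (H, W, C) array to a prediction with the same height and width.

```python
import numpy as np


def predict_with_tta(predict, image):
    """Average predictions over the 8 flip/rotation variants of a square tile."""
    outputs = []
    for flip in (False, True):
        base = image[:, ::-1] if flip else image
        for k in range(4):
            augmented = np.rot90(base, k, axes=(0, 1))
            pred = predict(augmented)
            # Undo the augmentation in reverse order: rotate back, then unflip
            pred = np.rot90(pred, -k, axes=(0, 1))
            if flip:
                pred = pred[:, ::-1]
            outputs.append(pred)
    return np.mean(outputs, axis=0)
```

Satellite images have no preferred orientation, so all eight variants are equally plausible inputs and averaging them reduces prediction noise at the cost of eight forward passes.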

Results

Water classes => unsupervised
Car classes => did not predict
Other classes => U-net per class
Input: 4 indices + M-band + P-band + RGB
Test time augmentation

Summary

Hardware

Vladimir

Core i7

RAM 32 GB

Titan X (Pascal)

Sergey

2x Xeon E5-2670

RAM 128 GB

GTX 1080

Ongoing Deep Learning competitions

Intel & MobileODT Cervical Cancer Screening (June 21, $100,000)
NOAA Fisheries Steller Sea Lion Population Count (June 27, $25,000)
ImageNet 2017 (June 30)
Planet: Understanding the Amazon from Space (July 20, $60,000)

More from Vladimir Iglovikov