Deep Learning for Satellite Imagery Feature Detection
Vladimir Iglovikov
Sr. Data Scientist at TrueAccord
Vladimir & Sergey. 3rd out of 419 teams.
$20,000 prize
- Sr. Data Scientist at TrueAccord
- PhD in Physics at UC Davis
- San Francisco, USA
- Kaggle Master
- Deep Learning Engineer
- BS in Computer Science
- Angarsk. Siberia. Russia
- Industry => interpretability, scalability, size, throughput
- Academia => novelty
- Competitions => accuracy
Problem description
Input
Satellite images in
- RGB + P (450-690 nm) 0.31 m / pixel
- M band (400-1040 nm) 1.24 m / pixel
- A band (1195-2365 nm) 7.5 m / pixel
Output
Classes each pixel belongs to
- Buildings
- Misc. man-made structures
- Roads
- Track
- Trees
- Crops
- Waterways
- Standing water
- Vehicle (Large)
- Vehicle (Small)
- Train set: 25 images
- Test set: 32 images
  - 6 Public
  - 26 Private
- Each image 1km x 1km
Bands
RGB
M-band
A-band
Evaluation metric
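The formula on this slide did not survive extraction. A minimal sketch, assuming the metric is the Jaccard index (intersection over union) averaged over classes, and assuming pixel masks rather than polygons. The function names and the empty-mask convention are illustrative choices, not taken from the slides.

```python
import numpy as np

def jaccard_index(y_true, y_pred):
    """Jaccard index |A ∩ B| / |A ∪ B| of two binary masks."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    intersection = np.logical_and(y_true, y_pred).sum()
    union = np.logical_or(y_true, y_pred).sum()
    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0
    return float(intersection / union)

def mean_jaccard(true_masks, pred_masks):
    """Average the per-class Jaccard index; classes are along axis 0."""
    scores = [jaccard_index(t, p) for t, p in zip(true_masks, pred_masks)]
    return float(np.mean(scores))
```

Averaging per class means a rare class (for example vehicles) weighs as much as a common one (for example crops), which is one reason class imbalance matters here.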
Issues with the data
- Small amount of train data (25 km^2)
- Data is very diverse (towns, jungles, farms)
- Mistakes in labeling
- Big class imbalance
- Different distributions in train / public test / private test
Issues with the data
The same car
Cars + trash cans marked as cars
Water classes: sometimes simple solutions work better
Indices
- NDWI: Normalized Difference Water Index
- EVI: Enhanced Vegetation Index
- SAVI: Soil-Adjusted Vegetation Index
- CCCI: Canopy Chlorophyll Content Index
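The four indices can be sketched from their standard textbook definitions. This assumes reflectance bands scaled to [0, 1] and already resampled to a common grid; which sensor channels map to `green`, `red`, `red_edge` and `nir` is not stated on the slides, and the helper names are hypothetical.

```python
import numpy as np

def safe_ratio(num, den):
    """Element-wise num / den that returns 0 where den is exactly 0."""
    num, den = np.broadcast_arrays(np.asarray(num, dtype=np.float64),
                                   np.asarray(den, dtype=np.float64))
    out = np.zeros(num.shape, dtype=np.float64)
    np.divide(num, den, out=out, where=den != 0)
    return out

def ndwi(green, nir):
    # Normalized Difference Water Index: high over open water.
    return safe_ratio(green - nir, green + nir)

def evi(blue, red, nir):
    # Enhanced Vegetation Index with the usual coefficients (2.5, 6, 7.5, 1).
    return 2.5 * safe_ratio(nir - red, nir + 6.0 * red - 7.5 * blue + 1.0)

def savi(red, nir, soil=0.5):
    # Soil-Adjusted Vegetation Index; soil=0.5 is the common default.
    return (1.0 + soil) * safe_ratio(nir - red, nir + red + soil)

def ccci(red, red_edge, nir):
    # Canopy Chlorophyll Content Index: NDRE divided by NDVI.
    ndre = safe_ratio(nir - red_edge, nir + red_edge)
    ndvi = safe_ratio(nir - red, nir + red)
    return safe_ratio(ndre, ndvi)
```

Each function returns an array of the same shape as its inputs, so the results can be used as extra input channels or thresholded directly, for instance `ndwi(green, nir) > t` with a threshold `t` chosen by inspection.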
Neural network pipeline
- Buildings
- Misc. man-made structures
- Roads
- Track
- Trees
- Crops
Network architecture
Local boundary effects
Problem
Prediction quality decreases at the edges
Solution
Added Cropping2D layer
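Keras's Cropping2D layer trims a fixed margin from each side of a feature map, so only the tile's center, where predictions are more reliable, contributes to the output. A NumPy sketch of the same operation, assuming channels-last batches; the function name and the margin value are hypothetical, since the slides do not give the actual crop size.

```python
import numpy as np

def crop_border(batch, margin):
    """Drop `margin` pixels from every side of a (N, H, W, C) batch."""
    n, h, w, c = batch.shape
    if margin < 0 or 2 * margin >= min(h, w):
        raise ValueError("margin must be non-negative and leave a non-empty center")
    # Slice with explicit end indices so margin == 0 also works.
    return batch[:, margin:h - margin, margin:w - margin, :]

# The target masks must be cropped the same way so that predictions
# and labels stay aligned pixel for pixel.
predictions = np.zeros((2, 64, 64, 1))
masks = np.zeros((2, 64, 64, 1))
center_pred = crop_border(predictions, margin=8)
center_mask = crop_border(masks, margin=8)
```

The cost is that each tile yields a smaller output than its input, so tiles have to overlap at prediction time to cover the whole image.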
Global boundary effects
Problem
We need an integer number of tiles
Solution
Zero Padding
Problem
Zero Padding creates artifacts
Solution
Reflection Padding
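A minimal sketch of reflection padding to a whole number of tiles, using `np.pad` with `mode="reflect"`. It assumes an (H, W, C) image padded on the bottom and right only; the function names and tile size are hypothetical.

```python
import numpy as np

def pad_to_tiles(image, tile):
    """Reflect-pad an (H, W, C) image so H and W become multiples of `tile`."""
    h, w = image.shape[:2]
    pad_h = (-h) % tile  # rows missing to reach the next multiple of `tile`
    pad_w = (-w) % tile  # columns missing to reach the next multiple
    return np.pad(image, ((0, pad_h), (0, pad_w), (0, 0)), mode="reflect")

def split_tiles(image, tile):
    """Cut a padded (H, W, C) image into non-overlapping (tile, tile, C) tiles."""
    h, w, c = image.shape
    grid = image.reshape(h // tile, tile, w // tile, tile, c)
    # Bring the two grid axes together, then flatten them, row-major.
    return grid.swapaxes(1, 2).reshape(-1, tile, tile, c)
```

Unlike zero padding, the mirrored border has the same local statistics as the real image, so the network does not see an artificial black edge.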
Test time augmentation
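The slides do not say which augmentations were used. A common choice, sketched here as an assumption, is to average predictions over the eight flips and 90-degree rotations of a tile, undoing each transform before averaging. `predict` is a hypothetical callable mapping an (H, W, C) tile to an (H, W) or (H, W, K) prediction.

```python
import numpy as np

def tta_predict(predict, image):
    """Average `predict` over the 8 flips/rotations of an (H, W, C) tile."""
    outputs = []
    for k in range(4):
        for flip in (False, True):
            # Forward transform: rotate by k * 90 degrees, then mirror columns.
            aug = np.rot90(image, k, axes=(0, 1))
            if flip:
                aug = aug[:, ::-1]
            pred = predict(aug)
            # Inverse transform in reverse order: un-mirror, then rotate back.
            if flip:
                pred = pred[:, ::-1]
            pred = np.rot90(pred, -k, axes=(0, 1))
            outputs.append(pred)
    return np.mean(outputs, axis=0)
```

This multiplies inference time by eight, which is acceptable in a competition setting where accuracy is the only objective.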
Results
- Water classes => unsupervised
- Car classes => did not predict
- Other classes => U-net per class
- Input 4 indices + M-band + P-band + RGB
- Test time augmentation
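The network input listed above can be sketched as channel stacking. This assumes the lower-resolution bands are resized to the RGB grid with nearest-neighbor sampling (the interpolation actually used is not stated) and, for the example shapes, an 8-channel M-band; all names are hypothetical.

```python
import numpy as np

def resize_nearest(band, out_h, out_w):
    """Nearest-neighbor resize of an (H, W, C) array to (out_h, out_w, C)."""
    h, w = band.shape[:2]
    rows = np.arange(out_h) * h // out_h  # source row for each output row
    cols = np.arange(out_w) * w // out_w  # source column for each output column
    return band[rows[:, None], cols[None, :]]

def stack_inputs(indices, m_band, p_band, rgb):
    """Concatenate indices + M-band + P-band + RGB along the channel axis."""
    h, w = rgb.shape[:2]
    layers = [resize_nearest(x, h, w) for x in (indices, m_band, p_band, rgb)]
    return np.concatenate(layers, axis=-1)

# Example shapes only: 4 indices + 8 M-band + 1 P-band + 3 RGB = 16 channels.
x = stack_inputs(np.zeros((8, 8, 4)), np.zeros((2, 2, 8)),
                 np.zeros((8, 8, 1)), np.zeros((8, 8, 3)))
```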
Summary
Hardware
Vladimir
Core i7
RAM 32 GB
Titan X (Pascal)
Sergey
2x Xeon E5-2670
RAM 128 GB
GTX 1080
Ongoing Deep Learning competitions
- Intel & MobileODT Cervical Cancer Screening (June 21, $100,000)
- NOAA Fisheries Steller Sea Lion Population Count (June 27, $25,000)
- ImageNet 2017 (June 30)
- Planet: Understanding the Amazon from Space (July 20, $60,000)