Albumentations: fast and flexible image augmentations
Computer Vision Tasks
Input
Output
Need for the data
Problems with the labeled data
- Expensive to collect
- Hard to label
- Legal or privacy problems
Augmentations
- Synthetically increase the size of the dataset.
- Make the network invariant to transforms (rotations, brightness changes, flips, etc.).
- Act as a regularizer.
Augmentations vs Heavier architecture
Albumentations
- Used in almost all winning solutions to CV problems on Kaggle.
- Used in academia: 32+ citations
- Used in industry
Who works on Albumentations?
Alexander Buslaev
Mapbox, Belarus
Kaggle Master
Vladimir Iglovikov
Lyft, USA
Kaggle Grandmaster
Alex Parinov
X5, Russia
Kaggle Master
Eugene Khvedchenia
Jumio, Ukraine
Kaggle Grandmaster
Mikhail Druzhinin
Simicon, Russia
and many contributors on GitHub
CPU becomes bottleneck => we want fast transforms
We want different targets to be transformed in sync
Original
Transformed
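import albumentations as A

# The whole pipeline is applied with probability p=0.1984;
# each transform inside it then fires with its own probability.
strong = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomSizedCrop((700, 900), 600, 600),
    A.RGBShift(p=0.35),
    A.Blur(blur_limit=11, p=0.31415),
    A.RandomBrightness(p=0.2019),
    A.CLAHE(p=0.666),
], p=0.1984)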
- image
- bounding boxes
- mask
- keypoints
Pixel-level transforms
38+ transforms
Original
RandomRain
RandomSnow
RandomFog
Spatial transforms
Support masks, bounding boxes, keypoints
33+ transforms
Key points
Multiple targets
We want to apply the same transform to:
- N images
- M masks
- K bounding boxes
- L keypoints
simultaneously.
import albumentations as A

# Each extra target name is mapped to the type it should be treated as.
aug = A.Compose(transformations,
                p=0.42,
                additional_targets={
                    'image1': 'image',
                    ...
                    'imageN': 'image',
                    'bboxes1': 'bboxes',
                    ...
                    'bboxesK': 'bboxes',
                    'keypoints1': 'keypoints',
                    ...
                    'keypointsL': 'keypoints',
                    'mask1': 'mask',
                    ...
                    'maskM': 'mask'
                })

transformed = aug(image=img1, image2=img2, ...)
Serialization / Deserialization
import albumentations as A

transform = A.Compose([
    A.RandomCrop(768, 768),
    A.OneOf([
        A.RGBShift(),
        A.HueSaturationValue()
    ]),
])

# to yaml
A.save(transform, 'transform.yml', data_format='yaml')
# to json (the default data format)
A.save(transform, 'transform.json')
# to python dictionary
transform_dict = A.to_dict(transform)

# Load from yaml
loaded_transform = A.load('transform.yml', data_format='yaml')
# Load from json
loaded_transform = A.load('transform.json')
# Load from dictionary
loaded_transform = A.from_dict(transform_dict)
Documentation
Deep Learning Competitions
Topcoder Urban3D Challenge
Image
Mask
Instance Segmentation
1st place
Alexander Buslaev
Augmentations:
OpticalDistortion, Grid, Flip, ShiftScaleRotate, Transpose, HueSaturationValue
Topcoder: Spacenet 3 challenge
Semantic Segmentation
1st place
Alexander Buslaev
Augmentations:
Flip, ShiftScaleRotate, Transpose, HueSaturationValue
Kaggle: Carvana Image Masking Challenge
Semantic Segmentation
1st place
Vladimir Iglovikov, Alexander Buslaev, Artem Sanakoyeu
Augmentations:
HorizontalFlip, ShiftScaleRotate, RandomBrightness, HueSaturationValue
Input
Output
Kaggle: Data Science Bowl 2018
Instance Segmentation
1st place
Alexander Buslaev, Selim Seferbekov, Viktor Durnov
Augmentations:
CLAHE, IAASharpen, IAAEmboss, IAAAdditiveGaussianNoise, ToGray, InvertImg, RandomRotate90, Flip, Transpose, MotionBlur, MedianBlur, Blur, RandomContrast, RandomBrightness, ShiftScaleRotate, OpticalDistortion, GridDistortion, ElasticTransform, IAAPerspective, IAAPiecewiseAffine, HueSaturationValue, ChannelShuffle
Input
Output
Kaggle: Inclusive Images
Multilabel classification
1st place
Pavel Ostyakov
Augmentations:
RandomRotate90, Flip, Transpose, GaussNoise, MedianBlur, ShiftScaleRotate, RandomBrightness, HueSaturationValue.
Adapting Convolutional Neural Networks for Geographical Domain Shift, https://arxiv.org/abs/1901.06345
Kaggle: TGS Salt Identification Challenge
Semantic Segmentation
1st place
Eugene Babakhin, phalanx
Augmentations:
HorizontalFlip, RandomBrightness, RandomContrast, ShiftScaleRotate.
Kaggle: SIIM-ACR Pneumothorax Segmentation
Semantic Segmentation
1st place
Aimoldin Anuar
Augmentations:
HorizontalFlip, RandomContrast, RandomGamma, RandomBrightness, ElasticTransform, GridDistortion, OpticalDistortion, ShiftScaleRotate
Kaggle: iMaterialist (Fashion) CVPR 2019: FGVC6
Instance Segmentation
1st place
Miras Amir
Augmentations:
HorizontalFlip, Cutout, RandomBrightnessContrast, JpegCompression
Input
Output
Future plans
- AutoAugment
- Transforms on GPU
- 3D images and LiDAR point clouds
- Extra tools for tuning augmentations
Contributors
Thank you!
LinkedIn: https://www.linkedin.com/in/iglovikov/
Twitter: https://twitter.com/viglovikov
Albumentations for Adobe
By Vladimir Iglovikov
Presentation for Deep Learning Tech Talks/Idea Pitching for Lets Stop Wildfires Hackathon (https://www.meetup.com/AI-for-Mankind/events/265983290/)