Albumentations: fast and flexible image augmentations

Vladimir Iglovikov

Sr. Computer Vision Engineer at Lyft

Ph.D. in Physics

Kaggle Grandmaster

Computer Vision Tasks

Input

Output

Need for the data

Problems with the labeled data

  • Expensive to collect
  • Hard to label
  • Legal or privacy problems

Augmentations

  1. Synthetically increase dataset.
  2. Make network invariant to transforms (rotations, brightness, flips, etc).
  3. Work as a regularizer.
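
The three roles above can be illustrated with a minimal NumPy sketch. It is not Albumentations code: the `augment` helper, the 0.5 flip probability and the brightness range of ±30 are assumptions chosen for illustration. Each call yields a new variant of the same labeled image.

```python
import numpy as np


def augment(image, rng):
    """Return a randomly flipped, brightness-shifted copy of an HxWxC uint8 image."""
    out = image.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1]  # horizontal flip
    shift = int(rng.integers(-30, 31))  # random brightness shift (assumed range)
    # Widen the dtype before adding so the shift cannot wrap around.
    return np.clip(out.astype(np.int16) + shift, 0, 255).astype(np.uint8)


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

# One labeled image becomes eight training samples that share the same label.
variants = [augment(image, rng) for _ in range(8)]
```

Because a fresh variant is drawn on every call, the network rarely sees exactly the same pixels twice, which is where the regularizing effect comes from.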

Augmentations vs Heavier architecture

Albumentations

  • Used in almost all winning solutions to CV problems at Kaggle.

  • Used in academia: 32+ citations

  • Used in industry

Who works on Albumentations?

Alexander Buslaev

Mapbox, Belarus

Kaggle Master

Vladimir Iglovikov

Lyft, USA

Kaggle Grandmaster

Alex Parinov

X5, Russia

Kaggle Master

Eugene Khvedchenia

Jumio, Ukraine

Kaggle Grandmaster

Mikhail Druzhinin

Simicon, Russia

and many contributors on GitHub

CPU becomes bottleneck => we want fast transforms
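
One way to check whether a transform keeps up with the GPU is to measure its throughput on a single core. The sketch below is an assumption-laden illustration, not the library's benchmark: `images_per_second` is a hypothetical helper, and the NumPy flip stands in for any transform.

```python
import time

import numpy as np


def images_per_second(transform, image, repeats=200):
    """Time `transform` on one image and return the measured throughput."""
    start = time.perf_counter()
    for _ in range(repeats):
        transform(image)
    elapsed = time.perf_counter() - start
    return repeats / max(elapsed, 1e-9)  # guard against a zero reading


image = np.zeros((512, 512, 3), dtype=np.uint8)


def flip(img):
    # Force a real copy so the cost of moving pixels is actually measured.
    return np.ascontiguousarray(img[:, ::-1])


rate = images_per_second(flip, image)
print(f"{rate:.0f} images/s on one CPU core")
```

If the measured rate is below what the GPU consumes per second, the augmentation pipeline is the bottleneck.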

 

We want different targets to be transformed in sync

Original

Transformed

import albumentations as A

# Each transform fires with its own probability p;
# the whole pipeline is applied with probability 0.1984.
strong = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomSizedCrop((700, 900), 600, 600),
    A.RGBShift(p=0.35),
    A.Blur(blur_limit=11, p=0.31415),
    A.RandomBrightness(p=0.2019),
    A.CLAHE(p=0.666),
], p=0.1984)

  1. image
  2. bounding boxes
  3. mask
  4. keypoints
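
What "in sync" means for the four targets above can be sketched with a plain NumPy horizontal flip. This is an illustration, not the library's implementation; `hflip_targets` is a hypothetical helper, boxes are assumed to be `(x_min, y_min, x_max, y_max)` in pixels with an exclusive maximum, and keypoints are assumed to be `(x, y)` pixel indices.

```python
import numpy as np


def hflip_targets(image, mask, bboxes, keypoints):
    """Flip an image horizontally and move every annotation with it."""
    width = image.shape[1]
    flipped_image = image[:, ::-1]
    flipped_mask = mask[:, ::-1]
    # Box edges swap roles: the old right edge becomes the new left edge.
    flipped_bboxes = [
        (width - x_max, y_min, width - x_min, y_max)
        for x_min, y_min, x_max, y_max in bboxes
    ]
    # A pixel index x maps to width - 1 - x.
    flipped_keypoints = [(width - 1 - x, y) for x, y in keypoints]
    return flipped_image, flipped_mask, flipped_bboxes, flipped_keypoints


image = np.zeros((4, 6, 3), dtype=np.uint8)
mask = np.zeros((4, 6), dtype=np.uint8)
mask[1, 1] = 1  # a single labeled pixel at row 1, column 1

new_image, new_mask, new_bboxes, new_keypoints = hflip_targets(
    image, mask, bboxes=[(1, 1, 3, 2)], keypoints=[(1, 1)]
)
```

The labeled pixel, the box and the keypoint all end up on the mirrored side of the image, so the annotations still describe the same object.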

 

Pixel-level transforms

38+ transforms

Original

RandomRain

RandomSnow

RandomFog

Spatial Transforms

Support masks, bounding boxes, keypoints

33+ transforms
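
The difference between the two families can be sketched in NumPy. This is a simplified illustration, not library code; `brighten` and `rotate90` are hypothetical helpers. A pixel-level transform changes only image values, so the mask stays as it is. A spatial transform moves pixels, so the mask has to move with the image.

```python
import numpy as np


def brighten(image, shift=40):
    """Pixel-level: only the image values change, the geometry does not."""
    return np.clip(image.astype(np.int16) + shift, 0, 255).astype(np.uint8)


def rotate90(image, mask):
    """Spatial: image and mask are rotated together to stay aligned."""
    return np.rot90(image), np.rot90(mask)


image = np.full((2, 3, 3), 100, dtype=np.uint8)
mask = np.array([[1, 0, 0],
                 [0, 0, 0]], dtype=np.uint8)

bright_image = brighten(image)             # mask is reused unchanged
rot_image, rot_mask = rotate90(image, mask)  # both targets change shape
```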

Key points

Multiple targets

We want to apply the same transform to:

  • N images
  • M masks
  • K bounding boxes
  • L keypoints

simultaneously.

import albumentations as A

aug = A.Compose(
    transformations,
    p=0.42,
    additional_targets={
        'image1': 'image',
        ...
        'imageN': 'image',
        'mask1': 'mask',
        ...
        'maskM': 'mask',
        'bboxes1': 'bboxes',
        ...
        'bboxesK': 'bboxes',
        'keypoints1': 'keypoints',
        ...
        'keypointsL': 'keypoints',
    })

transformed = aug(image=img1, image2=img2, ...)

Serialization / Deserialization

transform = A.Compose([
    A.RandomCrop(768, 768),
    A.OneOf([
        A.RGBShift(), 
        A.HueSaturationValue()
    ]),
])
# to yaml
A.save(transform, 'transform.yml', data_format='yaml')

# to json
A.save(transform, 'transform.json')

# to python dictionary
transform_dict = A.to_dict(transform)

# Load from yaml
loaded_transform = A.load('transform.yml', data_format='yaml')

# Load from json
loaded_transform = A.load('transform.json')

# Load from dictionary
loaded_transform = A.from_dict(transform_dict)

Documentation

Deep Learning Competitions

Topcoder Urban3D Challenge

Image

Mask

 

Instance Segmentation

1st place

Alexander Buslaev

Augmentations:

OpticalDistortion, Grid, Flip, ShiftScaleRotate, Transpose, HueSaturationValue

Topcoder: Spacenet 3 challenge

Semantic Segmentation

1st place

Alexander Buslaev

Augmentations:

Flip, ShiftScaleRotate, Transpose, HueSaturationValue

Kaggle: Carvana Image Masking Challenge

Semantic Segmentation

1st place

Vladimir Iglovikov, Alexander Buslaev,

Artem Sanakoyeu

Augmentations:

HorizontalFlip, ShiftScaleRotate, RandomBrightness, HueSaturationValue

Input

Output

Kaggle: Data Science Bowl 2018

Instance Segmentation

1st place

Alexander Buslaev,

Selim Seferbekov

Viktor Durnov

Augmentations:

CLAHE, IAASharpen, IAAEmboss, IAAAdditiveGaussianNoise, ToGray, InvertImg, RandomRotate90, Flip, Transpose, MotionBlur, MedianBlur, Blur, RandomContrast, RandomBrightness, ShiftScaleRotate, OpticalDistortion, GridDistortion, ElasticTransform, IAAPerspective, IAAPiecewiseAffine, HueSaturationValue, ChannelShuffle

Input

Output

Kaggle: Inclusive Images

Multilabel classification

1st place

Pavel Ostyakov

Augmentations:

RandomRotate90, Flip, Transpose, GaussNoise, MedianBlur, ShiftScaleRotate, RandomBrightness, HueSaturationValue.

Adapting Convolutional Neural Networks for Geographical Domain Shift, https://arxiv.org/abs/1901.06345

Kaggle: TGS Salt Identification Challenge

Semantic Segmentation

1st place

Eugene Babakhin,

phalanx

Augmentations:

HorizontalFlip, RandomBrightness, RandomContrast, ShiftScaleRotate.

 

Kaggle: SIIM-ACR Pneumothorax Segmentation

Semantic Segmentation

1st place

 

Aimoldin Anuar

Augmentations:

HorizontalFlip, RandomContrast, RandomGamma, RandomBrightness, ElasticTransform, GridDistortion, OpticalDistortion, ShiftScaleRotate

 

Kaggle: iMaterialist (Fashion) CVPR 2019: FGVC6

Instance Segmentation

1st place

Miras Amir

Augmentations:

HorizontalFlip, CutOut, RandomBrightnessContrast, JpegCompression

 

Input

Output

Future plans

  • AutoAugment
  • Transforms on GPU
  • 3D images and LiDAR point clouds
  • Extra tools for tuning augmentations

Contributors

Thank you!
