Siamese Neural Networks for One-shot Image Recognition

ICML'15

Gregory Koch - Richard Zemel - Ruslan Salakhutdinov

University of Toronto

Café de la Recherche #6 @Sicara

July 26, 2018 - Antoine Toubhans

Siamese Neural Networks

One-shot learning
The Omniglot dataset
Other approaches
Siamese neural nets
Benchmark results
When should I use Siamese NN?
Keras Implementation

One-shot learning

The Omniglot dataset

a kind of internationalized MNIST
50 alphabets (40 "background" + 10 "evaluation")
- Well-established ones, like Latin, Korean, Sanskrit
- Lesser-known local alphabets
- Fictitious alphabets, e.g., Klingon
each alphabet has from 15 up to 40 characters
for each alphabet: 20 drawers each produced every character once
collected via Amazon’s Mechanical Turk

The Omniglot dataset - One-Shot learning

Learn from the 40 "background" alphabets


Naive approach: Nearest Neighbor

\mathbb{p}(x) = \text{argmin}_{c\in C} \lVert x - x_c \rVert

Accuracy of 26.5%!

(Random guessing: ~5%)
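
The nearest-neighbor rule above can be sketched in a few lines. This is only an illustration in plain Python: the `support` dictionary layout, the function names and the 4-pixel toy "images" are assumptions made for the example, not something taken from the paper.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two flattened images of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def nearest_neighbor_classify(x, support):
    """Return the class c whose single example x_c is closest to x.

    `support` maps each class label to its one example (hypothetical layout).
    """
    return min(support, key=lambda c: l2_distance(x, support[c]))


# Toy 3-way one-shot trial on made-up 4-pixel "images".
support = {
    "a": [0.0, 0.0, 1.0, 1.0],
    "b": [1.0, 1.0, 0.0, 0.0],
    "c": [1.0, 0.0, 1.0, 0.0],
}
query = [0.9, 0.8, 0.1, 0.0]
predicted = nearest_neighbor_classify(query, support)
```

With 20 classes in `support`, picking a class at random would be right about 1 time in 20, which is where the ~5% baseline comes from.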

Variational Bayesian framework (Li Fei-Fei et al., early 2000s)

First work on one-shot learning

Hierarchical Bayesian Program Learning (Lake et al., 2013)

Best accuracy achieved: 95.2%, close to humans (95.5%)
involves learning a generative model for strokes
- specific to the Omniglot dataset

The deep approach

Boltzmann Machines (LeCun et al., 2005)

Accuracy of 62%
uses a contrastive energy function
completely unsupervised

Classical CNN classifier:

Strongly overfits the "background" alphabets
- only 20 images per class
Nearest neighbor on the last layer
The hope is that the learned features will be useful for classifying an unknown alphabet

Decompose the problem into two tasks:

Verification task
- train a model to discriminate between the class identities of image pairs
- "probability" of being in the same class: \mathbb{p}_{\text{verif}}(x_1, x_2)
One-Shot Classification task
- a single example of each class
- use the verification model to predict the class of a character from an unknown alphabet

The Siamese Network approach

\mathbb{p}_{\text{classif}}(x) = \text{argmax}_{c\in C} \; \mathbb{p}_{\text{verif}}(x, x_c)

where \{x_c\}_{c\in C} is the single example available for each class and \mathbb{p}_{\text{verif}}(x_1, x_2) is the verifier's output for a pair of images.
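
A minimal sketch of this classification rule, assuming the verifier is available as a plain Python callable. The `toy_verifier` below is a made-up stand-in for a trained model (it only turns a pixel difference into a score in (0, 1]); the names and toy data are illustrative.

```python
def one_shot_classify(x, support, p_verif):
    """Return the class c that maximizes p_verif(x, x_c)."""
    return max(support, key=lambda c: p_verif(x, support[c]))


def toy_verifier(x1, x2):
    """Stand-in for a trained verifier (assumption): high score for similar inputs."""
    mean_abs_diff = sum(abs(p - q) for p, q in zip(x1, x2)) / len(x1)
    return 1.0 / (1.0 + mean_abs_diff)


# One example per class, made-up 4-pixel "images".
support = {
    "a": [0.0, 0.0, 1.0, 1.0],
    "b": [1.0, 1.0, 0.0, 0.0],
    "c": [1.0, 0.0, 1.0, 0.0],
}
query = [0.1, 0.0, 0.9, 1.0]
predicted = one_shot_classify(query, support, toy_verifier)
```

Only the ranking of the verifier scores matters here, so the scores do not need to be calibrated probabilities.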

Single Layer Siamese Net
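
As a rough sketch of what such a network computes: both images go through the same layer (shared weights), and the verification score is a sigmoid of a weighted L1 distance between the two embeddings, which is how I read the paper's prediction layer. The layer sizes, the sigmoid hidden activation, the random untrained weights and all names below are assumptions made for the illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def embed(x, weights, biases):
    """Shared twin: one fully connected layer followed by a sigmoid."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def p_verif(x1, x2, weights, biases, alpha):
    """Same-class score: sigmoid of a weighted L1 distance between embeddings."""
    h1 = embed(x1, weights, biases)  # both inputs use the SAME weights
    h2 = embed(x2, weights, biases)
    return sigmoid(sum(a * abs(u - v) for a, u, v in zip(alpha, h1, h2)))


# Untrained, randomly initialized parameters (illustrative sizes).
rng = random.Random(0)
n_in, n_hidden = 4, 3
weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
biases = [0.0] * n_hidden
alpha = [rng.uniform(-1, 1) for _ in range(n_hidden)]

x1 = [0.0, 0.0, 1.0, 1.0]
x2 = [1.0, 1.0, 0.0, 0.0]
score_same_input = p_verif(x1, x1, weights, biases, alpha)
score_pair = p_verif(x1, x2, weights, biases, alpha)
score_swapped = p_verif(x2, x1, weights, biases, alpha)
```

Because the twins share their weights and the distance is symmetric, swapping the two inputs does not change the score. In this bias-free form, two identical inputs always score exactly 0.5 whatever the weights are.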

Training the verifier

Split the "background" dataset
- train on 30 alphabets and 12 drawers
Randomly sample same and different pairs of characters
- dataset #1: 30K pairs
- dataset #2: 90K pairs
- dataset #3: 150K pairs
data augmentation: linear distortions (x8)
- augmented dataset #1: 240K pairs
- augmented dataset #2: 720K pairs
- augmented dataset #3: 1.2M pairs
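
The pair sampling step could look like the sketch below. The 50/50 balance between same and different pairs, the `images_by_class` layout and the toy data (images replaced by `(character, drawer)` ids) are assumptions for the example; the slides only say that pairs are sampled at random.

```python
import random


def sample_pairs(images_by_class, n_pairs, rng):
    """Sample (x1, x2, label) triples: label 1 for same class, 0 for different."""
    classes = sorted(images_by_class)
    pairs = []
    for i in range(n_pairs):
        if i % 2 == 0:
            # Same pair: two distinct drawings of one character.
            c = rng.choice(classes)
            x1, x2 = rng.sample(images_by_class[c], 2)
            pairs.append((x1, x2, 1))
        else:
            # Different pair: one drawing from each of two distinct characters.
            c1, c2 = rng.sample(classes, 2)
            x1 = rng.choice(images_by_class[c1])
            x2 = rng.choice(images_by_class[c2])
            pairs.append((x1, x2, 0))
    return pairs


# Toy data: 5 characters x 12 drawers, each "image" is a (character, drawer) id.
images_by_class = {c: [(c, d) for d in range(12)] for c in "abcde"}
pairs = sample_pairs(images_by_class, 1000, random.Random(0))
```

Passing an explicit `random.Random` instance keeps the sampled dataset reproducible from one run to the next.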

Hyperparameter optimization

Bayesian optimization framework
The paper's authors used Whetlab: https://www.crunchbase.com/organization/whetlab
Suggested by Quentin: HyperOpt, Distributed Asynchronous Hyperparameter Optimization in Python: https://github.com/hyperopt/hyperopt

When should I use a Siamese Net?

One-Shot learning
Many classes, few elements in each class
A reasonable dataset for training the verification model
- ~ [15-40 characters] x [40 alphabets] = ~1K classes
- ~ 20 examples per class
- The Omniglot task is pretty simple though; a real task would require more data
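
To make the order of magnitude explicit, the bounds implied by the figures above can be worked out directly. This only uses the numbers on this slide (40 alphabets, 15 to 40 characters each, 20 examples per class); the exact Omniglot counts are not computed here.

```python
n_alphabets = 40
chars_min, chars_max = 15, 40
examples_per_class = 20

# Lower and upper bounds on the number of classes (characters).
classes_min = n_alphabets * chars_min  # 600
classes_max = n_alphabets * chars_max  # 1600

# Corresponding bounds on the number of training images.
images_min = classes_min * examples_per_class  # 12000
images_max = classes_max * examples_per_class  # 32000
```

The "~1K classes" estimate sits between the two bounds, which puts the dataset in the range of tens of thousands of images.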

These are hypotheses! They should be tested empirically!

Resources

The paper: https://www.cs.cmu.edu/~rsalakhu/papers/oneshot1.pdf

A blog post about Siamese Networks: https://sorenbouma.github.io/blog/oneshot/

A Keras implementation: https://github.com/Goldesel23/Siamese-Networks-for-One-Shot-Learning
