Discrete Autoencoders
Piotr Kozakowski
maximize the ELBO:
$\mathcal{L}(\theta, \phi) = \mathbb{E}_{z \sim q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] - D_{KL}\left(q_\phi(z \mid x) \,\|\, p(z)\right)$
Training procedure:
1. Encode: compute $q_\phi(z \mid x)$.
2. Sample $z \sim q_\phi(z \mid x)$.
3. Decode: compute $p_\theta(x \mid z)$.
4. Maximize the ELBO by gradient ascent.
Commonly, $q_\phi(z \mid x) = \mathcal{N}\left(\mu_\phi(x), \sigma_\phi(x)\right)$ and $p(z) = \mathcal{N}(0, I)$.
How to backpropagate through the sampling step $z \sim q_\phi(z \mid x)$?
Reparametrization trick: let $z = \mu_\phi(x) + \sigma_\phi(x) \odot \epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$.
Backpropagate as usual, treating $\epsilon$ as a constant.
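A minimal PyTorch sketch of this reparametrized sampling step (the function name is hypothetical, not from the slides):

import torch

def reparametrize(mu, log_sigma):
    # z = mu + sigma * eps with eps ~ N(0, I): z is differentiable w.r.t.
    # mu and sigma, while eps is a constant for backpropagation.
    eps = torch.randn_like(mu)
    return mu + torch.exp(log_sigma) * eps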
Discrete latent variables in model-based RL:
Corneil et al. - Efficient Model-Based Deep Reinforcement Learning with Variational State Tabulation (2018)
Kaiser et al. - Model-Based Reinforcement Learning for Atari (2019)
Reparametrization trick for the categorical distribution (Gumbel-max):
$z = \mathrm{one\_hot}\left(\arg\max_i \left(g_i + \log \pi_i\right)\right)$ with $g_i \sim \mathrm{Gumbel}(0, 1)$ generates a sample $z \sim \mathrm{Categorical}(\pi)$.
Still can't backpropagate, though: the $\arg\max$ is not differentiable.
Jang et al. - Categorical Reparameterization with Gumbel-Softmax (2016)
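A small sketch of the Gumbel-max trick (a hypothetical helper; here logits play the role of $\log \pi$ up to a constant):

import torch

def gumbel_max_sample(logits):
    # g ~ Gumbel(0, 1) via inverse transform sampling of U ~ Uniform(0, 1).
    g = -torch.log(-torch.log(torch.rand_like(logits)))
    # The argmax of the perturbed logits is an exact Categorical sample,
    # but the argmax blocks gradients.
    return torch.argmax(logits + g, dim=-1)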
Approximate the $\arg\max$ with a softmax:
$y_i = \frac{\exp\left(\left(g_i + \log \pi_i\right) / \tau\right)}{\sum_j \exp\left(\left(g_j + \log \pi_j\right) / \tau\right)}$
Temperature annealing: as $\tau \to 0$, $y$ approaches a one-hot sample.
The softmax is differentiable - can backpropagate!
Drop-in replacement for the normal distribution in a VAE:
import torch
import torch.nn.functional as F

def sample_gumbel(shape, eps=1e-20):
    # Gumbel(0, 1) noise via inverse transform sampling: -log(-log(U)),
    # U ~ Uniform(0, 1); eps guards against log(0).
    u = torch.rand(shape)
    return -torch.log(-torch.log(u + eps) + eps)

def gumbel_softmax(logits, temperature):
    # Perturb logits with Gumbel noise and relax the argmax into a softmax.
    y = logits + sample_gumbel(logits.size())
    return F.softmax(y / temperature, dim=-1)
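A hypothetical usage sketch with exponential temperature annealing (the schedule constants are illustrative only):

logits = torch.randn(8, 10)                      # batch of categorical logits
for step in range(1000):
    tau = max(0.1, 0.999 ** step)                # anneal from 1.0 toward 0.1
    y = gumbel_softmax(logits, temperature=tau)  # soft one-hot, differentiable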
Improved semantic hashing: discretize the latent half of the time, but backpropagate as if it was not discretized.
Binary latent variables: $v = \sigma'(h + \epsilon)$ with $\epsilon \sim \mathcal{N}(0, 1)$, where $\sigma'(x) = \max\left(0, \min\left(1, 1.2\,\sigma(x) - 0.1\right)\right)$ is a saturating sigmoid.
The noise forces $v$ to extreme values.
Kaiser et al - Discrete Autoencoders for Sequence Models, 2018
No probabilistic interpretation and no KL loss - hence no prior to sample the latent from.
Solution: predict the latent autoregressively as a sequence of bits using an LSTM, several bits at a time (see the sketch below).
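A minimal sketch of such an autoregressive prior, under assumed sizes (a 96-bit latent predicted 8 bits per LSTM step); all names and sizes here are hypothetical, not taken from the slides:

import torch
import torch.nn as nn

class LatentPrior(nn.Module):
    # Models a 96-bit latent as 12 chunks of 8 bits, one chunk per LSTM step.
    def __init__(self, bits=96, bits_per_step=8, hidden=256):
        super().__init__()
        self.steps = bits // bits_per_step
        self.vocab = 2 ** bits_per_step                    # 256 possible chunks
        self.embed = nn.Embedding(self.vocab + 1, hidden)  # +1 for <start>
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, self.vocab)

    def sample(self, batch_size):
        tokens = torch.full((batch_size, 1), self.vocab)   # <start> token
        state, chunks = None, []
        for _ in range(self.steps):
            h, state = self.lstm(self.embed(tokens), state)
            probs = torch.softmax(self.head(h[:, -1]), dim=-1)
            tokens = torch.multinomial(probs, 1)           # next 8-bit chunk
            chunks.append(tokens)
        return torch.cat(chunks, dim=1)  # chunk indices; unpack into bits to decode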
Source: Tensor2Tensor
import torch

def saturating_sigmoid(logits):
    # Sigmoid stretched by 1.2 and shifted down by 0.1, clipped to [0, 1]:
    # saturates exactly at 0 and 1 for moderately large |logits|.
    return torch.clamp(
        1.2 * torch.sigmoid(logits) - 0.1, min=0, max=1
    )

def mix(a, b, prob=0.5):
    # Elementwise: take a with probability prob, otherwise b.
    mask = (torch.rand_like(a) < prob).float()
    return mask * a + (1 - mask) * b

def improved_semantic_hashing(logits, noise_std=1):
    # Gaussian noise pushes the activations toward the saturated regions.
    noise = torch.normal(
        mean=torch.zeros_like(logits), std=noise_std
    )
    noisy_logits = logits + noise
    continuous = saturating_sigmoid(noisy_logits)
    # Straight-through estimator: the forward pass uses hard 0/1 values,
    # the backward pass uses the gradient of the continuous relaxation.
    discrete = (
        (noisy_logits > 0).float() +
        continuous - continuous.detach()
    )
    # Use the discretized value half of the time.
    return mix(continuous, discrete)
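Hypothetical usage on encoder pre-activations (shapes are illustrative):

logits = torch.randn(4, 96)                # e.g. a batch of encoder outputs
codes = improved_semantic_hashing(logits)  # in [0, 1]; hard 0/1 half the time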
Gumbel-softmax (GS) vs improved semantic hashing (ISH):

Experiment 1:
Procedure:
Metric: binary cross-entropy

Experiment 2:
Procedure:
Metric: Inception score $\exp\left(\mathbb{E}_{x \sim G}\, D_{KL}\left(C(y \mid x) \,\|\, \mathbb{E}_{x' \sim G}\left[C(y \mid x')\right]\right)\right)$ for a generator $G$ and pretrained classifier $C$ (a sketch follows the plots below)
[Plot: binary cross-entropy over training - GS, 30 x 10 vs ISH, 96 x 2]
[Plot: Inception score over training - GS, 16 x 2 vs ISH, 32 x 2]
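A rough sketch of computing the Inception score from classifier outputs (assuming probs holds $C(y \mid x)$ for a batch of generated samples; this helper is not from the author's code):

import torch

def inception_score(probs, eps=1e-12):
    # probs: (N, num_classes), row i is C(y | x_i) for a generated sample x_i.
    p_y = probs.mean(dim=0, keepdim=True)  # marginal p(y) = E_x[C(y | x)]
    kl = (probs * (torch.log(probs + eps) - torch.log(p_y + eps))).sum(dim=1)
    return torch.exp(kl.mean())            # exp(E_x KL(C(y | x) || p(y)))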
Speaker: Piotr Kozakowski
Presentation: https://slides.com/piotrkozakowski/discrete-autoencoders
Code: https://github.com/koz4k/gumbel-softmax-vs-discrete-ae
References:
Jang et al. - Categorical Reparameterization with Gumbel-Softmax (2016)
Kaiser et al. - Discrete Autoencoders for Sequence Models (2018)
Corneil et al. - Efficient Model-Based Deep Reinforcement Learning with Variational State Tabulation (2018)
Kaiser et al. - Model-Based Reinforcement Learning for Atari (2019)