Content generation in TensorFlow
Martyn Garcia
Mandatory disclaimer
- not imagination
- deep networks
- unique content generation
Statistical models
HyperGAN
focused on scaling
https://github.com/255BITS/HyperGAN
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3242952/images-1479408129921-18faea72-5c4a-42b3-9388-cc73ef1b2456.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3242953/images-1479408129914-ec815aad-66c8-444d-96d6-8c587eb68601.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3242957/images-1479408129905-d4cf64a2-9d19-474c-8f2d-94e66fc865aa.png)
Unsupervised learning
Supervised data for a dog dataset:
* each image labelled by breed
* hard, error prone
Unsupervised data for a dog dataset:
* scrape reddit/r/dogpics
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3243493/460849025dde416480cf88704e597fc6.jpg)
Sequence
- char rnn
- sketch
- wavenet
Real-valued
- autoencoder
- gan
- vae
- hybrid
TF Quiz
What type of layer does this function create?
tf.nn.xw_plus_b
Linear layer
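To make the quiz answer concrete, here is a minimal NumPy sketch of what a linear layer computes: a matrix multiply plus a bias. The sizes and values below are made up for illustration, and this is only an equivalent of the arithmetic, not TensorFlow's implementation.

```python
import numpy as np

def xw_plus_b(x, weights, biases):
    """NumPy equivalent of a linear (fully connected) layer: x @ W + b."""
    return np.matmul(x, weights) + biases

# Hypothetical sizes: a batch of 2 inputs with 3 features, projected to 4 units.
x = np.ones((2, 3))
weights = np.full((3, 4), 0.5)
biases = np.arange(4, dtype=np.float64)

out = xw_plus_b(x, weights, biases)
print(out.shape)  # (2, 4)
print(out[0])     # [1.5 2.5 3.5 4.5]
```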
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3244274/slack-imgs-1.com.png)
char-rnn
Character
vector
RNN
Softmax
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3243518/Screen_Shot_2016-11-17_at_2.05.00_PM.png)
char-rnn
Character vectors example
a | (1 0 0) |
b | (0 1 0) |
c | (0 0 1) |
x = np.eye(vocab)[char]
tf_x = tf.constant(x)
sess.run(tf_x)
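A self-contained sketch of the same one-hot trick, assuming `vocab` in the snippet above is the vocabulary size and `char` an integer index. The three-character vocabulary and the `one_hot` helper are hypothetical names used only for this example.

```python
import numpy as np

# Hypothetical three-character vocabulary matching the table above.
vocab = ["a", "b", "c"]
char_to_index = {ch: i for i, ch in enumerate(vocab)}

def one_hot(text):
    """Return one row of the identity matrix per character in `text`."""
    indices = [char_to_index[ch] for ch in text]
    return np.eye(len(vocab))[indices]

encoded = one_hot("cab")
print(encoded)
```

Indexing the identity matrix with a list of indices picks out one row per character, so each character becomes a vector with a single 1.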
char-rnn
RNN
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3244253/slack-imgs.com.jpeg)
tf.nn.rnn_cell.LSTMCell
tf.nn.rnn_cell.GRUCell
char-rnn
Softmax loss
net = rnn(input)
net = linear(net, vocab_size)
loss = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=net)
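What the softmax loss computes can be sketched in plain NumPy. This is an illustration of the math under the assumption that `y` holds one-hot targets; the function name and the example values are hypothetical.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Per-example cross entropy between one-hot `labels` and softmax(logits)."""
    # Subtract the row max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -(labels * log_probs).sum(axis=-1)

# Two predictions over a 3-character vocabulary.
logits = np.array([[2.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

loss = softmax_cross_entropy(logits, labels)
print(loss)
```

The second row has uniform logits, so its loss is log(3): the network is no better than guessing among three characters.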
char-rnn
Example output
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3243368/Screen_Shot_2016-11-17_at_1.35.21_PM.png)
char-rnn
Drawbacks:
Sampling
Invalid character combinations
No higher level meaning
TF QUIZ
What is this called?
net = tf.maximum(0, net)
TF QUIZ
tf.nn.relu
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3244192/imgres.jpg)
sketch
https://github.com/hardmaru/sketch-rnn
https://arxiv.org/pdf/1308.0850v5.pdf
Stroke
vector
RNN
Mixture
Density
Network
Conv
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3244198/full_padding_no_strides_transposed.gif)
Locally connected layers with shared weights
https://github.com/vdumoulin/conv_arithmetic
tf.nn.conv2d
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3240588/Screen_Shot_2016-11-17_at_12.11.04_AM.png)
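"Locally connected layers with shared weights" can be sketched as a loop: each output value looks at one small patch, and every patch is multiplied by the same kernel. This is a single-channel NumPy illustration with no padding and stride 1, not how `tf.nn.conv2d` is implemented; `conv2d_valid` is a hypothetical helper.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide one shared kernel over every local patch (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]   # local connectivity
            out[i, j] = np.sum(patch * kernel)  # same weights at every position
    return out

image = np.arange(16, dtype=np.float64).reshape(4, 4)
kernel = np.ones((3, 3))
features = conv2d_valid(image, kernel)
print(features)  # [[45. 54.] [81. 90.]]
```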
TF QUIZ
What is this called?
net = tf.maximum(net, 0.2*net)
TF QUIZ
Leaky Relu!
def lrelu(x, leak=0.2, name="lrelu"):
    return tf.maximum(x, leak*x)
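A NumPy sketch comparing the two quiz answers side by side. It assumes `0 < leak < 1`, which is what makes `max(x, leak*x)` pick `x` for positive inputs and `leak*x` for negative ones.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def lrelu(x, leak=0.2):
    # For 0 < leak < 1, max(x, leak*x) keeps x when positive, else leak*x.
    return np.maximum(x, leak * x)

x = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
print(relu(x))   # [0. 0. 0. 1. 3.]
print(lrelu(x))  # [-0.4 -0.1  0.   1.   3. ]
```

The leaky version keeps a small slope for negative inputs instead of zeroing them out.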
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3244188/url.jpg)
Wavenet
https://github.com/ibab/tensorflow-wavenet
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3243676/network.png)
Generates raw audio using a PixelCNN-inspired architecture.
Incredible results from Google DeepMind
https://deepmind.com/blog/wavenet-generative-model-raw-audio/
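The WaveNet post linked above describes stacks of dilated causal convolutions: each output sample depends only on current and past samples, and the dilation doubles per layer so the receptive field grows quickly. A toy NumPy sketch of that idea follows; the two-tap filter, the weights, and the function name are assumptions for illustration and are not taken from the tensorflow-wavenet code.

```python
import numpy as np

def causal_dilated_conv1d(signal, weights, dilation):
    """y[t] = sum_k weights[k] * signal[t - k * dilation], zero before t = 0."""
    out = np.zeros_like(signal, dtype=np.float64)
    for t in range(len(signal)):
        for k, w in enumerate(weights):
            src = t - k * dilation
            if src >= 0:  # never look at future samples
                out[t] += w * signal[src]
    return out

signal = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
weights = np.array([1.0, 1.0])

layer1 = causal_dilated_conv1d(signal, weights, dilation=1)
layer2 = causal_dilated_conv1d(layer1, weights, dilation=2)
print(layer1)  # [ 1.  3.  5.  7.  9. 11.]
print(layer2)  # [ 1.  3.  6. 10. 14. 18.]
```

After two layers the last output sums the last four input samples, so the receptive field has grown from 2 to 4 with only two-tap filters.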
Auto Encoder
Encoder
z
Generator
Loss
MSE(G(z), x)
x
tf.reduce_mean(tf.square(G(z) - x))
MSE in TensorFlow
64x64x3
128
64x64x3
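The shapes on this slide (64x64x3 in, 128-dimensional z, 64x64x3 out) and the MSE loss can be traced in a small NumPy sketch. The single random linear layers standing in for the encoder and generator are placeholders chosen for brevity, not the networks used in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shapes from the slide: 64x64x3 image -> 128-dim z -> 64x64x3 reconstruction.
image_shape = (64, 64, 3)
z_dim = 128
flat = int(np.prod(image_shape))  # 12288

# Hypothetical single linear layers standing in for the encoder and generator.
w_enc = rng.normal(scale=0.01, size=(flat, z_dim))
w_gen = rng.normal(scale=0.01, size=(z_dim, flat))

def encoder(x):
    return x.reshape(-1, flat) @ w_enc

def generator(z):
    return (z @ w_gen).reshape(-1, *image_shape)

def mse(reconstruction, target):
    # Mean of the squared per-pixel differences.
    return np.mean(np.square(reconstruction - target))

x = rng.uniform(size=(2, *image_shape))  # a batch of two fake images
z = encoder(x)
loss = mse(generator(z), x)
print(z.shape, generator(z).shape, loss)
```

Training would adjust the encoder and generator weights to push this loss toward zero.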
Auto Encoder
Encoder
z
Generator
x
Problem: This doesn't work well
VAE
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3243698/2d-visualization_720.png)
z dimensions = 2
http://github.com/255bits/hyperchamber
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3244085/TzX3I.png)
GAN
z
Generator
gloss
Discriminator
x
gloss = Xent(D(G(z)), 1)
dloss = Xent(D(x), 1) + Xent(D(G(z)), 0)
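The two GAN losses on this slide can be sketched with a plain binary cross entropy. The discriminator outputs below are made-up probabilities, and `xent` is a hypothetical helper standing in for `Xent`; a real model would compute these from logits.

```python
import numpy as np

def xent(prob, target):
    """Binary cross entropy between discriminator output `prob` and a 0/1 target."""
    eps = 1e-12  # avoid log(0)
    prob = np.clip(prob, eps, 1.0 - eps)
    return -np.mean(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob))

# Hypothetical discriminator outputs in (0, 1): probability the input is real.
d_real = np.array([0.9, 0.8])  # D(x)
d_fake = np.array([0.3, 0.1])  # D(G(z))

# Generator wants the discriminator to call its samples real.
gloss = xent(d_fake, 1.0)
# Discriminator wants D(x) -> 1 and D(G(z)) -> 0.
dloss = xent(d_real, 1.0) + xent(d_fake, 0.0)
print(gloss, dloss)
```

Here the discriminator is winning: it scores fakes low, so `dloss` is small and `gloss` is large.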
GAN
z
Generator
Discriminator
x
"This, and the variations that are now being proposed is the most interesting idea in the last 10 years in ML, in my opinion."
Yann LeCun, https://www.quora.com/What-are-some-recent-and-potentially-upcoming-breakthroughs-in-deep-learning
Other techniques
- Real NVP (invertible generator)
- Pixel RNN / CNN
- Image generation from caption
- Super Resolution
- Machine Translation
- Auto-captioning images
- Exciting papers constantly being released
- Not a comprehensive list
HyperGAN
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3244111/slack-imgs.com.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3244112/pasted_image_at_2016_11_17_04_55_pm_360.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3244123/images-1479427080042-c7206637-2b19-4278-bb7d-80a21c4a0d7e.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3244124/images-1479427080032-9e9e0921-fe72-4419-80b6-cd5d0b8651a7.png)
![](https://s3.amazonaws.com/media-p.slid.es/uploads/547107/images/3244126/images-1479427080015-3286a097-2e5c-41d1-aeb1-e9524a1a4792.png)
Q/A