Arvin Liu @ AISS 2020
[Diagram: basic ML pipeline — Data (Input) → Model (Function) → Output (Output); the Output is compared against the Expected (Answer) by the Loss Function, and the Optimizer updates the Model.]
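The pipeline above (Data → Model → Output, scored against the Expected answer by a Loss Function, then updated by an Optimizer) can be sketched with a toy one-parameter model; all names and numbers here are illustrative, not part of the slides.

```python
# Toy pipeline: Model is y = w * x, Loss is squared error,
# Optimizer is plain gradient descent on w. Illustrative only.

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (input, expected) pairs; true w = 2
w = 0.0     # model parameter
lr = 0.05   # hyper-parameter (learning rate)

for step in range(100):
    for x, expected in data:
        output = w * x                       # Model (Function)
        grad = 2 * (output - expected) * x   # d(loss)/dw for loss = (output - expected)**2
        w -= lr * grad                       # Optimizer updates the Model

print(round(w, 3))  # converges to 2.0
```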
e.g., pictures, music, voice, etc.
[Diagram: generation in the ML pipeline — Input: ??? (a random vector z); Model: the Generator; Output: generated data; Expected (Answer): ???; we have hyper-params and real data, but the loss L is unclear.]
[Diagram: Model / Generator — Skill Points -G→ Character]
image source: Prof. Hung-yi Lee's slides
Given skill points, generate something.
[Diagram: discriminator in the ML pipeline — Data (Input) → Model (Function) → Output (Output) vs. Expected (Answer).]
[Diagram: the Model is now the Discriminator — Data (Input) → Model / Discriminator → Score (Output): how real is the data; trained with real data (and random generated data), hyper-params, and loss L.]
[Diagram: Model / Discriminator — Data (Input) → Discriminator → Score (Output): how real is the data, in [0, 1]; example scores: 0.01, 0.5, 0.9.]
[Diagram: GAN — random vector (z) → Generator (G) → Generated Data; Generated Data and Real Data → Discriminator (D) → Score; Loss + Optimizer(G) updates G, Loss + Optimizer(D) updates D.]
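In formula form, the diagram above is the standard GAN minimax objective (Goodfellow et al., 2014): D maximizes it (score real data high, generated data low), while G minimizes it.

```latex
\min_G \max_D V(D, G) =
\mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```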
[Diagram: GAN as Reinforcement Learning —
  Data (Input)     ↔ Environment ↔ Random vector
  Model (Function) ↔ Agent       ↔ Generator
  Output (Output)  ↔ Action      ↔ Fake Data
  Loss Function    ↔ Reward      ↔ Discriminator's score
both trained by an Optimizer.]
(If it cannot be opened, try opening it in incognito mode.)
image shape: (28, 28), grayscale
from torchvision import transforms
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.5,), std=(0.5,))])
mnist = MNIST(root='./data/', train=True, transform=transform, download=True)
data_loader = DataLoader(dataset=mnist, batch_size=32, shuffle=True)
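ToTensor scales pixels into [0, 1], and Normalize(mean=0.5, std=0.5) then shifts them into [-1, 1] — the same range as the generator's Tanh output. A pure-Python illustration of what that transform computes (not torchvision code):

```python
# Normalize(mean=0.5, std=0.5) maps each pixel value v to (v - 0.5) / 0.5,
# i.e. from [0, 1] into [-1, 1].
def normalize(v, mean=0.5, std=0.5):
    return (v - mean) / std

print(normalize(0.0))  # -1.0 (black pixel)
print(normalize(1.0))  #  1.0 (white pixel)
```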
import torch
import torch.nn as nn

# Discriminator
D = nn.Sequential(
    nn.Linear(28*28, 256),
    nn.LeakyReLU(0.2),
    nn.Linear(256, 256),
    nn.LeakyReLU(0.2),
    nn.Linear(256, 1),
    nn.Sigmoid()).cuda()

# Generator
G = nn.Sequential(
    nn.Linear(64, 256),
    nn.ReLU(),
    nn.Linear(256, 256),
    nn.ReLU(),
    nn.Linear(256, 28*28),
    nn.Tanh()).cuda()
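As a sanity check on the two networks above, the parameter count of each nn.Linear layer is just weights plus biases; this arithmetic sketch (names are illustrative) reproduces the sizes of D and G:

```python
# An nn.Linear(n_in, n_out) layer has n_in * n_out weights + n_out biases.
def linear_params(n_in, n_out):
    return n_in * n_out + n_out

# Discriminator: 784 -> 256 -> 256 -> 1
d_params = linear_params(28*28, 256) + linear_params(256, 256) + linear_params(256, 1)
# Generator: 64 -> 256 -> 256 -> 784
g_params = linear_params(64, 256) + linear_params(256, 256) + linear_params(256, 28*28)

print(d_params)  # 267009
print(g_params)  # 283920
```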
Activations: the Discriminator uses Leaky ReLU, the Generator uses ReLU; G ends with Tanh — value in (-1, 1) — while D ends with Sigmoid — value in (0, 1).
# Binary cross entropy loss and optimizers
criterion = nn.BCELoss()
d_optimizer = torch.optim.Adam(D.parameters(), lr=0.0002)
g_optimizer = torch.optim.Adam(G.parameters(), lr=0.0002)

for epoch in range(100):
    for images, _ in data_loader:
        batch_size = images.shape[0]
        images = images.view(batch_size, 784).cuda()
        real_labels = torch.ones(batch_size, 1).cuda()
        fake_labels = torch.zeros(batch_size, 1).cuda()
        train_discriminator()  # defined below; uses images, real_labels, fake_labels
        train_generator()      # defined below; uses real_labels
def train_generator():
    z = torch.randn(batch_size, 64).cuda()
    fake_image = G(z)
    # G wants D to score its fakes as real, so compare against real_labels
    g_loss = criterion(D(fake_image), real_labels)
    g_optimizer.zero_grad()
    g_loss.backward()
    g_optimizer.step()
[Diagram: generator step — random vector (z) → Generator (G) → Generated Data → Discriminator (D) → Score → loss → Optimizer(G).]
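Using BCELoss with all-ones real_labels on D(G(z)) makes g_loss exactly the standard non-saturating generator loss:

```latex
\mathcal{L}_G = -\,\mathbb{E}_{z \sim p_z}\big[\log D(G(z))\big]
```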
def train_discriminator():
    z = torch.randn(batch_size, 64).cuda()
    # D should score generated data as 0 and real images as 1
    d_loss_fake = criterion(D(G(z)), fake_labels)
    d_loss_real = criterion(D(images), real_labels)
    d_loss = d_loss_real + d_loss_fake
    d_optimizer.zero_grad()
    d_loss.backward()
    d_optimizer.step()
[Diagram: discriminator step — Generated Data and Real Data → Discriminator (D) → Score → loss → Optimizer(D).]
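Likewise, summing the two BCE terms above makes d_loss the standard discriminator loss:

```latex
\mathcal{L}_D = -\,\mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
- \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```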
I think it's useless...
[Diagram: conditional GAN — random vector (z) + condition (c) → Generator (G) → Generated Data.]
Applications — Condition → Generated Data. For example, TTS (Text2Speech): condition = text (e.g., 安安, "hi") → generated data = voice; or Style Transfer.
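One common way to feed the condition (c) into the generator is simply to concatenate it with the random vector (z); the dimensions and helper below are illustrative, not from the slides:

```python
# Sketch of a conditional generator input: concatenate a 64-dim z with a
# one-hot class condition c, so G's first layer would take 64 + 10 values.
def one_hot(label, num_classes=10):
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

z = [0.0] * 64       # stands in for a random vector
c = one_hot(3)       # condition: e.g. "generate a 3"
g_input = z + c      # what a conditional G would consume
print(len(g_input))  # 74
```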