Flux.jl
Relax! Flux is the ML library that doesn't make you tensor http://fluxml.ai/
About Me
- Student
Why Julia?
Better performance
- JIT compiler
- faster pre- & post-processing
Better syntax
- multiple dispatch (see the sketch below)
- metaprogramming
Write elegant code with high performance
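A minimal sketch of multiple dispatch (a toy `area` function, unrelated to Flux): you write one generic function, and Julia picks the method from the runtime types of the arguments.
abstract type Shape end
struct Circle <: Shape; r::Float64 end
struct Square <: Shape; s::Float64 end

# one generic function, one method per concrete type
area(c::Circle) = π * c.r^2
area(sq::Square) = sq.s^2

area(Circle(1.0)) # ≈ 3.14159
area(Square(2.0)) # 4.0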
WARNING
Situation around v0.7
Julia v0.7 is at the RC stage; packages still need to be updated
- documentation is a mess
Julia v1.0 just released!!
Other Frameworks
Dynamic
- PyTorch
- TensorFlow Eager
- MXNet
Static
- Theano
- TensorFlow
- Caffe
- MXNet
Static vs. Dynamic
Automatic Differentiation
(AD)
Static
creates the computation graph beforehand
everything needs to be specified up front, e.g. shapes, graph structure ...
Dynamic
builds the computation graph as the data passes forward
everything can be dynamic
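A tiny illustration of why dynamic AD matters (a hypothetical `pow` written with data-dependent recursion; Flux's Tracker just records whichever code path actually runs):
using Flux.Tracker

# recursion depth is decided at run time; the tape records the path taken
pow(x, n) = n <= 0 ? 1 : x * pow(x, n - 1)

Tracker.gradient(x -> pow(x, 3), 2) # (12.0 (tracked),)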
Dynamic
Why use Flux?
Advantages
- highly customizable
- makes use of Julia's advantages
- dynamic models
- write models with ease
- FUN!
Take a look!
Flux in one block
using Flux
using Flux: throttle, @epochs

# dummy data: one (input, target) pair, repeated 3 times
x = rand(784)
y = rand(10)
data = Iterators.repeated((x, y), 3)

# a 2-layer MLP
m = Chain(
  Dense(784, 32, σ),
  Dense(32, 10),
  softmax)

loss(x, y) = Flux.mse(m(x), y)
opt = ADAM(params(m))
evalcb = () -> @show(loss(x, y))

# train for 3 epochs, printing the loss at most every 10 seconds
@epochs 3 Flux.train!(loss, data, opt, cb = throttle(evalcb, 10))
Flux in one block (dynamic)
using Flux

# random binary trees with 10-element vectors at the leaves
tree() = rand() < 0.5 ? rand(10) : (tree(), tree())

shrink = Dense(20, 10)
combine(a, b) = shrink([a; b])

# the model recurses over whatever structure the data happens to have
model(x) = x
model(x::Tuple) = combine(model(x[1]), model(x[2]))

model(tree()) # sample output
Flux in the browser
From the basics
(the internals)
Flux.Tracker
- The core module for computing AD
- reverse-mode AD
take derivative
using Flux.Tracker
Tracker.gradient((a, b) -> a*b, 2, 3) # (3.0 (tracked), 2.0 (tracked))
#= equivalent =#
using Flux.Tracker: forward
y, back = forward((a, b) -> a*b, 2, 3) # (6.0 (tracked), Flux.Tracker.#9)
back(1) # (3.0 (tracked), 2.0 (tracked))
take 2nd-order derivative
using Flux.Tracker
f(x) = 3x^2 + 2x + 1
# df/dx = 6x + 2
f′(x) = Tracker.gradient(f, x)[1]
f′(2) # 14.0 (tracked)
# d²f/dx² = 6
f′′(x) = Tracker.gradient(f′, x)[1]
f′′(2) # 6.0 (tracked)
take derivative
(in place)
using Flux
using Flux.Tracker
a, b = param(2), param(3)
c = a*b # 6.0 (tracked)
Tracker.back!(c)
Tracker.grad(a), Tracker.grad(b) # (3.0, 2.0)
take derivative of matrix
W = param([1 2; 3 4])
x = param([5, 6])
y = W*x
#Tracked 2-element Array{Float64,1}:
# 17.0
# 39.0
c = sum(y)
Tracker.back!(c)
Tracker.grad(W), Tracker.grad(x) # ([5.0 6.0; 5.0 6.0], [4.0, 6.0])
Customize gradient
using Flux
using Flux: data
using Flux.Tracker
using Flux.Tracker: track, @grad, TrackedMatrix

foo(a, b) = a * b .+ 10

# intercept tracked arguments and route them through the tracker
foo(a::TrackedMatrix, b::TrackedMatrix) = Tracker.track(foo, a, b)

# define the custom pullback: here we ignore Δ and return made-up
# gradients (1, 2, 3, ... for a; 1, 4, 9, ... for b) to show that
# the gradient can be anything you like
@grad function foo(a, b)
  f = foo(data(a), data(b))
  x = similar(data(a))
  y = similar(data(b))
  for i ∈ 1:length(x)
    x[i] = i
  end
  for i ∈ 1:length(y)
    y[i] = i^2
  end
  return f, Δ -> (x, y)
end
Customize gradient
a = param([1 2; 4 5])
b = param([5 6 2; 7 8 1])
c = foo(a, b)
Flux.Tracker.back!(sum(c))
Tracker.grad(a)
#2×2 Array{Float64,2}:
# 1.0 3.0
# 2.0 4.0
Tracker.grad(b)
#2×3 Array{Float64,2}:
# 1.0 9.0 25.0
# 4.0 16.0 36.0
let's build a model!
Outline
- prepare data
- model construction
- loss function & optimizer
- training!
Prepare Data
- the batch should be along the second dimension
- utils.jl
- Flux.Data
- FluxML/model-zoo
data example
using Flux
using Flux: chunk, batch

# 1000 samples of 10 features each
xs = collect(Iterators.repeated(rand(10), 1000))
ck = chunk(xs, 50)   # split into 50 chunks of 20 samples
data = batch.(ck)    # each chunk becomes a 10×20 matrix (batch = 2nd dim)
Build a Model
1. `Chain` multiple layers
2. write a custom function/layer
model example (1)
using Flux
using Flux: @treelike, glorot_uniform

# NALU (Neural Arithmetic Logic Unit) as a custom layer
struct Nalu{S}
  W::S
  M::S
  G::S
end

function Nalu(in::Integer, out::Integer; initW = glorot_uniform)
  return Nalu(param(initW(out, in)), param(initW(out, in)), param(initW(out, in)))
end

@treelike Nalu  # registers the fields as trainable params

function (n::Nalu)(x)
  W = @. tanh(n.W) * σ(n.M)
  a = W * x
  g = σ.(n.G * x)
  m = ℯ .^ (W * log.(abs.(x) .+ 1e-7))  # note the broadcasted .+
  y = @. g * a + (1 - g) * m
  return y
end
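A usage sketch (the sizes are made up):
layer = Nalu(10, 5) # 10 inputs -> 5 outputs
layer(rand(10))     # 5-element tracked array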
model example (2)
# recursive sentiment model over parse trees
# (cf. FluxML/model-zoo's treebank example; `alphabet`, `isleaf`
# and the tree data are defined there)
using Flux: crossentropy

N = 300
embedding = param(randn(N, length(alphabet)))
W = Dense(2N, N, tanh)
combine(a, b) = W([a; b])
sentiment = Chain(Dense(N, 5), softmax)

function forward(tree)
  if isleaf(tree)
    token, sent = tree.value
    phrase = embedding * token
    phrase, crossentropy(sentiment(phrase), sent)
  else
    _, sent = tree.value
    c1, l1 = forward(tree[1])
    c2, l2 = forward(tree[2])
    phrase = combine(c1, c2)
    phrase, l1 + l2 + crossentropy(sentiment(phrase), sent)
  end
end

loss(tree) = forward(tree)[2]
Optimizer
- pass all the params to the optimizer
- `params()` collects all the params in the layers (see the sketch below)
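A minimal sketch with this Flux version's callable-optimizer API (the same style as the `ADAM(params(m))` call earlier):
using Flux

m = Chain(Dense(10, 5, relu), Dense(5, 2), softmax)

# params(m) walks the Chain and collects every tracked parameter
# (the weights and biases of both Dense layers)
ps = params(m)

# the optimizer closes over the params; calling opt() applies one
# update step after gradients have been accumulated by back!
opt = ADAM(ps)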
Training!
# this is Flux's own implementation of train!:
function train!(loss, data, opt; cb = () -> ())
  cb = runall(cb)
  opt = runall(opt)
  @progress for d in data
    l = loss(d...)
    @interrupts back!(l)
    opt()
    cb() == :stop && break
  end
end
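Since `train!` is this small, swapping in a hand-written loop is easy; a minimal sketch, reusing the `loss`, `data`, and `opt` from the earlier slides:
using Flux.Tracker: back!

for (x, y) in data
  l = loss(x, y) # forward pass records the tape
  back!(l)       # reverse pass fills in the params' gradients
  opt()          # apply one optimizer step
end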
Save model
saving & loading
julia> using Flux
julia> using BSON: @save
julia> using BSON: @load
#save
julia> model = Chain(Dense(10,5,relu),Dense(5,2),softmax)
Chain(Dense(10, 5, NNlib.relu), Dense(5, 2), NNlib.softmax)
julia> @save "mymodel.bson" model
#load
julia> @load "mymodel.bson" model
julia> model
Chain(Dense(10, 5, NNlib.relu), Dense(5, 2), NNlib.softmax)
ONNX support!
loading ONNX model
# Import the required packages.
julia> using Flux, ONNX
# If you are in some other directory, specify the entire path.
# This creates two files: model.jl and weights.bson.
julia> ONNX.load_model("model.onnx")
# Read the weights from the binary serialized file.
julia> weights = ONNX.load_weights("weights.bson")
# Load the model from the model.jl file.
julia> model = include("model.jl")
Conclusion
Flux!
- writing elegant code on top of Julia
- dynamic models that make you feel at ease
- easy & highly customizable
Q & A
Intro to Flux.jl
By Peter Cheng