He Wang
Knowledge increases by sharing but not by saving.
Based on:
[1] My tech post, "A neural network wasn't built in a day" (a story about neural networks) (2017)
[2] arXiv:1903.01998, 1909.06296, 2002.07656, 2008.03312; PRL (2020) 124, 041102
Journal Club - Oct 20, 2020
Objective:
Yes or No
A number
A sequence
What happens in a neuron? (what a single neuron can do)
Objective:
"one judge"
"your performance", "score"
"a bunch of guys' show", "scores"
ReLU
Generalize to one layer of neurons (neurons arranged in a layer)
Objective:
"10 judges"
"your performance", "score"
"a bunch of guys' show", "scores"
Fully-connected neural layers (fully-connected layers of neurons)
Objective:
In each layer:
input
output
number of neurons in this layer
Draw how one data sample flows through the network, and how the shape of the data changes (sketched below).
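A minimal sketch of that data flow: a few fully-connected layers applied to one sample, printing the shape after each layer. The input length and layer sizes are made up for illustration.

import numpy as np

rng = np.random.default_rng(0)

def dense(x, n_out):
    """One fully-connected layer: x of shape (n_in,) -> (n_out,), random weights."""
    W = rng.standard_normal((n_out, x.shape[0]))
    b = rng.standard_normal(n_out)
    return np.maximum(0.0, W @ x + b)

x = rng.standard_normal(256)       # one input sample, e.g. a strain of length 256
for n_out in (128, 64, 5):         # number of neurons in each layer (illustrative)
    x = dense(x, n_out)
    print(x.shape)                 # (128,) -> (64,) -> (5,)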
Objective:
Convolution is a specialized kind of linear operation. (A convolutional layer is a special case of a fully-connected layer.)
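To see in what sense a convolution is a constrained fully-connected layer, here is a small sketch: a 1-D "valid" convolution (cross-correlation form, as used in most NN libraries) gives the same result as multiplying the input by a sparse matrix whose rows all share the same kernel weights. The signal, kernel, and sizes are made up for illustration.

import numpy as np

x = np.arange(8, dtype=float)        # input signal
k = np.array([1.0, 0.0, -1.0])       # convolution kernel (size 3)

# direct "valid" convolution
conv = np.array([x[i:i + 3] @ k for i in range(len(x) - 2)])

# the same operation as a fully-connected layer with a structured weight matrix
W = np.zeros((len(x) - 2, len(x)))
for i in range(W.shape[0]):
    W[i, i:i + 3] = k                # shared weights, zeros elsewhere

print(np.allclose(conv, W @ x))      # True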
Objective:
Neural networks in academic papers (neural networks in the GW literature)
Phys. Rev. D 100, 063015 (2019)
Mach. Learn.: Sci. Technol. 1, 025014 (2020)
arXiv:2003.09995
Expert Systems with Applications 151, 113378 (2020)
"All of the current GW ML parameter estimation studies are still at the proof-of-principle stage" [2005.03745]
Real-time regression
Explicit posterior density
Suppose we have a posterior distribution p_true(x|y). (y is the GW data, x is the corresponding parameters)
Bayes' theorem: the terms on the right-hand side are fixed, and costly sampling is required to obtain the posterior.
Chua et al. [PRL (2020) 124, 041102] assume a multivariate normal distribution with weights output by a neural network (NN).
Gabbard et al. [1909.06296] (CVAE): several NNs parameterize the encoder and decoder distributions.
Green et al. [2002.07656] (MAF, CVAE+)
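The sketch below illustrates the general idea of an NN that outputs an explicit Gaussian posterior: the strain y is mapped to the mean and a lower-triangular scale matrix of a multivariate normal over the 5 parameters x, and training maximises the probability of the true parameters. This is only a minimal illustration under my own assumptions (layer sizes, a single Gaussian component, PyTorch), not the architecture of Chua et al.

import torch
import torch.nn as nn
import torch.nn.functional as F

N_PARAMS = 5  # dimension of x (source parameters); illustrative

class GaussianPosteriorNet(nn.Module):
    """Map a strain y (length 256) to a multivariate normal over the parameters x."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(256, 128), nn.ReLU(),
                                  nn.Linear(128, 64), nn.ReLU())
        self.mean_head = nn.Linear(64, N_PARAMS)
        self.scale_head = nn.Linear(64, N_PARAMS * N_PARAMS)

    def forward(self, y):
        h = self.body(y)
        mu = self.mean_head(h)
        raw = self.scale_head(h).reshape(-1, N_PARAMS, N_PARAMS)
        # lower-triangular scale matrix with a strictly positive diagonal
        L = torch.tril(raw, diagonal=-1) + torch.diag_embed(
            F.softplus(torch.diagonal(raw, dim1=-2, dim2=-1)) + 1e-3)
        return torch.distributions.MultivariateNormal(mu, scale_tril=L)

net = GaussianPosteriorNet()
y = torch.randn(512, 256)                 # a batch of simulated strains
x_true = torch.randn(512, N_PARAMS)       # the parameters that generated them
loss = -net(y).log_prob(x_true).mean()    # maximise the posterior probability of the truth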
CVAE (Train)
CVAE: conditional variational autoencoder; ELBO: Evidence Lower Bound
Objective: maximise L_ELBO (the variational lower bound associated with the data point X)
Training set: N = 10^6; batch size = 512
[Diagram: the strain Y (·, 256) and the params X (·, 5) feed two encoders, E1 (taking Y) and E2 (taking X and Y), which give latent distributions P and Q; a KL term compares them; z is sampled from N(μ, Σ²) and the decoder D outputs X′ (·, 5). Encoder outputs are Gaussian parameters [(μ_m1, σ_m1), ...]; shapes on the diagram: (1, 8), (2, 8), (·, 8).]
FYI: if the latent space has 2 dimensions, the data points X1, X2, …, Xn can be drawn as points in the latent space.
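A compressed sketch of the training objective above. The shapes follow the slide (Y is (·, 256), X is (·, 5), an 8-dimensional latent space, batch size 512); everything else, including the tiny hidden layers and diagonal Gaussians, is my own illustrative assumption rather than the code of Gabbard et al. The loss is the negative ELBO: a reconstruction term from the decoder plus a KL term between the encoder distributions Q (from X and Y) and P (from Y alone).

import torch
import torch.nn as nn
import torch.distributions as dist

Z_DIM = 8  # latent-space dimension, as on the slide

def gaussian_head(net, inp):
    """Split a network output into (mean, std) of a diagonal Gaussian."""
    mu, log_sigma = net(inp).chunk(2, dim=-1)
    return dist.Normal(mu, log_sigma.exp())

E1 = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 2 * Z_DIM))      # P(z | Y)
E2 = nn.Sequential(nn.Linear(256 + 5, 64), nn.ReLU(), nn.Linear(64, 2 * Z_DIM))  # Q(z | X, Y)
D  = nn.Sequential(nn.Linear(256 + Z_DIM, 64), nn.ReLU(), nn.Linear(64, 2 * 5))  # decoder -> X'

Y = torch.randn(512, 256)   # a batch of strains (batch size 512, as on the slide)
X = torch.randn(512, 5)     # the corresponding parameters

P = gaussian_head(E1, Y)
Q = gaussian_head(E2, torch.cat([X, Y], dim=-1))
z = Q.rsample()                                        # sample z from N(mu, sigma^2)
recon = gaussian_head(D, torch.cat([z, Y], dim=-1))    # distribution over X'

elbo = recon.log_prob(X).sum(-1) - dist.kl_divergence(Q, P).sum(-1)
loss = -elbo.mean()          # maximise L_ELBO  <=>  minimise this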
CVAE (Test)
[Diagram: at test time only the strain Y (·, 256) is used; E1 gives the latent distribution P, z is sampled from N(μ, Σ²), and the decoder D outputs X′ (·, 5).]
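And a matching sketch of the test stage: only the strain Y is available, so posterior samples of the parameters are obtained by sampling z from P(z|Y) and pushing it through the decoder. The networks here are hypothetical stand-ins with the same shapes as in the training sketch, not the trained model of Gabbard et al.

import torch
import torch.nn as nn
import torch.distributions as dist

Z_DIM = 8

# hypothetical "trained" networks with the same shapes as in the training sketch
E1 = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 2 * Z_DIM))      # P(z | Y)
D  = nn.Sequential(nn.Linear(256 + Z_DIM, 64), nn.ReLU(), nn.Linear(64, 2 * 5))  # decoder -> X'

def gaussian_head(net, inp):
    mu, log_sigma = net(inp).chunk(2, dim=-1)
    return dist.Normal(mu, log_sigma.exp())

y = torch.randn(1, 256)                                   # one observed strain
with torch.no_grad():
    z = gaussian_head(E1, y).sample((1000,)).squeeze(1)   # 1000 latent samples, given Y only
    x_prime = gaussian_head(D, torch.cat([z, y.expand(1000, -1)], dim=-1)).sample()
print(x_prime.shape)                                      # (1000, 5): posterior samples of the parameters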
CVAE
Drawbacks:
Mystery:
The final slide. 💪
By He Wang
Abstract: First, I will talk about some basic concepts of deep neural networks, which I hope will help clear up misunderstandings and rumours about how a neural network works. Then, based on these concepts, I will briefly review the current GW ML parameter-estimation studies (1903.01998, 1909.06296, PRL (2020) 124, 041102, 2002.07656, 2008.03312; selected), in particular how they build up a neural network to estimate the posterior distribution. The drawbacks and mysteries of these works are also mentioned.