Youth in high dimensions 2022
Mario Geiger
Postdoc at MIT with Prof. Smidt
[Figure: illustration of a neural network equivariant to rotations in 3D, mapping an input to an output]
What affects data efficiency in equivariant neural networks?
A group $G$: elements $a, b, c \in G$ and identity $e \in G$.
Few examples

A representation is a pair $(\rho, V)$ with $\rho : G \to (V \to V)$, such that $\rho(g_1 g_2) = \rho(g_1)\rho(g_2)$ and each $\rho(g)$ is linear, for $g, g_1, g_2 \in G$ and $x, y \in V$.

The Vectors (irreducible):
$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \longrightarrow R \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$

The Scalars (irreducible):
$x \longrightarrow x$

Scalar Field (reducible):
$f : \mathbb{R}^3 \to \mathbb{R}$, $f'(x) = f(R^{-1} x)$

Signal on the Sphere (reducible):
$f : S^2 \to \mathbb{R}$, $f'(x) = f(R^{-1} x)$
Scalar Field (reducible):
$f : \mathbb{R}^3 \to \mathbb{R}$, $f'(x) = f(R^{-1} x)$

[Figure: the reducible scalar field equals a weighted sum $c_1 \times (\cdots) + c_2 \times (\cdots) + \dots + c_6 \times (\cdots)$ of irreducible components]
Index | Name | Examples of quantities |
---|---|---|
L=0 | Scalars | temperature, norm of a vector, orbital s, ... |
L=1 | Vectors | velocity, force, orbital p, ... |
L=2 | | orbital d |
L=3 | | orbital f |
L=4 | | orbital g |
L=5 ... L=11 | ... | |
Stress Tensor (3×3 matrix):
$\sigma \longrightarrow R \sigma R^T$
Everything can be decomposed into irreps:
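As a concrete instance (a minimal numpy sketch, not from the slides): a 3×3 matrix transforming as $\sigma \to R \sigma R^T$ splits into a trace part ($L=0$), an antisymmetric part ($L=1$), and a symmetric traceless part ($L=2$), with dimensions $1 + 3 + 5 = 9$.

```python
import numpy as np

sigma = np.random.default_rng(0).standard_normal((3, 3))

trace_part = np.trace(sigma) / 3 * np.eye(3)          # L=0: 1 dimension
antisym_part = (sigma - sigma.T) / 2                  # L=1: 3 dimensions
sym_traceless = (sigma + sigma.T) / 2 - trace_part    # L=2: 5 dimensions

# The three parts do not mix under sigma -> R sigma R^T, and they sum back:
print(np.allclose(sigma, trace_part + antisym_part + sym_traceless))  # True
```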
$\rho_1 \otimes \rho_2$ is a representation acting on the vector space $V_1 \otimes V_2$:
$X \in \mathbb{R}^{\dim V_1 \times \dim V_2}$, $X \longrightarrow \rho_1(g) \, X \, \rho_2(g)^T$
$\left( X_{ij} \longrightarrow \rho_1(g)_{ik} \, \rho_2(g)_{jl} \, X_{kl} \right)$
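A quick numerical check of this action (a sketch; random matrices stand in for $\rho_1(g)$ and $\rho_2(g)$): acting on $X$ by $\rho_1(g) X \rho_2(g)^T$ is the same as acting on the flattened $X$ with the Kronecker product.

```python
import numpy as np

rng = np.random.default_rng(0)
rho1 = rng.standard_normal((3, 3))   # stand-in for rho1(g)
rho2 = rng.standard_normal((5, 5))   # stand-in for rho2(g)
X = rng.standard_normal((3, 5))

lhs = (rho1 @ X @ rho2.T).reshape(-1)       # X_ij -> rho1_ik rho2_jl X_kl
rhs = np.kron(rho1, rho2) @ X.reshape(-1)   # (rho1 ⊗ rho2) acting on vec(X)
print(np.allclose(lhs, rhs))                # True
```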
reducible = direct sum of irreducible
Example: $\rho_1 \otimes \rho_2 = \rho_3 \oplus \rho_4 \oplus \rho_4$

[Figure: the irreps $\rho_1, \dots, \rho_5$ of a group $G$; the tensor product of two irreps decomposes into a direct sum of irreps]
$D^L$ is the irrep of order $L$.

General formula: $D^j \otimes D^k = D^{|j-k|} \oplus \cdots \oplus D^{j+k}$
Example: $D^2 \otimes D^1 = D^1 \oplus D^2 \oplus D^3$
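A one-line sanity check of the general formula (a sketch): the dimensions on both sides must agree, since $\dim D^L = 2L + 1$.

```python
# dim(D^j) * dim(D^k) must equal the sum of dims of D^|j-k| ... D^(j+k)
def dim(L):
    return 2 * L + 1

def product_dims(j, k):
    return sum(dim(L) for L in range(abs(j - k), j + k + 1))

assert product_dims(2, 1) == dim(2) * dim(1)   # D^2 ⊗ D^1: 5*3 = 3+5+7 = 15
assert all(product_dims(j, k) == dim(j) * dim(k)
           for j in range(5) for k in range(5))
```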
Using the tools presented previously, you can create any equivariant polynomial.
Equivariant Polynomial

[Figure: schematic of an equivariant polynomial: inputs in irreps $\rho_1, \dots, \rho_4$ are combined with tensor products ($\otimes$), direct sums ($\oplus$), and learned weights $\theta$]
Group | Name | Ref |
---|---|---|
Translation | Convolutional Neural Networks | |
90-degree rotations in 2D | Group Equivariant CNN | 1602.07576 |
2D Rotations | Harmonic Networks | 1612.04642 |
2D Scale | Deep Scale-spaces | 1905.11697 |
3D Rotations | 3D Steerable CNN, Tensor Field Network | 1807.02547, 1802.08219 |
Lorentz | Lorentz Group Equivariant NN | 2006.04780 |
We wrote Python code to help create Equivariant Neural Networks:

$ pip install e3nn

import torch, e3nn
x = torch.randn(10, 3)  # e.g. a batch of 3D vectors
e3nn.o3.spherical_harmonics(2, x, True)
Spherical Harmonics are Equivariant Polynomials
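Here is a minimal equivariance check of that claim (a sketch; it assumes a recent e3nn with `o3.rand_matrix` and `Irrep.D_from_matrix`, and sign/transpose conventions may differ between versions): rotating the input is the same as applying the Wigner $D$ matrix to the output.

```python
import torch
from e3nn import o3

x = torch.randn(10, 3)
R = o3.rand_matrix()                  # a random rotation in SO(3)
D = o3.Irrep("2e").D_from_matrix(R)   # Wigner D matrix of the L=2 irrep

y1 = o3.spherical_harmonics(2, x @ R.T, normalize=True)  # rotate, then Y
y2 = o3.spherical_harmonics(2, x, normalize=True) @ D.T  # Y, then rotate
print(torch.allclose(y1, y2, atol=1e-5))                 # True
```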
(TFN: Nathaniel Thomas et al. 2018)
(Nequip: Simon Batzner et al. 2021)
[Figure: message passing on a graph: a source node with features $h$ at relative position $r$ sends the message $m = h \otimes Y(r)$ to the destination node]
* this formula is missing the parameterized radial function
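A sketch of this message in e3nn (the irreps sizes here are made up, and the parameterized radial function noted above is omitted): the tensor product of node features with the spherical harmonics of the relative position, mixed with learned weights.

```python
import torch
from e3nn import o3

irreps_h = o3.Irreps("8x0e + 8x1o")            # hypothetical node features
irreps_sh = o3.Irreps.spherical_harmonics(2)   # 1x0e + 1x1o + 1x2e
irreps_m = o3.Irreps("8x0e + 8x1o + 8x2e")     # hypothetical message irreps

tp = o3.FullyConnectedTensorProduct(irreps_h, irreps_sh, irreps_m)

h = irreps_h.randn(16, -1)    # features of 16 source nodes
r = torch.randn(16, 3)        # relative positions source -> dest.
Y = o3.spherical_harmonics(irreps_sh, r, normalize=True)
m = tp(h, Y)                  # one message per edge: m = h ⊗ Y(r)
```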
[Figure (Nequip: Simon Batzner et al. 2021): results as a function of the max $L$ of the messages]
$P$ = size of trainset
$d$ = dimension of the data
$\delta$ = distance to closest neighbor
$\epsilon$ = test error

Bach (2017); Hestness et al. (2017)
Regression of a Lipschitz continuous function: Luxburg and Bousquet (2004)
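Presumably the point of these citations is the classic curse-of-dimensionality scaling, e.g. that the nearest-neighbor distance shrinks only as $\delta \sim P^{-1/d}$. A small numerical illustration of that (a sketch, not from the slides):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
for d in (2, 8):
    for P in (1_000, 8_000):
        x = rng.random((P, d))              # P uniform points in [0, 1]^d
        dist, _ = cKDTree(x).query(x, k=2)  # k=2: first column is the point itself
        print(f"d={d} P={P}: mean delta = {dist[:, 1].mean():.3f}")
# delta shrinks ~ P^(-1/d): fast in low d, barely at all in high d
```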
(MACE: Ilyes Batatia et al. 2022)

[Figure: source nodes $1, 2, \dots, \nu$ with features $h_1, h_2, \dots, h_\nu$ at relative positions $r_1, r_2, \dots, r_\nu$ all contribute to a single message $m$ to the destination node]

$m = F_\theta\left( \{ h_i \otimes Y(r_i) \}_{i=1}^{\nu} \right)$
Special cases of $m = F_\theta\left( \{ h_i \otimes Y(r_i) \}_{i=1}^{\nu} \right)$ (MACE: Ilyes Batatia et al. 2022):

any $L$ and $\nu = 1$: $h \otimes Y(r)$
$L = 0$ and $\nu = 2$: $h_1 Y(r_1) \cdot h_2 Y(r_2)$ (Legendre polynomials)
$L = 0$ and $\nu = 3$: $\left( h_1 Y(r_1) \otimes h_2 Y(r_2) \right) \cdot h_3 Y(r_3)$
any $L$ and $\nu = 3$: $h_1 \otimes Y(r_1) \otimes h_2 \otimes Y(r_2) \otimes h_3 \otimes Y(r_3)$
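As an illustration of the $L = 0$, $\nu = 2$ case, a small e3nn sketch (assuming the same API as above): the dot product of two spherical-harmonic embeddings is invariant under a joint rotation, and by the addition theorem it is a Legendre polynomial of the angle between $r_1$ and $r_2$.

```python
import torch
from e3nn import o3

r1, r2 = torch.randn(3), torch.randn(3)
y1 = o3.spherical_harmonics(2, r1, normalize=True)
y2 = o3.spherical_harmonics(2, r2, normalize=True)
inv = y1.dot(y2)      # depends only on the angle between r1 and r2

R = o3.rand_matrix()  # rotate both positions together
y1r = o3.spherical_harmonics(2, r1 @ R.T, normalize=True)
y2r = o3.spherical_harmonics(2, r2 @ R.T, normalize=True)
print(torch.allclose(inv, y1r.dot(y2r), atol=1e-5))  # True: it is invariant
```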
Equivariant Neural Networks are more data efficient if they incorporate Tensor Products of order $L \geq 1$, but not necessarily as features (MACE).
The slides are available at
https://slides.com/mariogeiger/youth2022