CoLight: Learning Network-level Cooperation for Traffic Signal Control

ABSTRACT

Cooperation among the traffic signals enables vehicles to move through intersections more quickly. Conventional transportation approaches implement cooperation by pre-calculating the offsets between two intersections. Such pre-calculated offsets are not suitable for dynamic traffic environments.
To enable cooperation among traffic signals, we propose CoLight, a model that uses graph attentional networks to facilitate communication between intersections. Specifically, for a target intersection in a network, CoLight not only incorporates the temporal and spatial influences of neighboring intersections on the target intersection, but also models those neighbors in an index-free way. To the best of our knowledge, we are the first to use graph attentional networks in a reinforcement learning setting for traffic signal control and to conduct experiments on a large-scale road network with hundreds of traffic signals. In experiments, we demonstrate that, by learning this communication, the proposed model outperforms state-of-the-art methods.

KEY CONTRIBUTIONS

Cooperation through dynamic communication

Index-free model learning with parameter sharing

Experiments on a large-scale road network

METHOD

o_1^t, o_2^t, o_3^t, ..., o_n^t

Embed(o^t_i)=ReLU(o^t_iW_e+b_e)
h_1^t, h_2^t, h_3^t, ..., h_n^t
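The embedding step above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the toy weights `W_e`, bias `b_e`, and the three-feature observations are made up, and the function name `embed` is hypothetical rather than taken from the CoLight code.

```python
def embed(obs, W_e, b_e):
    """Compute ReLU(o W_e + b_e) for one observation, treated as a row vector."""
    pre_activation = [
        sum(obs[r] * W_e[r][c] for r in range(len(obs))) + b_e[c]
        for c in range(len(b_e))
    ]
    return [max(0.0, x) for x in pre_activation]


# Toy values (assumed for illustration): 3 observation features, 2 hidden units.
W_e = [[1.0, -1.0],
       [0.5, 0.0],
       [0.0, 2.0]]
b_e = [0.0, -1.0]

# One observation o_i^t per intersection; the same W_e and b_e are reused for all.
observations = [[1.0, 2.0, 0.0],
                [0.0, 0.0, 1.0]]
hidden = [embed(o, W_e, b_e) for o in observations]
print(hidden)  # [[2.0, 0.0], [0.0, 1.0]]
```

Reusing one `W_e` and `b_e` for every intersection is what lets the list comprehension run over any number of observations without per-intersection parameters.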

Get Q, K, V

Q(h^t_i)=h^t_iW_q
K(h^t_i)=h^t_iW_k
V(h^t_i)=h^t_iW_v
q_1^t, q_2^t, q_3^t, ..., q_n^t
k_1^t, k_2^t, k_3^t, ..., k_n^t
v_1^t, v_2^t, v_3^t, ..., v_n^t
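As a rough sketch of the three projections, the snippet below multiplies one hidden vector by three separate weight matrices. The helper names `row_times_matrix` and `project_qkv` and the toy matrices are assumptions chosen so the result is easy to check by hand.

```python
def row_times_matrix(x, W):
    """Multiply row vector x by matrix W (given as a list of rows)."""
    return [sum(x[r] * W[r][c] for r in range(len(x))) for c in range(len(W[0]))]


def project_qkv(h, W_q, W_k, W_v):
    """Return the query, key, and value vectors for one hidden state h."""
    return row_times_matrix(h, W_q), row_times_matrix(h, W_k), row_times_matrix(h, W_v)


# Toy weights (assumed): identity, coordinate swap, and doubling.
W_q = [[1.0, 0.0], [0.0, 1.0]]
W_k = [[0.0, 1.0], [1.0, 0.0]]
W_v = [[2.0, 0.0], [0.0, 2.0]]

h = [1.0, 2.0]
q, k, v = project_qkv(h, W_q, W_k, W_v)
print(q, k, v)  # [1.0, 2.0] [2.0, 1.0] [2.0, 4.0]
```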

e_{11}^t, e_{12}^t, e_{13}^t, ..., e_{1n}^t
e_{21}^t, e_{22}^t, e_{23}^t, ..., e_{2n}^t
e_{31}^t, e_{32}^t, e_{33}^t, ..., e_{3n}^t
e_{n1}^t, e_{n2}^t, e_{n3}^t, ..., e_{nn}^t

e_{11}^{t'}, e_{12}^{t'}, e_{13}^{t'}, ..., e_{1n}^{t'}
e_{21}^{t'}, e_{22}^{t'}, e_{23}^{t'}, ..., e_{2n}^{t'}
e_{31}^{t'}, e_{32}^{t'}, e_{33}^{t'}, ..., e_{3n}^{t'}
e_{n1}^{t'}, e_{n2}^{t'}, e_{n3}^{t'}, ..., e_{nn}^{t'}
e^{t'}_{ij}=e^{t}_{ij}/τ, τ=\sqrt{d_k}
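The scaling by τ can be illustrated with a small score computation. The slides do not spell out how e_{ij} is formed from the queries and keys, so this sketch assumes the usual dot product q_i · k_j; the function name `scaled_scores` and the toy vectors are likewise assumptions.

```python
import math


def scaled_scores(Q, K):
    """Return the matrix e'[i][j] = (q_i . k_j) / sqrt(d_k).

    The dot-product form of the raw score is an assumption; only the
    division by tau = sqrt(d_k) is stated on the slide.
    """
    d_k = len(K[0])
    tau = math.sqrt(d_k)
    return [[sum(a * b for a, b in zip(q, k)) / tau for k in K] for q in Q]


# Toy queries and keys with d_k = 4, so tau = 2.
Q = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 2.0, 0.0, 0.0]]
K = [[2.0, 0.0, 0.0, 0.0],
     [0.0, 4.0, 0.0, 0.0]]
E = scaled_scores(Q, K)
print(E)  # [[1.0, 0.0], [0.0, 4.0]]
```

Dividing by sqrt(d_k) keeps the scores from growing with the key dimension, which would otherwise push the later softmax toward near one-hot weights.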

Mask(e^{t'}_{ij}) = -\infty for j \notin N_i
exp(-\infty)=0
a^{t'}_{ij}=softmax(e^{t'}_{ij})=\frac{exp(e^{t'}_{ij})}{\sum_{j'\in N_i}exp(e^{t'}_{ij'})}
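A minimal sketch of the masking trick: scores for non-neighbors are set to negative infinity, so their exponentials are exactly zero and they drop out of the softmax. The function name `masked_softmax` is hypothetical, and the sketch assumes the neighbor set is non-empty (for example because it contains the intersection itself).

```python
import math


def masked_softmax(scores, neighbors):
    """Softmax over one row of scores, restricted to the indices in `neighbors`.

    Assumes `neighbors` is non-empty; otherwise every score would be masked.
    """
    masked = [s if j in neighbors else -math.inf for j, s in enumerate(scores)]
    top = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [x / total for x in exps]


# Intersection with neighbors {0, 1}; index 2 is masked despite its large score.
weights = masked_softmax([1.0, 1.0, 5.0], neighbors={0, 1})
print(weights)  # [0.5, 0.5, 0.0]
```

Masking before the softmax, rather than zeroing weights afterwards, keeps the remaining weights summing to one without a separate renormalization step.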

a_{11}^{t'}, a_{12}^{t'}, a_{13}^{t'}, ..., a_{1n}^{t'}
a_{21}^{t'}, a_{22}^{t'}, a_{23}^{t'}, ..., a_{2n}^{t'}
a_{31}^{t'}, a_{32}^{t'}, a_{33}^{t'}, ..., a_{3n}^{t'}
a_{n1}^{t'}, a_{n2}^{t'}, a_{n3}^{t'}, ..., a_{nn}^{t'}

For intersection 1 (query q_1^t against keys k_1^t, k_2^t, k_3^t, ..., k_n^t):
a_{11}^{t'}, a_{21}^{t'}, a_{31}^{t'}, ..., a_{n1}^{t'}
v_1^t, v_2^t, v_3^t, ..., v_n^t
v_j^{t'} = a_{j1}^{t'} v_j^t
V_{s1} = \sum_{j=1}^{n} v_j^{t'}
h_{s1}=WV_{s1}+b
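The aggregation for a single intersection can be sketched as a weighted sum of value vectors followed by a linear layer. The function names `aggregate` and `linear`, along with the toy weights and values, are assumptions for illustration only.

```python
def aggregate(attention, values):
    """Weighted sum of value vectors: V_s = sum_j attention[j] * values[j]."""
    dim = len(values[0])
    return [sum(a * v[d] for a, v in zip(attention, values)) for d in range(dim)]


def linear(x, W, b):
    """Compute W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


# Toy attention weights for one intersection; the third neighbor has weight 0.
attention = [0.5, 0.5, 0.0]
values = [[2.0, 0.0],
          [0.0, 4.0],
          [100.0, 100.0]]  # ignored because its attention weight is 0

V_s = aggregate(attention, values)
h_s = linear(V_s, W=[[1.0, 1.0], [1.0, -1.0]], b=[0.5, 0.0])
print(V_s, h_s)  # [1.0, 2.0] [3.5, -1.0]
```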

H=h_{s1}, h_{s2}, h_{s3}, ..., h_{sn}

[Figure: four CLF blocks applied to h_{s1}, ..., h_{sn}, with example outputs 0, 2, 3, 1]

Action

\tilde{q}(o^t_i) = h_{si}W_q+b_q
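To illustrate the output layer, the sketch below maps one aggregated representation to a vector of Q-values and picks the highest-valued action. Two things here are assumptions: the greedy argmax selection, which the slide does not state, and the names `W_out` and `b_out`, used in place of W_q and b_q only to avoid confusion with the query projection.

```python
def q_values(h_s, W_out, b_out):
    """Compute h_s W_out + b_out, giving one Q-value per candidate action."""
    return [
        sum(h_s[r] * W_out[r][c] for r in range(len(h_s))) + b_out[c]
        for c in range(len(b_out))
    ]


def greedy_action(qs):
    """Return the index of the largest Q-value (greedy choice, assumed here)."""
    return max(range(len(qs)), key=lambda a: qs[a])


# Toy head (assumed): 2 hidden units mapped to 4 candidate actions.
W_out = [[1.0, 0.0, 0.0, -1.0],
         [0.0, 1.0, 2.0, 0.0]]
b_out = [0.0, 0.0, 0.5, 0.0]

qs = q_values([1.0, 2.0], W_out, b_out)
action = greedy_action(qs)
print(qs, action)  # [1.0, 2.0, 4.5, -1.0] 2
```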

PERFORMANCE
