Architecturally Variant Artificial Networks

Universidade Estadual de Campinas

 

Lucas Oliveira David - ld492@drexel.edu

Introduction

Artificial Networks

Dense Layers

y \colon \mathbb{R}^m \to \mathbb{R}^n
y = \sigma(\bold X \cdot \bold W + \bold b)
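A minimal, dependency-free sketch of the dense layer above may make the shapes concrete. It assumes the logistic sigmoid for \sigma (the slide does not say which activation is meant) and stores each sample as a row of X; the names `dense` and `sigmoid` are illustrative only.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, assumed here as the activation sigma."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(X, W, b, activation=sigmoid):
    """Apply y = activation(X . W + b) to every row of X.

    X: list of samples, each a list of m floats.
    W: m x n weight matrix (list of m rows with n floats each).
    b: list of n biases.
    Returns one list of n activations per sample.
    """
    n = len(b)
    return [
        [
            activation(sum(x[k] * W[k][j] for k in range(len(x))) + b[j])
            for j in range(n)
        ]
        for x in X
    ]


# Two samples in R^3 mapped to R^2 (m = 3, n = 2).
X = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
W = [[0.1, -0.2], [0.0, 0.3], [-0.1, 0.1]]
b = [0.0, 0.0]
Y = dense(X, W, b)
```

The all-zero sample maps to 0.5 in both outputs, because the pre-activation is zero and the bias is zero.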

...

...

Artificial Networks

Convolutional Layers (for images)

y \colon \mathbb{R}^{m\times n} \to \mathbb{R}^{o \times p}
(\bold I \ast \bold K)(i, j) = \sum_{k=0}^h \sum_{l=0}^w \bold I (i - k, j - l)\bold K(k, l)
y_{i,j} = (\bold I \ast \bold K)(i, j)
\forall i, j\text{ s.t. }0 \le i < m, 0 \le j < n
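The double sum above can be spelled out as four nested loops. This is only a sketch: it assumes that entries of I outside the image count as zero, so the output keeps the m x n size of the input, and it makes no attempt at efficiency. The function name `conv2d` is illustrative and is not the Keras layer of the same name.

```python
def conv2d(I, K):
    """Discrete 2D convolution y[i][j] = sum_k sum_l I[i-k][j-l] * K[k][l].

    I: m x n image (list of lists of numbers).
    K: kernel (list of lists of numbers).
    Out-of-range image entries are treated as zero (an assumption).
    """
    m, n = len(I), len(I[0])
    kh, kw = len(K), len(K[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            total = 0.0
            for k in range(kh):
                for l in range(kw):
                    r, c = i - k, j - l
                    if 0 <= r < m and 0 <= c < n:
                        total += I[r][c] * K[k][l]
            out[i][j] = total
    return out


image = [[1, 2], [3, 4]]
kernel = [[1, 0], [0, -1]]
result = conv2d(image, kernel)
```

With this kernel each output subtracts the upper-left diagonal neighbour from the current pixel, so only the bottom-right entry differs from the input.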

Figure 3.3: Diagram of the connections held by one unit of a convolutional layer. Figure extracted from Michael A.
Nielsen, “Neural Networks and Deep Learning,” Determination Press, 2015. Available at:
neuralnetworksanddeeplearning.com/chap6. License: CC BY-NC 3.0.

Artificial Networks

Convolutional Layers (for images)

Example

...

Artificial Networks

Current Architectures

“GoogLeNet”, from “Going Deeper with Convolutions”. Fair use. Available at: cs.unc.edu/~wliu/papers/GoogLeNet.pdf

Artificial Intelligence

Artificial Intelligence

World

Representational State

Intelligent Agent

Perceive

Act

Artificial Intelligence

Example: Shortest Route Problem

World: the road map of Romania, represented as a graph.

Agent: What's the shortest route from Arad to Bucharest?

[Arad]

[Arad, Zerind]

[Arad, Timisoara]

[Arad, Sibiu]

...

Artificial Intelligence

Example: Shortest Route Problem

Arad

Zerind

Oradea

Sibiu

Rimnicu Vilcea

Pitesti

Bucharest

Greedy Best-First Search

Arad

Sibiu

Rimnicu Vilcea

Pitesti

Bucharest

A* Search
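As a sketch of how A* could reach the route Arad, Sibiu, Rimnicu Vilcea, Pitesti, Bucharest, the snippet below runs the search on a fragment of the Romanian road map. The road lengths and straight-line distances to Bucharest are not given on the slides; they are recalled from the standard textbook version of this example and should be treated as assumptions, as should the names `ROADS`, `STRAIGHT_LINE` and `a_star`.

```python
import heapq

# Assumed road lengths (km) for a fragment of the map; not from the slides.
ROADS = [
    ("Arad", "Zerind", 75), ("Arad", "Sibiu", 140), ("Arad", "Timisoara", 118),
    ("Zerind", "Oradea", 71), ("Oradea", "Sibiu", 151),
    ("Sibiu", "Fagaras", 99), ("Sibiu", "Rimnicu Vilcea", 80),
    ("Rimnicu Vilcea", "Pitesti", 97), ("Rimnicu Vilcea", "Craiova", 146),
    ("Pitesti", "Craiova", 138), ("Pitesti", "Bucharest", 101),
    ("Fagaras", "Bucharest", 211),
]

# Assumed straight-line distances to Bucharest, used as the heuristic h.
STRAIGHT_LINE = {
    "Arad": 366, "Zerind": 374, "Oradea": 380, "Sibiu": 253,
    "Timisoara": 329, "Fagaras": 176, "Rimnicu Vilcea": 193,
    "Pitesti": 100, "Craiova": 160, "Bucharest": 0,
}


def build_graph(roads):
    """Turn the undirected road list into an adjacency dictionary."""
    graph = {}
    for a, b, cost in roads:
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def a_star(graph, h, start, goal):
    """Expand the frontier node with the smallest f = g + h."""
    frontier = [(h[start], 0, start, [start])]
    best_g = {start: 0}
    while frontier:
        _, g, city, path = heapq.heappop(frontier)
        if city == goal:
            return path, g
        if g > best_g.get(city, float("inf")):
            continue  # stale entry: a cheaper way to this city was found
        for neighbor, cost in graph[city].items():
            new_g = g + cost
            if new_g < best_g.get(neighbor, float("inf")):
                best_g[neighbor] = new_g
                heapq.heappush(
                    frontier,
                    (new_g + h[neighbor], new_g, neighbor, path + [neighbor]),
                )
    return None, float("inf")


path, cost = a_star(build_graph(ROADS), STRAIGHT_LINE, "Arad", "Bucharest")
```

Under these assumed distances the search returns the five-city route listed above with a total cost of 418, cheaper than the 450 of the alternative through Fagaras.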

Searching Architectures

Searching Architectures

  • Find an architecture that satisfactorily solves the problem
  • Simplify architectures whenever possible

Searching Architectures

Representational State

Intelligent Agent

Perceive

Consider

Act

Searching Architectures

Utility of a State

u(s) = -h(s)
h(s) = \lambda_1 \sum_{i=1}^{|\text{layers}|} \frac{\text{units}_i - \text{min-units}}{\text{max-units} - \text{min-units}} +
\lambda_2 \frac{|\text{layers}| - \text{min-layers}}{\text{max-layers} - \text{min-layers}} +
\lambda_3 \text{train-loss} +
\lambda_4 \text{validation-loss}
\sum_{i=1}^4 \lambda_i = 1
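To make the weighting concrete, the utility can be sketched as a small function. The bounds, losses and weights used in the example call are made-up values chosen for easy arithmetic, not numbers from the experiments, and the names `heuristic` and `utility` are illustrative.

```python
def normalized(value, low, high):
    """Rescale value from [low, high] to [0, 1]."""
    return (value - low) / (high - low)


def heuristic(units, train_loss, validation_loss, lambdas,
              min_units, max_units, min_layers, max_layers):
    """h(s): weighted sum of layer sizes, depth and both losses.

    units: number of units (or kernels) in each layer of the architecture.
    lambdas: the four weights (lambda_1 .. lambda_4), which must sum to 1.
    """
    if abs(sum(lambdas) - 1.0) > 1e-9:
        raise ValueError("the lambda weights must sum to 1")
    l1, l2, l3, l4 = lambdas
    units_term = sum(normalized(u, min_units, max_units) for u in units)
    layers_term = normalized(len(units), min_layers, max_layers)
    return (l1 * units_term + l2 * layers_term
            + l3 * train_loss + l4 * validation_loss)


def utility(*args, **kwargs):
    """u(s) = -h(s): smaller and more accurate networks score higher."""
    return -heuristic(*args, **kwargs)


# Hypothetical two-layer architecture with equal weights.
h_value = heuristic(
    units=[51, 101], train_loss=0.4, validation_loss=0.6,
    lambdas=(0.25, 0.25, 0.25, 0.25),
    min_units=1, max_units=201, min_layers=1, max_layers=11,
)
```

Here the units term is 0.25 + 0.5 = 0.75 and the layers term is 0.1, so h(s) = 0.25 * (0.75 + 0.1 + 0.4 + 0.6) = 0.4625.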

Searching Architectures

Plausible Actions towards Finding Answers

  • Random
  • Cross-over
  • Mutate
  • Reduce conv2d layers
  • Reduce dense layers
  • Reduce kernels in the last conv2d layer
  • Reduce units in the last dense layer
  • Increase conv2d layers
  • Increase dense layers
  • Increase kernels in the last conv2d layer
  • Increase units in the last dense layer
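Several of the actions above can be sketched as pure functions over an architecture written as a list of (kind, size, activation) tuples, mirroring the notation of the experiment logs. This is an illustration under assumptions rather than the implementation used in the experiments: the final entry is taken to be the output layer and is never modified, convolutional layers are assumed to come before dense ones, and the helper names are hypothetical.

```python
def last_hidden_index(arch, kind):
    """Index of the last layer of the given kind, ignoring the output layer."""
    for i in range(len(arch) - 2, -1, -1):
        if arch[i][0] == kind:
            return i
    return None


def resize_last(arch, kind, delta):
    """Increase or reduce the units/kernels of the last hidden layer of a kind.

    Returns a new architecture, or None when the action does not apply.
    """
    i = last_hidden_index(arch, kind)
    if i is None:
        return None
    name, size, activation = arch[i]
    resized = (name, max(1, size + delta), activation)
    return arch[:i] + [resized] + arch[i + 1:]


def remove_last(arch, kind):
    """Reduce the number of layers of a kind by dropping the last hidden one."""
    i = last_hidden_index(arch, kind)
    if i is None:
        return None
    return arch[:i] + arch[i + 1:]


def add_layer(arch, kind, size, activation="relu"):
    """Increase the number of layers of a kind, keeping conv before dense."""
    if kind == "Conv2D":
        position = sum(1 for layer in arch[:-1] if layer[0] == "Conv2D")
    else:
        position = len(arch) - 1  # just before the output layer
    return arch[:position] + [(kind, size, activation)] + arch[position:]


start = [("Conv2D", 47, "relu"), ("Dense", 10, "softmax")]
wider = add_layer(start, "Dense", 40)
narrower = resize_last(wider, "Dense", -5)
```

Returning a new list instead of mutating the input keeps every visited state intact, which is convenient when a search needs to compare a state with its successors.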

Experiments

Digits

  • Classification between 10 labels
  • Evolutionary search (genetic algorithm)
architecture {[
  (Conv2D, 47, 'relu'),
  (Dense, 10, 'softmax')
]}:
|-loss: 11.782062
|-validation loss: 3.841851

...
architecture {[
  (Conv2D, 47, 'relu'),
  (Dense, 40, 'relu'),
  (Dense, 10, 'softmax')
]}:
|-loss: 7.403219
|-validation loss: 2.209876

Cifar-10

  • Classification between 10 labels
  • Hill-Climbing
architecture {[
  (Conv2D, 47, 'relu'),
  (Conv2D, 54, 'relu'),
  (Conv2D, 104, 'relu'),
  (Conv2D, 200, 'relu'),
  (Dense, 75, 'relu'),
  (Dense, 10, 'softmax')
]}:
|-loss: 0.429305
|-validation loss: 2.174616

...
architecture {[
  (Conv2D, 39, 'relu'),
  (Conv2D, 127, 'relu'),
  (Conv2D, 136, 'relu'),
  (Dense, 79, 'relu'),
  (Dense, 10, 'softmax')
]}:
|-loss: 0.422453
|-validation loss: 2.145460
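The hill-climbing strategy named for the Cifar-10 experiment can be sketched generically. In the sketch below a toy utility over (units, layers) pairs stands in for training and evaluating a network, so it shows only the control flow; the state encoding, the neighbourhood and the target values are all assumptions.

```python
def hill_climb(start, neighbors, utility, max_steps=1000):
    """Steepest-ascent hill climbing: move to the best neighbour until none improves."""
    current = start
    for _ in range(max_steps):
        best = max(neighbors(current), key=utility, default=None)
        if best is None or utility(best) <= utility(current):
            return current  # local maximum reached
        current = best
    return current


def toy_neighbors(state):
    """Increase or reduce units or layers by one, keeping both at least 1."""
    units, layers = state
    candidates = [
        (units + 1, layers), (units - 1, layers),
        (units, layers + 1), (units, layers - 1),
    ]
    return [(u, l) for u, l in candidates if u >= 1 and l >= 1]


def toy_utility(state):
    """Stand-in for u(s): peaks at a hypothetical best of 40 units, 3 layers."""
    units, layers = state
    return -((units - 40) ** 2 + (layers - 3) ** 2)


best_state = hill_climb((47, 1), toy_neighbors, toy_utility)
```

Because the toy utility has a single peak, the climb ends at (40, 3); with a real loss surface the same loop can stop at a local maximum instead.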

Thank You

Neural Networks II

By Lucas David
