JARN: a Jacobian-based defence against adversarial examples

Alvin Chan

Adversarial Attacks

[Figure: adversarial example; labels: stop sign, 90 km/h]

Deep learning models remain vulnerable to adversarial attacks despite new defences
Adversarial perturbations can be imperceptible to humans

Misclassification in image recognition
- Face recognition
- Object detection
- Image segmentation
Reinforcement learning

An attack optimizes two components:
- Distance between the clean and adversarial input (kept small)
- Label prediction of the image (pushed away from the correct label)
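
The two components can be illustrated with a minimal sketch of an L-infinity projected gradient descent (PGD) attack. Everything here is an assumption for illustration, not the method from the slides: the toy logistic-regression model, the helper names (`predict_proba`, `loss_gradient`, `pgd_attack`), and the values of `eps`, `step`, and `iters`.

```python
import math

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def predict_proba(w, b, x):
    """Probability of class 1 under a toy logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def loss_gradient(w, b, x, y):
    """Gradient of the cross-entropy loss with respect to the input x."""
    p = predict_proba(w, b, x)
    return [(p - y) * wi for wi in w]

def pgd_attack(w, b, x, y, eps=0.3, step=0.1, iters=7):
    """Raise the label loss while staying within eps of the clean input."""
    x_adv = list(x)
    for _ in range(iters):
        grad = loss_gradient(w, b, x_adv, y)
        # Label component: step in the direction that increases the loss.
        x_adv = [xa + step * sign(g) for xa, g in zip(x_adv, grad)]
        # Distance component: project back into the eps-ball around x.
        x_adv = [min(max(xa, xc - eps), xc + eps) for xa, xc in zip(x_adv, x)]
    return x_adv

# Hypothetical model weights and input, chosen only for illustration.
w, b = [2.0, -1.0], 0.0
x, y = [0.2, 0.1], 1
x_adv = pgd_attack(w, b, x, y)
print(predict_proba(w, b, x), predict_proba(w, b, x_adv))
```

The projection step enforces the distance constraint, and the signed gradient step drives the prediction away from the correct label.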

White-box: Access to the target model's architecture & parameters
Black-box: Access only to the target model's predictions
- Transfer attacks from a single substitute model or an ensemble of substitute models
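
A transfer attack can be sketched as follows: the adversarial input is crafted with gradients from a substitute model only, then submitted to the target, which is queried purely for its prediction. Both linear models, their weights, and the helper names are hypothetical, and a single signed-gradient step (FGSM) stands in for whatever attack is actually used.

```python
import math

def predict(w, b, x):
    """Hard label (0 or 1) from a toy linear classifier."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def fgsm_on_substitute(w_sub, b_sub, x, y, eps):
    """One signed-gradient step using only the substitute's weights."""
    z = sum(wi * xi for wi, xi in zip(w_sub, x)) + b_sub
    p = 1.0 / (1.0 + math.exp(-z))
    grad = [(p - y) * wi for wi in w_sub]  # loss gradient w.r.t. the input
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]

# Hypothetical setup: the attacker knows the substitute's weights but can
# only query the target model for its predictions.
w_target, b_target = [2.0, -1.0], 0.0
w_sub, b_sub = [1.5, -0.5], 0.1

x, y = [0.2, 0.1], 1
x_adv = fgsm_on_substitute(w_sub, b_sub, x, y, eps=0.3)

# The attack transfers if the target misclassifies the crafted input.
transferred = predict(w_target, b_target, x_adv) != y
print(transferred)
```

For the ensemble variant, one plausible approach is to average the input gradients of several substitutes before taking the sign.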

[Figure: panels labelled Image, Standard, PGD7]

[Figure: labels Real, Fake]

https://slides.com/alvinchan/jarn_dso