JARN: Jacobian Adversarially Regularized Networks

A defence against adversarial examples

Alvin Chan

Outline

  • Introduction
  • Target Domains
  • Attacks 
  • Defenses
  • Challenges & Discussion

Adversarial Attacks

[Figure: a stop sign adversarially perturbed so that a model reads it as a 90 km/h speed-limit sign]

Introduction

  • Deep Learning models are still vulnerable to adversarial attacks despite new defenses
  • Adversarial attacks can be imperceptible to humans

Computer Vision

  • Misclassification in image recognition

    • Face recognition

    • Object detection

    • Image segmentation

  • Reinforcement learning

Gradient-based Attacks

  • Mostly used in the Computer Vision domain

  • Use the gradient of the target model to directly perturb pixel values
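As a toy illustration of the idea, the sketch below applies a one-step FGSM-style perturbation to a hypothetical logistic-regression "model". The weights, input, and epsilon are made-up values, not anything from the slides.

```python
import numpy as np

# Toy logistic-regression "model" (illustrative weights, not from any paper)
w = np.array([1.0, -2.0, 0.5, 1.5])
b = 0.1

def fgsm(x, y, eps):
    """One gradient-sign step: perturb x in the direction that
    increases the model's cross-entropy loss on label y."""
    p = 1 / (1 + np.exp(-(w @ x + b)))   # predicted probability of class 1
    grad_x = (p - y) * w                 # d(cross-entropy)/dx, closed form here
    return x + eps * np.sign(grad_x)     # epsilon-bounded pixel-wise step

x = np.array([0.2, -0.1, 0.4, 0.3])      # "clean" input
y = 1.0                                  # true label
x_adv = fgsm(x, y, eps=0.1)              # adversarial input
```

Each component of the input moves by at most eps, so the perturbation stays inside an L-infinity ball around the clean input while the model's loss goes up.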

  • Optimizing two components:

    • Distance between the clean and adversarial input
    • Label prediction of image
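One common way to combine the two terms is as a single penalized objective. A hedged sketch on the same kind of toy logistic model (the penalty weight c, learning rate, and step count are arbitrary assumptions):

```python
import numpy as np

# Toy logistic model (illustrative, as before)
w = np.array([1.0, -2.0, 0.5, 1.5])
b = 0.1
x = np.array([0.2, -0.1, 0.4, 0.3])
y = 1.0

# Minimize ||delta||^2 - c * loss(x + delta, y):
# keep the perturbation small while pushing the prediction off label y.
c, lr = 5.0, 0.05
delta = np.zeros_like(x)
for _ in range(100):
    p = 1 / (1 + np.exp(-(w @ (x + delta) + b)))
    grad = 2 * delta - c * (p - y) * w   # gradient of the combined objective
    delta -= lr * grad
x_adv = x + delta
```

The coefficient c trades off the two components: larger c favors misclassification over staying close to the clean input.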

  • White-box: Access to architecture & hyperparameters

  • Black-box: Access only to the target model’s predictions

    • Transfer attacks crafted on a single substitute model or an ensemble of substitutes
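A sketch of the transfer setting on two toy logistic models. The "target" here is just the substitute plus noise — an illustrative assumption standing in for an unknown black-box model:

```python
import numpy as np

# Substitute model the attacker fully controls (white-box)
w_sub = np.array([1.0, -2.0, 0.5, 1.5]); b_sub = 0.1
# Hypothetical black-box target: a similar decision rule, unknown to the attacker
rng = np.random.default_rng(1)
w_tgt = w_sub + 0.1 * rng.normal(size=4); b_tgt = 0.1

def fgsm(x, y, w, b, eps):
    """One-step gradient-sign perturbation against the given model."""
    p = 1 / (1 + np.exp(-(w @ x + b)))
    return x + eps * np.sign((p - y) * w)

x = np.array([0.2, -0.1, 0.4, 0.3]); y = 1.0
x_adv = fgsm(x, y, w_sub, b_sub, eps=0.3)   # crafted on the substitute only
# Only the target's predictions are queried, never its gradients:
p_clean = 1 / (1 + np.exp(-(w_tgt @ x + b_tgt)))
p_adv = 1 / (1 + np.exp(-(w_tgt @ x_adv + b_tgt)))
```

Because the target's decision rule resembles the substitute's, the example crafted white-box on the substitute also lowers the target's confidence in the true label.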

Adversarial Training

  • Training on adversarial examples
  • The attacks used during training affect the defense’s effectiveness
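A minimal sketch of an adversarial-training loop on a toy 2-D logistic model: each epoch first crafts FGSM examples against the current parameters, then takes the training step on those. The dataset, epsilon, and learning rate are all illustrative assumptions.

```python
import numpy as np

# Toy two-cluster dataset (illustrative)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 0.2, size=(50, 2)),   # class 0
               rng.normal(+1.0, 0.2, size=(50, 2))])  # class 1
y = np.array([0.0] * 50 + [1.0] * 50)

w, b = np.zeros(2), 0.0
eps, lr = 0.1, 0.5
for _ in range(200):
    # 1) craft FGSM examples against the *current* model
    p = 1 / (1 + np.exp(-(X @ w + b)))
    X_adv = X + eps * np.sign((p - y)[:, None] * w)
    # 2) take the gradient step on the adversarial examples
    p_adv = 1 / (1 + np.exp(-(X_adv @ w + b)))
    w -= lr * X_adv.T @ (p_adv - y) / len(y)
    b -= lr * np.mean(p_adv - y)
```

The slide's point shows up here as the choice of the inner attack: swapping the one-step FGSM for a stronger multi-step attack such as PGD changes how robust the trained model ends up.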

Robustness & Saliency

[Figure: input-gradient saliency maps — the original image, a standard-trained model, and a PGD7 adversarially trained model]

Jacobian Adversarially Regularized Networks

[Figure: a GAN-style discriminator labels inputs as Real (natural images) or Fake (the model’s input Jacobians)]
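Going by the deck's title and the Real/Fake discriminator figure, JARN trains the classifier so that its input Jacobian looks image-like to a discriminator, alongside the usual classification loss. A heavily hedged toy sketch of that objective — the models, shapes, and weight lam below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

# Toy logistic "classifier" and "discriminator" (illustrative weights)
w = np.array([1.0, -2.0, 0.5, 1.5]); b = 0.1
disc_w = np.array([0.3, -0.2, 0.1, 0.4])

def input_jacobian(x, y):
    """Jacobian of the cross-entropy loss w.r.t. the input; for this
    logistic model it has the closed form (p - y) * w."""
    p = 1 / (1 + np.exp(-(w @ x + b)))
    return (p - y) * w

def jarn_loss(x, y, lam=0.1):
    """Classification loss plus a term rewarding Jacobians that the
    discriminator scores as 'real' (image-like)."""
    p = 1 / (1 + np.exp(-(w @ x + b)))
    ce = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    d = 1 / (1 + np.exp(-(disc_w @ input_jacobian(x, y))))  # P("real")
    return ce - lam * np.log(d)

x = np.array([0.2, -0.1, 0.4, 0.3]); y = 1.0
loss = jarn_loss(x, y)
```

In the full method the discriminator is itself trained adversarially against the classifier's Jacobians; here it is frozen purely to show how the regularization term enters the loss.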


Results

Cheers!

https://slides.com/alvinchan/jarn_dso

JARN @ DSO

By Alvin Chan
