a defence against adversarial examples
Alvin Chan
Adversarial Attacks
Example: a stop sign misclassified as a 90 km/h speed limit sign
Misclassification in image recognition
Face recognition
Object detection
Image segmentation
Reinforcement learning
Mostly used in the computer vision domain
Use the gradient of the target model to directly perturb pixel values
Optimizing two components:
White-box: Access to architecture & hyperparameters
Black-box: Access to target model’s prediction
Transfer attacks from a single substitute model or an ensemble of substitute models
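The points above (gradient-based pixel perturbation, white-box vs. black-box access, and transfer from a substitute model) can be sketched in a few lines. The slides do not name a specific attack here, so this sketch swaps in a single signed-gradient step (the fast gradient sign method) as one illustrative choice. The two logistic-regression "models", their weights, the 4-pixel input, and `epsilon` are all hypothetical values chosen for the example, not taken from the talk.

```python
import math

def predict_proba(weights, bias, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def classify(model, x):
    """Hard label (0 or 1); this is all a black-box attacker observes."""
    weights, bias = model
    return 1 if predict_proba(weights, bias, x) > 0.5 else 0

def input_gradient(model, x, label):
    """Gradient of the cross-entropy loss w.r.t. the input pixels.
    Requires white-box access to the model's weights."""
    weights, bias = model
    p = predict_proba(weights, bias, x)
    return [(p - label) * w for w in weights]

def sign(v):
    return (v > 0) - (v < 0)

def signed_gradient_attack(model, x, label, epsilon):
    """One signed-gradient step that increases the loss,
    clipped so every pixel stays in [0, 1]."""
    grad = input_gradient(model, x, label)
    return [min(1.0, max(0.0, xi + epsilon * sign(g)))
            for xi, g in zip(x, grad)]

# Hypothetical models: the attacker knows `substitute` (white-box)
# but can only query `target` for predictions (black-box).
substitute = ([2.0, -1.5, 1.0, -0.5], 0.0)
target = ([1.8, -1.2, 0.7, -0.9], 0.1)

x = [0.6, 0.4, 0.5, 0.5]   # hypothetical 4-pixel "image"
label = 1                  # its true class
epsilon = 0.2              # assumed per-pixel perturbation budget

# Craft the adversarial example on the substitute, then transfer it.
x_adv = signed_gradient_attack(substitute, x, label, epsilon)

print("target on clean input:      ", classify(target, x))
print("target on adversarial input:", classify(target, x_adv))
```

The perturbation is computed only from the substitute's gradient, yet it is evaluated against the target, which is the transfer setting described above. With an ensemble of substitutes, one option would be to average the input gradients before taking the sign.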
[Figure: columns labelled Image, Standard, PGD7; labels Real, Fake]
https://slides.com/alvinchan/jarn_dso