The dark side of deep learning

Milan, 11 November 2017

Just about any DL presentation...

The deep learning craze

IBM speech recognition is on the verge of super-human accuracy [Business Insider, 2017]

Are Computers Already Smarter Than Humans? [Time, 2017]

Artificial Intelligence Beats 'Most Complex Game Devised by Humans' [LiveScience, 2016]

Intel is paying more than $400 million to buy deep-learning startup Nervana Systems [RECODE, 2016]

I know there's a proverb which says 'To err is human,' but a human error is nothing to what a computer can do if it tries.

--- Agatha Christie

People worry that computers will get too smart and take over the world, but the real problem is that they’re too stupid and they’ve already taken over the world.

--- Pedro Domingos

What about the limitations of DL?

DL is not magic - it is an incredibly powerful tool for extracting regularities from data according to a given objective.

Corollary #1: A DL program will be just as smart as the data it gets.

Corollary #2: A DL program will be just as smart as the objective it optimizes.

Something to worry about #1

Bias and discrimination

Word embeddings

Can convert words to vectors of numbers - at the heart of most NLP applications with deep learning
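
As a rough illustration of what "words as vectors" means, the sketch below compares words by the cosine of the angle between their vectors. The three-dimensional vectors are invented for this example (real embeddings are learned from text and have hundreds of dimensions), and the names `toy_embeddings` and `cosine` are hypothetical.

```python
import math

# Hand-made toy vectors, NOT taken from any trained embedding model.
toy_embeddings = {
    "king":  [0.8, 0.6, 0.1],
    "queen": [0.8, 0.1, 0.6],
    "apple": [-0.5, 0.3, 0.3],
}

def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Related words should score higher than unrelated ones.
sim_royal = cosine(toy_embeddings["king"], toy_embeddings["queen"])
sim_fruit = cosine(toy_embeddings["king"], toy_embeddings["apple"])
print(sim_royal, sim_fruit)
```

With a trained model the same comparison would be run on learned vectors; only the lookup table changes.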

Notebook time!

Embeddings are highly sexist!

Bolukbasi, T., Chang, K.W., Zou, J., Saligrama, V. and Kalai, A., 2016. Quantifying and reducing stereotypes in word embeddings. arXiv preprint arXiv:1606.06121.

Hundreds of papers were published before this was openly discussed!

Bolukbasi, T., Chang, K.W., Zou, J.Y., Saligrama, V. and Kalai, A.T., 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems (pp. 4349-4357).

This is because gender biases probably account for an increase in testing accuracy.
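
One way to make the bias above concrete is to project word vectors onto a "gender direction". The sketch below is loosely modelled on the neutralize idea from the debiasing paper cited above, not a reproduction of it: the vectors are invented toy values, the direction is built from a single he/she pair, and the helper names are hypothetical.

```python
import math

# Invented toy vectors, NOT from a trained model.
toy = {
    "he":         [1.0, 0.2, 0.0],
    "she":        [-1.0, 0.2, 0.0],
    "programmer": [0.4, 0.5, 0.7],
    "homemaker":  [-0.5, 0.6, 0.3],
}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(u):
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]

# Assumption: a single word pair is enough to define the direction here.
gender_direction = unit([a - b for a, b in zip(toy["he"], toy["she"])])

def gender_component(word):
    """Signed projection onto the gender direction (positive leans 'he')."""
    return dot(toy[word], gender_direction)

def neutralize(word):
    """Remove the component of the word vector along the gender direction."""
    c = gender_component(word)
    return [a - c * g for a, g in zip(toy[word], gender_direction)]

for w in ("programmer", "homemaker"):
    print(w, gender_component(w), dot(neutralize(w), gender_direction))
```

In this toy setup the two occupation words lean in opposite directions before neutralizing and have no component along the direction afterwards.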

Recent years have brought extraordinary advances in the technical domains of AI. Alongside such efforts, designers and researchers from a range of disciplines need to conduct what we call social-systems analyses of AI. They need to assess the impact of technologies on their social, cultural and political settings.

--- There is a blind spot in AI research, Nature, 2016

The rise of the racist robots [New Statesman, 2016]

Racism is definitely bad PR!

[an investigation] found that the proprietary algorithms widely used by judges to help determine the risk of reoffending are almost twice as likely to mistakenly flag black defendants than white defendants [There is a blind spot in AI research]
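
"Mistakenly flagged" corresponds to a false positive: someone marked high risk who did not reoffend. A minimal sketch of how that rate can be compared across groups follows. The records are invented toy data with abstract groups "A" and "B", not figures from the investigation, and `false_positive_rates` is a hypothetical helper.

```python
from collections import defaultdict

# (group, reoffended, flagged_high_risk): invented toy records, NOT real data.
records = (
    [("A", False, True)] * 4 + [("A", False, False)] * 6 +
    [("B", False, True)] * 2 + [("B", False, False)] * 8 +
    [("A", True, True)] * 5 + [("B", True, True)] * 5
)

def false_positive_rates(rows):
    """Per group: share of people who did NOT reoffend but were flagged."""
    flagged = defaultdict(int)
    negatives = defaultdict(int)
    for group, reoffended, flagged_high_risk in rows:
        if not reoffended:
            negatives[group] += 1
            flagged[group] += flagged_high_risk
    return {g: flagged[g] / negatives[g] for g in negatives}

fpr = false_positive_rates(records)
print(fpr)  # → {'A': 0.4, 'B': 0.2}
```

In this toy data the classifier catches every reoffender in both groups, yet group A is wrongly flagged twice as often as group B, which is the kind of gap a single accuracy figure hides.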

Not just an economic problem

Discrimination and fairness

Attacking discrimination with smarter machine learning [Google Research Blog]

Something to worry about #2

Adversarial attacks

Can we break neural networks?

Notebook time!

Breaking linear classifiers on Imagenet

--- Andrej Karpathy blog
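
The intuition for why linear classifiers break can be sketched in a few lines: moving every input coordinate by a small amount `eps` in the direction of the sign of its weight shifts the score by `eps` times the sum of the absolute weights, which grows with the input dimension. The weights and input below are invented toy values, and `adversarial` is a hypothetical helper, not code from the blog post.

```python
def score(w, b, x):
    """Linear classifier score; positive means class 1, negative class 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def sign(v):
    return (v > 0) - (v < 0)

def adversarial(w, x, eps, raise_score=True):
    """Shift each coordinate by at most eps so as to push the score up or down."""
    direction = 1 if raise_score else -1
    return [xi + direction * eps * sign(wi) for wi, xi in zip(w, x)]

# Toy 4-dimensional example (assumed values, for illustration only).
w = [0.5, -0.3, 0.8, -0.1]
b = 0.0
x = [0.2, 0.4, -0.1, 0.3]

eps = 0.1
x_adv = adversarial(w, x, eps)

print(score(w, b, x))      # negative: classified as class 0
print(score(w, b, x_adv))  # positive: the prediction flips
```

Here the score moves by 0.1 * (0.5 + 0.3 + 0.8 + 0.1) = 0.17. With thousands of pixels instead of four coordinates, a far smaller `eps` would give the same shift.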

Fooling neural networks

Universal perturbations!

Moosavi-Dezfooli, S.M., Fawzi, A., Fawzi, O. and Frossard, P., 2016. Universal adversarial perturbations. arXiv preprint arXiv:1610.08401.

Jia, R. and Liang, P., 2017. Adversarial examples for evaluating reading comprehension systems. arXiv preprint arXiv:1707.07328.

Something to worry about #3

Privacy

Anonymous data?

De Montjoye, Y.A., Radaelli, L. and Singh, V.K., 2015. Unique in the shopping mall: On the reidentifiability of credit card metadata. Science, 347(6221), pp.536-539.
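
The re-identification risk can be illustrated by counting how many people are singled out once an attacker knows a few (shop, day) points about them. The sketch below uses invented traces and a hypothetical `unicity` helper. For determinism it takes each person's first `k` points, which is a simplification assumed here rather than the paper's procedure.

```python
# Invented toy purchase traces: a list of (shop, day) points per pseudonym.
traces = {
    "user_1": [("bakery", 1), ("cafe", 2), ("gym", 3)],
    "user_2": [("bakery", 1), ("cafe", 2), ("cinema", 3)],
    "user_3": [("bakery", 1), ("pharmacy", 2), ("gym", 3)],
    "user_4": [("market", 1), ("cafe", 2), ("gym", 3)],
}

def unicity(all_traces, k):
    """Fraction of users whose first k points match no one else's trace."""
    unique = 0
    for user, trace in all_traces.items():
        known = set(trace[:k])  # what the attacker is assumed to know
        matches = [u for u, t in all_traces.items() if known <= set(t)]
        if matches == [user]:
            unique += 1
    return unique / len(all_traces)

for k in (1, 2, 3):
    print(k, unicity(traces, k))
# → 1 0.25
# → 2 0.5
# → 3 1.0
```

No names appear in the data, yet three known points are enough to single out every trace in this toy set.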

Given access to a black-box classifier, can we infer whether a specific example was part of the training dataset?

We can with shadow training:

Shokri, R., Stronati, M., Song, C. and Shmatikov, V., 2017, May. Membership inference attacks against machine learning models. In 2017 IEEE Symposium on Security and Privacy (SP), (pp. 3-18). IEEE.
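
A heavily simplified sketch of the shadow-training idea: the attacker trains shadow models on data they control, so they know which examples were members, learns how confidence differs between members and non-members, and applies that rule to the black-box target. Everything below is an assumption made for illustration: the overfitting "model" is a 1-nearest-neighbour toy, and the attack is a single confidence threshold rather than the attack classifier trained on prediction vectors that the paper describes.

```python
import math
import random

def sample(rng, n):
    """Toy data: 1-D feature in [0, 1), label 1 if the feature exceeds 0.5."""
    return [(x, int(x > 0.5)) for x in (rng.random() for _ in range(n))]

def train_memorizing_model(points):
    """Black-box 'classifier' that overfits: 1-nearest-neighbour lookup whose
    confidence decays with distance to the closest training example."""
    stored = list(points)
    def predict(x):
        nearest_x, label = min(stored, key=lambda p: abs(p[0] - x))
        return label, math.exp(-50.0 * abs(nearest_x - x))
    return predict

def fit_attack_threshold(rng, n_shadow=5, n_points=30):
    """Train shadow models with known membership and pick the midpoint between
    mean member confidence and mean non-member confidence."""
    in_conf, out_conf = [], []
    for _ in range(n_shadow):
        members, non_members = sample(rng, n_points), sample(rng, n_points)
        shadow = train_memorizing_model(members)
        in_conf += [shadow(x)[1] for x, _ in members]
        out_conf += [shadow(x)[1] for x, _ in non_members]
    return (sum(in_conf) / len(in_conf) + sum(out_conf) / len(out_conf)) / 2

rng = random.Random(0)
threshold = fit_attack_threshold(rng)

# The target model: the attacker may only query it, not inspect it.
target_members = sample(rng, 30)
target_non_members = sample(rng, 30)
target = train_memorizing_model(target_members)

def is_member(x):
    return target(x)[1] > threshold

hits = sum(is_member(x) for x, _ in target_members)
false_alarms = sum(is_member(x) for x, _ in target_non_members)
attack_accuracy = (hits + len(target_non_members) - false_alarms) / 60
print(threshold, hits, false_alarms, attack_accuracy)
```

The attack works here because the toy model is maximally confident on exactly the points it memorized. The less a model overfits, the weaker this signal becomes.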

Privacy in distributed environments

Hitaj, B., Ateniese, G. and Perez-Cruz, F., 2017. Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning. arXiv preprint arXiv:1702.07464.

Something to worry about #4

Security threats

Something to worry about #5

Hidden technical debt

DL is just a tiny component!

Hidden Technical Debt in Machine Learning Systems (NIPS 2015)

Machine learning offers a fantastically powerful toolkit for building useful complex prediction systems quickly. ... it is dangerous to think of these quick wins as coming for free. ... it is common to incur massive ongoing maintenance costs in real-world ML systems. [Risk factors include] boundary erosion, entanglement, hidden feedback loops, undeclared consumers, data dependencies, configuration issues, changes in the external world, and a variety of system-level anti-patterns.

--- Hidden Technical Debt in Machine Learning Systems (NIPS 2015)

If you are in Rome, check out our Meetup:

And our new association:

Italian Association for Machine Learning

http://www.iaml.it