The dark side of deep learning
Milan, 11 November 2017
Just about any DL presentation...
The deep learning craze
IBM speech recognition is on the verge of super-human accuracy [Business Insider, 2017]
Are Computers Already Smarter Than Humans?
[Time, 2017]
Artificial Intelligence Beats 'Most Complex Game Devised by Humans' [LiveScience, 2016]
I know there's a proverb which says 'To err is human,' but a human error is nothing to what a computer can do if it tries.
--- Agatha Christie
People worry that computers will get too smart and take over the world, but the real problem is that they’re too stupid and they’ve already taken over the world.
--- Pedro Domingos
What about the limitations of DL?
DL is not magic - it is an incredibly powerful tool for extracting regularities from data according to a given objective.
Corollary #1: A DL program will be just as smart as the data it gets.
Corollary #2: A DL program will be just as smart as the objective it optimizes.
Something to worry about #1
Bias and discrimination
Word embeddings
Convert words to vectors of numbers - at the heart of most NLP applications with deep learning
Notebook time!
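The notebook itself is not reproduced in the deck; below is a minimal sketch of the kind of analogy demo it runs, assuming gensim is installed (the model name and dimensionality are my choice, not from the slides):

```python
# Minimal sketch of a word-embedding analogy demo, using gensim's
# bundled pretrained GloVe vectors (model choice is illustrative).
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-100")  # 100-dim GloVe vectors

# The classic analogy: man is to king as woman is to ...?
# Typically returns ('queen', ...) as the top hit.
print(vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1))
```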
Embeddings are highly sexist!
Bolukbasi, T., Chang, K.W., Zou, J., Saligrama, V. and Kalai, A., 2016. Quantifying and reducing stereotypes in word embeddings. arXiv preprint arXiv:1606.06121.
Hundreds of papers were published before this was openly discussed!
Bolukbasi, T., Chang, K.W., Zou, J.Y., Saligrama, V. and Kalai, A.T., 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems (pp. 4349-4357).
This is likely because gender biases in the training data can actually increase testing accuracy.
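One way to see this quantitatively, loosely in the spirit of Bolukbasi et al., is to project word vectors onto a "gender direction"; here is a rough sketch, where the word list and the one-pair direction are simplifications for illustration (the paper uses several word pairs and PCA):

```python
# Rough sketch: score words by their projection onto a gender direction,
# loosely following Bolukbasi et al. Word list is illustrative only.
import numpy as np
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-100")

def unit(v):
    return v / np.linalg.norm(v)

# Crude one-pair gender direction (the paper aggregates several pairs).
gender_direction = unit(vectors["he"] - vectors["she"])

for word in ["nurse", "engineer", "receptionist", "philosopher"]:
    score = float(unit(vectors[word]) @ gender_direction)
    print(f"{word:>14}: {score:+.3f}")  # positive leans 'he', negative 'she'
```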
Recent years have brought extraordinary advances in the technical domains of AI. Alongside such efforts, designers and researchers from a range of disciplines need to conduct what we call social-systems analyses of AI. They need to assess the impact of technologies on their social, cultural and political settings.
--- There is a blind spot in AI research, Nature, 2016
The rise of the racist robots [New Statesman, 2016]
Racism is definitely bad PR!
[an investigation] found that the proprietary algorithms widely used by judges to help determine the risk of reoffending are almost twice as likely to mistakenly flag black defendants than white defendants [There is a blind spot in AI research]
Not just an economic problem
Discrimination and fairness
Attacking discrimination with smarter machine learning [Google Research Blog]
Something to worry about #2
Adversarial attacks
Can we break neural networks?
Notebook time!
Breaking linear classifiers on ImageNet
--- Andrej Karpathy blog
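As a stand-in for the notebook demo, here is a minimal sketch of the fast gradient sign method (FGSM) of Goodfellow et al., written in PyTorch; `model`, `x`, and `y` are assumptions (a trained classifier and a correctly labelled batch with pixels in [0, 1]):

```python
# Minimal FGSM sketch (Goodfellow et al., 2015). `model`, `x`, `y` are
# assumed: a trained PyTorch classifier and a labelled batch in [0, 1].
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    """Return x perturbed one signed-gradient step towards higher loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()    # tiny, often imperceptible step
    return x_adv.clamp(0.0, 1.0).detach()  # stay in valid pixel range
```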
Fooling neural networks
Universal perturbations!
Moosavi-Dezfooli, S.M., Fawzi, A., Fawzi, O. and Frossard, P., 2016. Universal adversarial perturbations. arXiv preprint arXiv:1610.08401.
Jia, R. and Liang, P., 2017. Adversarial examples for evaluating reading comprehension systems. arXiv preprint arXiv:1707.07328.
Something to worry about #3
Privacy
Anonymous data?
De Montjoye, Y.A., Radaelli, L. and Singh, V.K., 2015. Unique in the shopping mall: On the reidentifiability of credit card metadata. Science, 347(6221), pp.536-539.
Given access to a black-box classifier, can we infer whether a specific example was part of the training dataset?
We can with shadow training:
Shokri, R., Stronati, M., Song, C. and Shmatikov, V., 2017, May. Membership inference attacks against machine learning models. In 2017 IEEE Symposium on Security and Privacy (SP), (pp. 3-18). IEEE.
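A toy sketch of the shadow-training idea, with scikit-learn random forests standing in for both the shadow and attack models (all names, the model family, and the in/out split are illustrative, not the paper's setup):

```python
# Toy sketch of shadow training (Shokri et al., 2017). We assume access to
# data drawn from a distribution similar to the target's training set.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def build_attack_model(X, y, n_shadows=5, seed=0):
    """Train shadow models, then learn to separate 'member' from
    'non-member' confidence vectors."""
    rng = np.random.default_rng(seed)
    attack_X, attack_y = [], []
    half = len(X) // 2
    for _ in range(n_shadows):
        idx = rng.permutation(len(X))
        in_idx, out_idx = idx[:half], idx[half:]
        shadow = RandomForestClassifier().fit(X[in_idx], y[in_idx])
        attack_X.append(shadow.predict_proba(X[in_idx]))   # members -> 1
        attack_y.append(np.ones(len(in_idx)))
        attack_X.append(shadow.predict_proba(X[out_idx]))  # non-members -> 0
        attack_y.append(np.zeros(len(out_idx)))
    return RandomForestClassifier().fit(np.vstack(attack_X), np.hstack(attack_y))

# Usage idea: feed the target's confidence vector for an example to the
# attack model; its prediction estimates training-set membership.
```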
Privacy in distributed environments
Hitaj, B., Ateniese, G. and Perez-Cruz, F., 2017. Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning. arXiv preprint arXiv:1702.07464.
Something to worry about #4
Security threats
Something to worry about #5
Hidden technical debt
DL is just a tiny component!
Machine learning offers a fantastically powerful toolkit for building useful complex prediction systems quickly. ... it is dangerous to think of these quick wins as coming for free. ... it is common to incur massive ongoing maintenance costs in real-world ML systems. [Risk factors include] boundary erosion, entanglement, hidden feedback loops, undeclared consumers, data dependencies, configuration issues, changes in the external world, and a variety of system-level anti-patterns.
--- Hidden technical debt in machine learning systems, NIPS 2015
If you are in Rome, check out our Meetup:
The dark side of deep learning
By Simone Scardapane