dr.federica bianco | fbb.space | fedhere | fedhere
On the dangers of (generative) AI
and society
November 30, 2022
will be made available to developers through Google Cloud’s API from December 13
Vinay Prabhu exposes racist bias in GPT-3
Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, Shmargaret Shmitchell
The past 3 years of work in NLP have been characterized by the development and deployment of ever larger language models, especially for English. BERT, its variants, GPT-2/3, and others, most recently Switch-C, have pushed the boundaries of the possible both through architectural innovations and through sheer size. Using these pretrained models and the methodology of fine-tuning them for specific tasks, researchers have extended the state of the art on a wide array of tasks as measured by leaderboards on specific benchmarks for English. In this paper, we take a step back and ask: How big is too big? What are the possible risks associated with this technology and what paths are available for mitigating those risks? We provide recommendations including weighing the environmental and financial costs first, investing resources into curating and carefully documenting datasets rather than ingesting everything on the web, carrying out pre-development exercises evaluating how the planned approach fits into research and development goals and supports stakeholder values, and encouraging research directions beyond ever larger language models.
We have identified a wide variety of costs and risks associated with the rush for ever larger LMs, including:
environmental costs (borne typically by those not benefiting from the resulting technology);
financial costs, which in turn erect barriers to entry, limiting who can contribute to this research area and which languages can benefit from the most advanced techniques;
opportunity cost, as researchers pour effort away from directions requiring less resources; and the
risk of substantial harms, including stereotyping, denigration, increases in extremist ideology, and wrongful arrest, should humans encounter seemingly coherent LM output and take it for the words of some person or organization who has accountability for what is said.
When we perform risk/benefit analyses of language technology, we must keep in mind how the risks and benefits are distributed, because they do not accrue to the same people. On the one hand, it is well documented in the literature on environmental racism that the negative effects of climate change are reaching and impacting the world’s most marginalized communities first [1, 27].
Is it fair or just to ask, for example, that the residents of the Maldives (likely to be underwater by 2100 [6]) or the 800,000 people in Sudan affected by drastic floods pay the environmental price of training and deploying ever larger English LMs, when similar large-scale models aren’t being produced for Dhivehi or Sudanese Arabic?
While the average human is responsible for an estimated 5t CO2 per year, the authors trained a Transformer (big) model [136] with neural architecture search and estimated that the training procedure emitted 284t of CO2.
[...]
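To put the two figures quoted above side by side, here is a small worked calculation. It uses only the 5t and 284t values from the quote; the variable names are illustrative.

```python
# Figures quoted above, in tonnes of CO2.
HUMAN_T_PER_YEAR = 5.0    # estimated yearly emissions of an average human
TRAINING_RUN_T = 284.0    # Transformer (big) trained with neural architecture search

# How many years of an average human's emissions equal one training run?
human_years = TRAINING_RUN_T / HUMAN_T_PER_YEAR
print(f"One training run is about {human_years:.1f} average human-years of CO2")
```

By these estimates, a single training procedure of that kind emits roughly as much as one average person does in 57 years.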
4.1 Size Doesn’t Guarantee Diversity
The Internet is a large and diverse virtual space, and accordingly, it is easy to imagine that very large datasets, such as Common Crawl (“petabytes of data collected over 8 years of web crawling”, a filtered version of which is included in the GPT-3 training data) must therefore be broadly representative of the ways in which different people view the world. However, on closer examination, we find that there are several factors which narrow Internet participation [...]
Starting with who is contributing to these Internet text collections, we see that Internet access itself is not evenly distributed, resulting in Internet data overrepresenting younger users and those from developed countries [100, 143]. However, it’s not just the Internet as a whole that is in question, but rather specific subsamples of it. For instance, GPT-2’s training data is sourced by scraping outbound links from Reddit, and Pew Internet Research’s 2016 survey reveals 67% of Reddit users in the United States are men, and 64% between ages 18 and 29. Similarly, recent surveys of Wikipedians find that only 8.8–15% are women or girls [9].
4.3 Encoding Bias
It is well established by now that large LMs exhibit various kinds of bias, including stereotypical associations [11, 12, 69, 119, 156, 157], or negative sentiment towards specific groups [61]. Furthermore, we see the effects of intersectionality [34], where BERT, ELMo, GPT and GPT-2 encode more bias against identities marginalized along more than one dimension than would be expected based on just the combination of the bias along each of the axes [54, 132].
The ersatz fluency and coherence of LMs raises several risks, precisely because humans are prepared to interpret strings belonging to languages they speak as meaningful and corresponding to the communicative intent of some individual or group of individuals who have accountability for what is said.
autoencoder for image reconstruction
A more ambitious model has a 16-neuron bottleneck: we are trying to extract 16 numbers to reconstruct the entire image! It's pretty remarkable! Those 16 numbers are features extracted from the data.
[Figure: original image, latent representation, and predicted (reconstructed) image]
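To make the bottleneck idea concrete, here is a minimal sketch of a linear autoencoder in plain Python. It is not the model shown in the slides: the 8x8 toy "bar" images, the weight initialization, the learning rate, and all function names are assumptions chosen to keep the example small. Only the 16-number bottleneck comes from the text above.

```python
import random

random.seed(0)

N_PIXELS = 64   # toy 8x8 image, flattened (assumed size, not from the slides)
N_LATENT = 16   # the 16-number bottleneck

def bar_image(row, brightness):
    """Toy image: one horizontal bar of the given brightness on a black background."""
    return [brightness if p // 8 == row else 0.0 for p in range(N_PIXELS)]

images = [bar_image(r, b) for r in range(8) for b in (0.5, 0.75, 1.0)]

def small_random(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

W_enc = small_random(N_LATENT, N_PIXELS)   # encoder weights: 64 pixels -> 16 numbers
W_dec = small_random(N_PIXELS, N_LATENT)   # decoder weights: 16 numbers -> 64 pixels

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def encode(x):
    return matvec(W_enc, x)

def decode(z):
    return matvec(W_dec, z)

def mean_squared_error(data):
    total = 0.0
    for x in data:
        x_hat = decode(encode(x))
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x)) / N_PIXELS
    return total / len(data)

loss_before = mean_squared_error(images)

LEARNING_RATE = 0.01
for epoch in range(40):
    for x in images:
        z = encode(x)                                  # latent representation
        err = [a - b for a, b in zip(decode(z), x)]    # predicted minus original
        # Backpropagate the error through the decoder before updating it.
        grad_z = [sum(W_dec[i][j] * err[i] for i in range(N_PIXELS))
                  for j in range(N_LATENT)]
        for i in range(N_PIXELS):
            for j in range(N_LATENT):
                W_dec[i][j] -= LEARNING_RATE * err[i] * z[j]
        for j in range(N_LATENT):
            for k in range(N_PIXELS):
                W_enc[j][k] -= LEARNING_RATE * grad_z[j] * x[k]

loss_after = mean_squared_error(images)
latent = encode(images[0])
print(len(latent), "latent numbers; MSE before/after:", loss_before, loss_after)
```

The encoder squeezes each 64-pixel image into 16 numbers and the decoder has to rebuild the image from those numbers alone, so training pushes the 16 numbers to carry whatever the reconstruction needs. A real image autoencoder would add nonlinear layers and would normally be written with a deep learning library.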
Why does this AI model whiten Obama's face?
Simple answer: the data is biased. The algorithm is fed more images of white people.
But really, would the opposite have been acceptable? The bias is in society
The bias is in the data
The bias is in the models and the decisions we make
The bias is in how we choose to optimize our model
The bias is in the society that provides the framework to validate our biased models
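Two of the points above, bias in the data and bias in how we choose to optimize, can be illustrated with a deliberately tiny toy. It is only a sketch under loud assumptions: each "face" is reduced to one hypothetical brightness number, the 90/10 split is invented, and the "model" can output just a single constant. It does not reproduce the model discussed in the slides.

```python
# Toy data: one number per face, e.g. mean skin brightness (hypothetical values).
LIGHT, DARK = 0.8, 0.2
train = [LIGHT] * 90 + [DARK] * 10   # assumed 90/10 imbalance in the training set

def average_error(output):
    """Mean squared error over the whole training set."""
    return sum((output - t) ** 2 for t in train) / len(train)

def worst_group_error(output):
    """Squared error of whichever group is served worst."""
    return max((output - LIGHT) ** 2, (output - DARK) ** 2)

candidates = [i / 100 for i in range(101)]   # possible constant outputs, 0.00 to 1.00

best_for_average = min(candidates, key=average_error)
best_for_worst_group = min(candidates, key=worst_group_error)

print("output minimizing average error:    ", best_for_average)
print("output minimizing worst-group error:", best_for_worst_group)
```

Minimizing the average error picks 0.74, close to the majority group, so the minority group absorbs almost all of the error. Minimizing the worst-group error on the same data picks 0.5. Both the data and the objective are human choices, and each one changes who the model fails.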
none of this is new
https://www.nytimes.com/2019/04/25/lens/sarah-lewis-racial-bias-photography.html
Joy Buolamwini
Neural Networks and Deep Learning
an excellent and free book on NN and DL
http://neuralnetworksanddeeplearning.com/index.html
Deep Learning: an MIT Press book
Ian Goodfellow, Yoshua Bengio and Aaron Courville
https://www.deeplearningbook.org/lecture_slides.html
History of NN
https://cs.stanford.edu/people/eroberts/courses/soco/projects/neural-networks/History/history2.html