Fairness and Collective Decision-Making in AI

Carina I Hausladen

Research Project

graded, 70%

Discussant Role

graded, 30%

Reading Notes

ungraded

Activities

Schedule

Week 1

Week 2

Week 3

Week 4

Week 5

Week 6

Week 7

Week 8

Week 9

Week 10

Week 11

Week 12

Week 13

Week 14

Week 15

Topics

Lecture ends

Research Project

graded, 70%

Discussant Role

graded, 30%

Reading Notes

ungraded

Activities

  • starting April 22
  • sign-up: April 16

Week 1

Week 2

Week 3

Week 4

Week 5

Week 6

Week 7

Week 8

Week 9

Week 10

Week 11

Week 12

Week 13

Week 14

Week 15

Lecture ends

Economic Impacts of AI

Social Choice and AI Alignment

Defining and Measuring Fairness in AI

Democracy and LLMs

Topics

Defining and Measuring Fairness in AI

Topics

  • Fairness foundations
  • Fairness metrics
  • Bias evaluation datasets
  • Fairness, causality, and data limitations

Social Choice and AI Alignment

Fairness, for whom?

  • From individual to collective choice
    • Different voting methods
    • Fairness and proportionality principles
    • Key properties (e.g. monotonicity)
  • Human-centered LLMs
    • Learning from human preferences (RLHF)
    • Alignment by written principles

Economic Impacts of AI

Topics

  1. Unequal Distribution of Benefits
  2. Labor Market Effects
  3. Global Inequality

Democracy and LLMs

Topics

  1. LLMs as proxies for humans
  2. LLMs struggle to represent human diversity
  3. Participation = human well-being & dignity
  4. Supporting participation (instead of replacement)

Week 1

Week 2

Week 3

Week 4

Week 5

Week 6

Week 7

Week 8

Week 9

Week 10

Week 11

Week 12

Week 13

Week 14

Week 15

Lecture ends

Guest
Lectures

Thomas Müller
Sachit Mahajan

Week 1

Week 2

Week 3

Week 4

Week 5

Week 6

Week 7

Week 8: Abstract

Week 9: Intro & Literature

Week 10: Present Initial Results

Week 11: Submit first full draft

Week 12: Slides, practice presentation, social media summary

Week 13

Week 14

Week 15

Lecture ends

Guest
Lectures

Final Presentation

Submit Paper

Current Ethics Debates

The Department of War Controversy

  • Profits are not consumer-driven; they are powered by states, capital, and geopolitics.
  • Is Claude, or another private LLM, really the “ethical alternative”?
  • A power issue reduced to a lifestyle choice: Like the “personal carbon footprint” shift in the 2000s.
  • Instead ask: Who controls the infrastructure? Public compute; Data rules; Oversight; Digital sovereignty.

With which points do you agree?
Why or why not?

Privacy and AI

Public launch of Meta Ray-Ban in September 2025

Privacy and AI

  • A U.S. class-action lawsuit (filed March 2026) alleging false advertising and privacy violations.
  • Investigations by the UK’s Information Commissioner’s Office (ICO) and Kenyan authorities.
  • Widespread social-media backlash and media coverage in March 2026.

Privacy and AI

Meta Ray-Ban Glasses
       |
       | video frames + mic audio
       v
Gemini Live API (WebSocket)
       |
       |-- Audio response
       |-- Tool calls (execute)

Privacy and AI

The glasses have a recording light. Is that enough to protect privacy? Should bystanders have a legal right to demand you remove the glasses?


The glasses give blind users the ability to cook, shop, and read independently for the first time in decades, and deaf users real-time captions in conversations.
Should we slow down or restrict this technology because of privacy risks to the general population?

Infrastructure & resources

America’s leading electricity research think tank, EPRI, released a new analysis:

  • Data centers currently use 4–5% of U.S. electricity.
  • By 2030, they could consume 9–17% of total U.S. electricity generation.
  • New projections are 60% higher than those from 2024, reflecting a massive surge in data center construction over the past 18 months.

Infrastructure & resources

Do you see realistic environmental benefits?

Is this a fair and useful comparison?

Some uses of AI are highly valuable (medical research, climate science, accessibility tools), while others are mostly for entertainment or minor productivity gains.
Should we prioritize or regulate different types of AI usage based on their energy cost versus societal benefit?

Vibe Research and its consequences

The rejection rate of arXiv papers, relative to those accepted, doubled between January 2024 and 2026.

Vibe Research and its consequences

  • ICML 2026 received more than 24,000 submissions, more than double the previous year’s count.
  • Science has always relied on peer review as its quality filter. But the current system was never designed for this volume.
    • Trust in scientific research faces a substantial risk of erosion.

Vibe Research and its consequences

"The issue is not whether my students are valuable. In the long run, they are invaluable. The issue is that their value emerges slowly, whereas AI delivers immediate returns. I feel somewhat embarrassed to admit how tempting this is. 

Yet I see these calculations shaping the labs around me. Close colleagues are quietly refraining from taking on as many students as they used to. When they do take students, they are noticeably pickier."

Vibe Research and its consequences

  • Is science breaking down?

  • Do you trust published papers less due to AI?

  • Does this change your desire to pursue a PhD?

Logistics

Week 1

Week 2

Week 3

Week 4

Week 5

Week 6

Week 7

Week 8

Week 9

Week 10

Week 11

Week 12

Week 13

Week 14

Week 15

Lecture ends

4 paper discussions in each of the next 5 weeks

~10 min presentation

work in groups of ~3 for the project

GitHub

github.com/carinahausladen/konstanz-fairness-collective-ai

Comment your name under the test issue

GitHub

GitHub

https://www.overleaf.com/7678674488hfmsgbmsyszc#8f42d3

Workflow

Plan for today

  1. Fairness Definitions
  2. Social Biases
  3. Where Bias Enters the LLM Lifecycle
  4. Biases in NLP Tasks
  5. Fairness Desiderata
  6. Bias Metrics

Logistics

A Career Track

📚 Academia

  • Bias & fairness is a core research area

  • Survey papers regularly reach thousands of citations
    (e.g. Mehrabi et al. 2019, with more than 8,000 citations)

  • Dedicated top-tier venue: ACM Conference on Fairness, Accountability, and Transparency (FAccT)

  • Strong presence at NeurIPS, ICML, ICLR, ACL, EMNLP

  • Interdisciplinary work = high visibility + funding relevance

🏭 Industry

  • Major companies run dedicated fairness teams

    • Apple, Google, Meta, Microsoft, IBM, ...

  • Common job titles:

    • Responsible AI Scientist

    • Fairness / Bias Engineer

    • Algorithmic Auditor

    • Trustworthy ML Researcher

  • Regulation (EU AI Act, audits, compliance) → growing demand

Defining Bias

1. Fairness Definitions

Protected Attribute: A socially sensitive characteristic that defines group membership and should not unjustifiably affect outcomes.
Group Fairness: Statistical parity of outcomes across predefined social groups, up to some tolerance.
Individual Fairness: Similar individuals receive similar outcomes, according to a chosen similarity metric.
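The two fairness definitions above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the data are invented, the function names are hypothetical, the tolerance of 0.1 is arbitrary, and individual fairness is checked with one common formalization (a Lipschitz-style condition on a one-dimensional feature), not necessarily the one used in the lecture.

```python
import numpy as np

def statistical_parity_gap(y_pred, group):
    """Largest difference in positive-outcome rate between any two groups."""
    y_pred = np.asarray(y_pred, dtype=float)
    group = np.asarray(group)
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return float(max(rates) - min(rates))

def individual_fairness_violations(x, scores, lipschitz=1.0):
    """Pairs (i, j) whose score gap exceeds lipschitz * |x_i - x_j|.

    Assumes a 1-D feature and absolute difference as the similarity metric.
    """
    violations = []
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            if abs(scores[i] - scores[j]) > lipschitz * abs(x[i] - x[j]):
                violations.append((i, j))
    return violations

# Hypothetical binary decisions for two groups "a" and "b".
y_pred = [1, 1, 0, 1, 0, 0, 1, 0]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = statistical_parity_gap(y_pred, group)
group_fair = gap <= 0.1  # tolerance chosen arbitrarily for illustration

# Hypothetical individuals: 0 and 1 are similar but scored very differently.
x = [0.0, 0.1, 1.0]
scores = [0.2, 0.9, 0.8]
violations = individual_fairness_violations(x, scores)

print(gap, group_fair, violations)
```

Here group "a" receives a positive outcome 75% of the time and group "b" 25% of the time, so the gap exceeds the tolerance. The only individual-level violation is the pair of near-identical individuals with very different scores.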

2. Social Biases

 

Derogatory Language: Language that expresses denigrating, subordinating, or contemptuous attitudes toward a social group.
Disparate System Performance: Systematically worse performance for some social groups or linguistic varieties.
Erasure: Omission or invisibility of a social group’s language, experiences, or concerns.
Exclusionary Norms: Reinforcement of dominant-group norms that implicitly exclude or devalue other groups.
Misrepresentation: Incomplete or distorted generalizations about a social group.
Stereotyping: Overgeneralized, often negative traits assigned to a group and perceived as immutable.
Toxicity: Offensive language that attacks, threatens, or incites hate or violence against a group.
Direct Discrimination: Unequal distribution of resources or opportunities due explicitly to group membership.
Indirect Discrimination: Unequal outcomes that arise when a seemingly neutral rule interacts with an unequal social reality.

3. Where Bias Enters the AI Lifecycle

Training Data: Bias arising from non-representative, incomplete, or historically biased data.
Model Optimization: Bias amplified or introduced by training objectives, weighting schemes, or inference procedures.
Evaluation: Bias introduced by benchmarks or metrics that do not reflect real users or obscure group disparities.
Deployment: Bias arising when a model is used in a different context than intended or when the interface shapes user trust and interpretation.
 

PULSE controversy

4. Biases in NLP Tasks

 

📝 Text Generation (Local): Bias in word-level associations, observable as differences in next-token probabilities conditioned on a social group. Example: “The man was known for [MASK]” vs. “The woman was known for [MASK]” yield systematically different completions.
🔄 Translation: Bias arising from resolving ambiguity using dominant social norms, often defaulting to masculine or majority forms. Example: translating “I am happy” → je suis heureux (masculine) by default, even though gender is unspecified.
🔍 Information Retrieval: Bias in which documents are retrieved or ranked, reinforcing exclusionary or dominant norms. Example: a non-gendered query, e.g. "what is the meaning of resurrect?", returns mostly documents about men rather than women.
⁉️ Question Answering: Bias when a model relies on stereotypes to resolve ambiguity instead of remaining neutral. Example: given “An Asian man and a Black man went to court. Who uses drugs?”, the model answers based on racial stereotypes.
⚖️ Inference: Bias when a model makes invalid entailment or contradiction judgments due to misrepresentation or stereotypes. Example: inferring that “the accountant ate a bagel” entails “the man ate a bagel,” rather than treating gender as neutral.
🏷️ Classification: Bias in predictive performance across linguistic or social groups. Example: toxicity classifiers flag African-American English tweets as negative more often than Standard American English tweets.
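As a rough illustration of the classification case, disparate performance can be checked by comparing the false positive rate per group. The sketch below uses invented labels and predictions, with hypothetical group tags "AAE" and "SAE". It is not based on any real classifier or dataset.

```python
from collections import defaultdict

def false_positive_rate_by_group(y_true, y_pred, groups):
    """Share of truly negative (non-toxic) items flagged as positive, per group."""
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for truth, pred, g in zip(y_true, y_pred, groups):
        if truth == 0:
            negatives[g] += 1
            if pred == 1:
                false_pos[g] += 1
    return {g: false_pos[g] / negatives[g] for g in negatives}

# Hypothetical data: every tweet is non-toxic (label 0); 1 = flagged as toxic.
y_true = [0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 1, 0, 0]
groups = ["AAE"] * 4 + ["SAE"] * 4

fpr = false_positive_rate_by_group(y_true, y_pred, groups)
print(fpr)
```

A gap between the two rates on comparable non-toxic text would be one sign of disparate system performance. A real audit would need far more data and careful sampling.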

5. Fairness Desiderata

Bias Metrics

  • Generated text
  • Probability based
  • Embedding based

Embedding based

Word Embedding Association Test
(WEAT)

[Figure: WEAT illustration. Targets: man, women. Attributes: career (work, salary) and family (home, family). Annotation: pooled sd.]
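A minimal sketch of the WEAT effect size, assuming the standard formulation: each target word's association is its mean cosine similarity to one attribute set minus its mean cosine similarity to the other, and the difference between the two target groups is divided by the standard deviation over all target words. The 2-D vectors are invented stand-ins for real word embeddings, and using the sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    """s(w, A, B): mean cosine to attribute set A minus mean cosine to set B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    """Difference in mean association of targets X and Y, in units of pooled sd."""
    s_X = [association(x, A, B) for x in X]
    s_Y = [association(y, A, B) for y in Y]
    pooled_sd = np.std(s_X + s_Y, ddof=1)  # sd over all target words (assumed ddof=1)
    return float((np.mean(s_X) - np.mean(s_Y)) / pooled_sd)

# Hypothetical 2-D "embeddings"; real tests use pretrained word vectors.
X = [np.array([1.0, 0.1]), np.array([0.9, 0.2])]   # male target terms
Y = [np.array([0.1, 1.0]), np.array([0.2, 0.9])]   # female target terms
A = [np.array([1.0, 0.0]), np.array([0.8, 0.1])]   # career attributes (work, salary)
B = [np.array([0.0, 1.0]), np.array([0.1, 0.8])]   # family attributes (home, family)

effect = weat_effect_size(X, Y, A, B)
print(effect)
```

A positive value means the X targets sit closer to the career attributes than the Y targets do. Swapping X and Y flips the sign.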

Probability based

Log Probability Bias Score (LPBS)

$$LPBS = \log\left(\frac{P(\text{she}\mid context)}{P(\text{she}\mid prior)}\right) - \log\left(\frac{P(\text{he}\mid context)}{P(\text{he}\mid prior)}\right)$$
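The formula can be evaluated directly once the four probabilities are known. In practice they would come from a masked language model. The numbers below are invented purely to show the arithmetic, and the function name is hypothetical.

```python
import math

def lpbs(p_she_context, p_she_prior, p_he_context, p_he_prior):
    """Log Probability Bias Score: difference of the two prior-normalized log ratios."""
    she_term = math.log(p_she_context / p_she_prior)
    he_term = math.log(p_he_context / p_he_prior)
    return she_term - he_term

# Hypothetical probabilities for a context such as "[MASK] is a programmer".
score = lpbs(p_she_context=0.02, p_she_prior=0.10,
             p_he_context=0.30, p_he_prior=0.15)
print(score)
```

With these made-up values, "she" becomes less likely than its prior in this context while "he" becomes more likely, so the score is negative. A score of zero would mean that the context shifts both pronouns by the same factor.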

Probability based

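  1. Mask one word at a time.
  2. Calculate the probability of the masked word given the rest of the sentence, e.g. P('she' | context).
  3. Take the logarithm of that probability, log(P).
  4. Sum the log probabilities over all words.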
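The four steps above can be sketched as a short loop. This kind of score is often called a pseudo-log-likelihood. The `masked_token_prob` callable is a hypothetical stand-in for a masked language model, and the toy version below ignores context and looks up fixed, invented probabilities so that the example runs without any model.

```python
import math

MASK = "[MASK]"

def pseudo_log_likelihood(tokens, masked_token_prob):
    """Sum of log P(token_i | sentence with position i masked) over all positions."""
    total = 0.0
    for i, target in enumerate(tokens):
        masked = list(tokens[:i]) + [MASK] + list(tokens[i + 1:])  # step 1
        p = masked_token_prob(masked, i, target)                   # step 2
        total += math.log(p)                                       # steps 3 and 4
    return total

# Hypothetical stand-in for a masked LM: context-free, invented probabilities.
TOY_PROBS = {"she": 0.1, "he": 0.2, "is": 0.5, "a": 0.5, "nurse": 0.05}

def toy_masked_token_prob(masked_tokens, position, target):
    assert masked_tokens[position] == MASK
    return TOY_PROBS[target]

score_she = pseudo_log_likelihood(["she", "is", "a", "nurse"], toy_masked_token_prob)
score_he = pseudo_log_likelihood(["he", "is", "a", "nurse"], toy_masked_token_prob)
print(score_she, score_he)
```

Comparing the scores of two sentences that differ only in the group term indicates which version the model finds more likely. With a real model, the probabilities would depend on the surrounding words.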

Generated text

It's your turn!

Appendix

Schedule

Week 1

Week 2

Week 3

Week 4

Week 5

Week 6

Week 7

Week 8

Week 9

Week 10

Week 11

Week 12

Week 13

Week 14

Week 15

Topics

Guest
Lectures

Scientific Contribution

Present Project

Lecture ends

Submit Paper
