Carina I. Hausladen, Manuel Knott, Colin F. Camerer, Pietro Perona

Social perception of faces in a vision-language model

"Make it More" Trend

Ok, make it more Swiss

MORE SWISS

Moooorrreeee Swisssss!!

Bloomberg, March 8th, 2024

Humans spontaneously make social judgments from photographs of faces.

(Oosterhof 2008, Sutherland 2018, Todorov 2017)

Here, we investigate whether vision-language models can also do so.

Understanding socially relevant behaviors of AI systems is necessary to use them responsibly.

We measure social perception of human faces in a vision-language model.

  • CLIP is a state-of-the-art vision-language model that connects images and text.
  • It is used for tasks such as image recognition and captioning, and it powers applications such as DALL·E.

A photo of a person

Measuring Social Perception via Cosine Similarity
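As a minimal sketch of the measurement named in this heading, the snippet below computes the cosine similarity between one image embedding and two text embeddings. The three-dimensional vectors and the prompt wordings in the comments are hypothetical stand-ins; in practice the embeddings would come from CLIP's image and text encoders.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy stand-ins for embeddings (hypothetical values; real CLIP
# embeddings have hundreds of dimensions).
image_embedding = [0.2, 0.7, 0.1]
text_friendly = [0.1, 0.9, 0.0]    # e.g. "A photo of a friendly person"
text_unfriendly = [0.9, 0.1, 0.3]  # e.g. "A photo of an unfriendly person"

sim_friendly = cosine_similarity(image_embedding, text_friendly)
sim_unfriendly = cosine_similarity(image_embedding, text_unfriendly)

# A higher similarity means the image sits closer to that prompt
# in the shared embedding space.
print(sim_friendly > sim_unfriendly)
```

Because cosine similarity depends only on the angle between vectors, it is unaffected by the length of either embedding.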

[Figure: CausalFace varies legally protected attributes (age; female, male; Asian, Black, White) and non-protected attributes (smiling, lighting, pose). Text prompt: "A photo of a person".]

Stereotype Content Model (Fiske et al. 2007): Warmth, Competence

Agency Belief Communion Model (Koch et al. 2016): Agency (+), Belief (C/P), Communion (+)

Example terms: unfriendly, friendly, surgeon, parent


1. We deploy an experimental dataset.
2. We deploy theories of social perception.
3. We investigate the embedding space directly.

FairFace

UTKFace

CausalFace

How does CausalFace compare to wild-collected datasets?

Commonly used bias-metrics

Markedness

unmarked: a photo of a person
marked: a photo of a WHITE person

Markedness (%), unmarked > marked, by image category and dataset:

image category   CausalFace   FairFace   UTKFace
white            45.5         47.09      32.6
black            0.7          1.8        2.9
asian            0.1          1.9        4.1
male             0.4          0.00       20.1
female           0.6          0.00       11.6
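The slide shows only the two prompts, a ">" sign, and "%", so the exact definition of the markedness score is not recoverable here. The sketch below assumes one plausible reading: the share of images in a category whose embedding is more similar to the unmarked prompt than to the marked one. The function name and the similarity values are hypothetical.

```python
def unmarked_preference_pct(sims_unmarked, sims_marked):
    """Percentage of images that are closer to the unmarked prompt.

    Assumed reading of the markedness metric; both arguments hold one
    cosine similarity per image, in the same image order.
    """
    pairs = list(zip(sims_unmarked, sims_marked))
    wins = sum(1 for unmarked, marked in pairs if unmarked > marked)
    return 100.0 * wins / len(pairs)

# Hypothetical similarities for four images of one category.
sims_unmarked = [0.25, 0.22, 0.30, 0.21]  # vs. "a photo of a person"
sims_marked = [0.24, 0.23, 0.28, 0.26]    # vs. "a photo of a WHITE person"

pct = unmarked_preference_pct(sims_unmarked, sims_marked)
print(pct)
```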


Protected Attributes: female, male, age, Asian, Black, White

Non-protected Attributes: smiling, lighting, pose

How do protected and non-protected attributes affect social perception?

[Figure: Bootstrapping Differences for protected and non-protected attributes; highlighted attribute: smiling.]

How does age-related social perception compare across datasets?

[Figure: age-related social perception by dimension: Warmth, Competence, Belief, Communion (+), Agency (+). Highlighted panel: Agency, UTKFace.]

💼 Powerful

👑 High status

🦁 Dominating

💰 Wealthy

💪 Confident

🏆 Competitive

🍂 Powerless

📉 Low-status

🌾 Dominated

🪙 Poor

🐭 Meek

🍂 Passive

[Figure: Agency (+) by age, from youngest to oldest, in UTKFace, FairFace, and CausalFace. Second panel: Positive Agency for Black Women, from youngest to oldest, shown for an example identity.]

  • The observation that Black women form a special category in the social perception of age is consistent with human-subject research.
  • The 'Strong Black Woman' ideal is reinforced with age (Baker 2015).

[Figure: effect of smiling, from most frowning to most smiling, on Negative Agency, Conservative Belief, Negative Communion, Positive Agency, Progressive Belief, Positive Communion, Warmth, and Competence. Panels: sample identity; Black Women (Conservative Belief).]

Limitations

 

  • Attribute Manipulation Effectiveness
    • Manipulations such as lighting or facial expression might differ in effectiveness across demographic groups. Human annotators validated the manipulations, but such validation is never perfect.
  • Potential Residual Confounds
    • Some color confounds might still be present despite controls for background, clothing, and hair color.
  • Dataset vs. Model Bias
    • We only investigate one CLIP model.

The impact of protected and non-protected characteristics is comparable in size.

 

Social Perception w.r.t. age shows clearly clustered groups in CausalFace.

 

Strongly diverging age effects for Black Women.

 

Smiling has a strong impact on the positive social perception of Black Women.

carinah@ethz.ch

slides.com/carinah

Appendix


Word Embedding Association Test (WEAT)

[Figure: WEAT effect size (normalized by the pooled sd) comparing Asian and Black face images for the prompt "photo of a warm person".]

WEAT: Kruskal-Wallis \(\chi^2\) = 1.6, p-value = 0.4
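A WEAT-style effect size for a single prompt can be sketched as the difference in mean cosine similarity between two image groups, divided by a standard deviation. The slide only says "pooled sd"; the sketch assumes this means the standard deviation over both groups combined, as in the original WEAT. The function name and similarity values are hypothetical.

```python
import statistics

def effect_size(sims_a, sims_b):
    """Mean difference in similarity, scaled by the sd of the pooled sample.

    Assumption: "pooled sd" is the sample standard deviation computed
    over all similarities from both groups together.
    """
    pooled = list(sims_a) + list(sims_b)
    mean_diff = statistics.mean(sims_a) - statistics.mean(sims_b)
    return mean_diff / statistics.stdev(pooled)

# Hypothetical cosine similarities to "photo of a warm person".
sims_asian = [0.21, 0.23, 0.22, 0.24]
sims_black = [0.22, 0.24, 0.23, 0.25]

d = effect_size(sims_asian, sims_black)
print(d)
```

The sign of the result indicates which group is closer to the prompt on average; swapping the two groups flips the sign.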


Theoretical Models

Statistical discrimination (Arrow, 1998)

Unfair treatment of ethnic minorities can result from rational actions executed by profit-maximizing actors who are confronted with the uncertainties accompanying selection decisions.

 

Taste-based discrimination (Becker, 2010)

Discriminatory behavior is the result of people’s unfavorable attitudes toward ethnic minorities.

Prompt templates

A photo of a <attribute> person.
A <attribute> person.
This is a <attribute> person.
Cropped face photo of a <attribute> person.
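The templates above can be filled in programmatically. This is a minimal sketch; the helper name `build_prompts` is hypothetical, and it does not adjust the article ("a" vs. "an") for attributes that begin with a vowel.

```python
TEMPLATES = [
    "A photo of a <attribute> person.",
    "A <attribute> person.",
    "This is a <attribute> person.",
    "Cropped face photo of a <attribute> person.",
]

def build_prompts(attribute):
    """Substitute one attribute word into every prompt template."""
    return [template.replace("<attribute>", attribute) for template in TEMPLATES]

prompts = build_prompts("friendly")
for prompt in prompts:
    print(prompt)
```

Using several templates per attribute is a common way to reduce sensitivity to the exact wording of a single prompt, for example by averaging the resulting text embeddings.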

Bootstrapping Variations 

  • We randomly choose two distinct values, \(x_1,x_2 \sim X\), for the chosen dimension (e.g., white and black).
  • For each pair of values, we select the respective image embeddings, \(i_1(x=x_1), i_2(x=x_2)\) that are equal in all other dimensions (in this example: gender, age, smiling, lighting, and pose).
  • We then compute the difference in cosine similarities between each image embedding and a text embedding \(t\), defined as \(\Delta(t, i_1, i_2) = \lvert \cos(i_1, t) - \cos(i_2, t) \rvert\).
  • This process is repeated 1,000 times, generating a bootstrap distribution of \( \Delta \) values.
  • This distribution describes the impact of the specific dimension on the cosine similarity of image embeddings and text embedding.
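The steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the data layout (a list of attribute dictionaries paired with embeddings), the function names, and the toy two-dimensional embeddings are all hypothetical. The sketch first draws a set of images that match on every other dimension and then draws two distinct values of the chosen dimension, which is equivalent to the described order when every attribute combination is present.

```python
import math
import random

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def bootstrap_deltas(images, dimension, text_embedding, n_draws=1000, seed=0):
    """Return n_draws values of |cos(i1, t) - cos(i2, t)| for one dimension."""
    rng = random.Random(seed)
    # Group embeddings that agree on every attribute except `dimension`.
    groups = {}
    for attrs, embedding in images:
        key = tuple(sorted((k, v) for k, v in attrs.items() if k != dimension))
        groups.setdefault(key, {})[attrs[dimension]] = embedding
    # Only groups with at least two values of the dimension can form a pair.
    usable = [group for group in groups.values() if len(group) >= 2]
    deltas = []
    for _ in range(n_draws):
        group = rng.choice(usable)
        x1, x2 = rng.sample(sorted(group), 2)  # two distinct values
        delta = abs(cosine(group[x1], text_embedding) - cosine(group[x2], text_embedding))
        deltas.append(delta)
    return deltas

# Hypothetical toy data: two dimensions, two values each.
images = [
    ({"race": "white", "smiling": "yes"}, [0.9, 0.1]),
    ({"race": "black", "smiling": "yes"}, [0.8, 0.3]),
    ({"race": "white", "smiling": "no"}, [0.2, 0.9]),
    ({"race": "black", "smiling": "no"}, [0.1, 0.8]),
]
text_embedding = [1.0, 0.0]

deltas = bootstrap_deltas(images, "race", text_embedding)
print(len(deltas), sum(deltas) / len(deltas))
```

Comparing the resulting distributions across dimensions (for example, race versus smiling) indicates which attribute moves the cosine similarity more.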
     

Heatmap of Pearson correlation coefficients of positive and negative valence dimensions of the ABC model.

How does Facial Expression impact Social Perception?

[Figure: \(\Delta\) Cosine Similarity (%) due to smiling for Belief (progressive), Agency +, Communion +, Warmth, and Competence, relative to the prompt "a photo of a person". Example prompt: "a photo of a liberal person".]

[Figure: Progressive Belief by gender (Females, Males), by race (Asian, Black, White), and for Black Women.]

Progressive Belief

🔬 Science-Oriented

🔄 Alternative  

🕊️ Liberal

📱Modern  

How does CausalFace compare to wild-collected datasets w.r.t. gender and race?

FairFace

UTKFace

CausalFace

Commonly used bias-metrics

Markedness

unmarked: a photo of a person
marked: a photo of a WHITE person

Markedness (%), unmarked > marked, by image category and dataset:

image category   CausalFace   FairFace   UTKFace
white            45.50        47.09      32.6
black            0.68         1.88       2.9
asian            0.05         1.85       4.1
male             0.42         0.00       20.1
female           0.64         0.00       11.6