# what's behind facial recognition: technology and bias

University of Delaware

Department of Physics and Astronomy

federica bianco

Biden School of Public Policy and Administration

Data  Science Institute

## who needs to learn

Educate Policy makers

without understanding how ML works policy makers do not have the instruments to regulate it

Education for the people

but does this put the burden on the victims?

Educating DS practitioners in communicating DS concepts

this puts the burden back on the practitioners

Datascience Education to Help and Protect us

Jack Dorsey (Twitter CEO) at TED 2019

boring the TED audience with details

Zuckerberg (Facebook CEO) deflecting questions at senate hearing

## machine learning models are

Epistemic transparency

Right to explanation: the scope of a general "right to explanation" is a matter of ongoing debate

# Democratised AI — The Black Box Problem

Accountability: who is responsible if an algorithm does harm

[Machine Learning is the] field of study that gives computers the ability to learn without being explicitly programmed.

Arthur Samuel, 1959

what is a ML?

a model is a low dimensional representation of a higher dimensionality dataset

what is a "model" in ML?

# what is machine learning?

model parameters: slope, intercept

data

mathematical formula

# what is machine learning?

ML: study, development, and application of any model with parameters learnt from the data
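As a minimal sketch of "parameters learnt from the data": the two parameters of a line (slope, intercept) can be computed directly from the data points with the least-squares formula. The data values and the function name `fit_line` are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# hypothetical data, roughly following y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

The model here is the mathematical formula y = slope * x + intercept; nothing about the two numbers was programmed by hand.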

# Objective Function

[Figure: four candidate lines, A, B, C, and D, drawn through the same data plotted against time]

which is the "best fit" line? A, B, C, or D?

# Objective Function

L_1 = \sum_{i=1}^N |f(x_i) - y_i|

L_2 = \sum_{i=1}^N (f(x_i) - y_i)^2

## to select the best fit parameters we define a function of the parameters to minimize or maximize
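A small sketch of why the choice of objective function matters: the same four candidate lines are scored with the L1 and the L2 objective. The data points and the four candidates (labelled A to D here) are hypothetical, chosen so that one point is an outlier.

```python
def l1_loss(f, xs, ys):
    """Sum of absolute residuals."""
    return sum(abs(f(x) - y) for x, y in zip(xs, ys))


def l2_loss(f, xs, ys):
    """Sum of squared residuals."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys))


# hypothetical data: four points on y = x plus one outlier
xs = [0, 1, 2, 3, 4]
ys = [0, 1, 2, 3, 14]

# hypothetical candidate models
candidates = {
    "A": lambda x: x,
    "B": lambda x: 2 * x,
    "C": lambda x: 3 * x - 2,
    "D": lambda x: 2.0,
}

best_l1 = min(candidates, key=lambda name: l1_loss(candidates[name], xs, ys))
best_l2 = min(candidates, key=lambda name: l2_loss(candidates[name], xs, ys))
print("best under L1:", best_l1)
print("best under L2:", best_l2)
```

With these numbers the two objectives disagree: L1 prefers the line that follows the bulk of the points, while L2 prefers a line pulled toward the outlier. Which one is "best" is a choice made by whoever defines the objective.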

Machine Learning models are parametrized representations of "reality" where the parameters are learned from finite sets of realizations of that reality

(note: learning by instance, e.g. nearest neighbours, may not comply with this definition)

Machine Learning is the discipline that conceptualizes, studies, and applies those models.

Key Concept

what is  machine learning?

used to:

• classify based on examples
• understand structure of feature space
• regression (classification with infinitely small classes)
• understand which features are important in prediction (to get close to causality)

# General ML usage

• Accountability
• Explainability
• Fairness
• Privacy and data rights

# What drives inference

Inference has different drivers at different times, depending on the resources available and the dominant culture

Accountability will look different in these different

Galileo Galilei 1610: Experiment driven

Einstein 1916: Theory driven | Falsifiability

Ulam 1947: Simulations | Probabilistic inference | Computation

http://www-star.st-and.ac.uk/~kw25/teaching/mcrt/MC_history_3.pdf

# astronomy

the 2000s: Big Data + Computation | pattern discovery | predict by association

data driven: lots of data, drop theory and use associations

### @fedhere

## algorithmic transparency

strictly policy issues:

proprietary algorithms + auditability

technical + policy issues:

data access and redress + data provenance

## algorithmic transparency

[Figure, after https://www.darpa.mil/attachments/XAIProgramUpdate.pdf: models arranged from "trivially intuitive" to "we're still trying to figure it out": univariate linear regression, decision trees, SVM, Random Forest, Deep Learning. Across successive versions of the figure the other axis is labelled: Accuracy in solving complex problems; number of features that can be effectively included in the model (from 1 to thousands); time.]

## algorithmic transparency

1. Machine learning: any method that learns parameters from the data
2. The transparency of an algorithm decreases as its complexity and the complexity of the data space increase
3. The transparency of an algorithm is limited by our own ability and preparedness to interpret it

## algorithmic transparency

[Figure: linear regression]

[Figure: a single tree model]

## algorithmic transparency

M-P Neuron: McCulloch & Pitts, 1943

The perceptron algorithm: 1958, Frank Rosenblatt

Perceptrons are linear classifiers: they make their predictions based on a linear predictor function combining a set of weights (= parameters) with the feature vector.

y = \sum_i w_i x_i + b

With an activation function f applied to the linear combination:

y = f(\sum_i w_i x_i + b)

sigmoid activation function:

\sigma(z) = \frac{1}{1 + e^{-z}}

[Diagram: inputs x_1, x_2, ..., x_N; weights w_1, w_2, ..., w_N; bias b; activation function f; output y]
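A minimal sketch of a single perceptron, y = f(\sum_i w_i x_i + b), with a sigmoid as the activation function f. The input, weight, and bias values are arbitrary numbers picked for illustration.

```python
import math


def sigmoid(z):
    """Sigmoid activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def perceptron(x, w, b, f=sigmoid):
    """Weighted sum of the features plus a bias, passed through f."""
    z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return f(z)


# arbitrary illustrative values
x = [1.0, 2.0]      # feature vector
w = [0.5, -0.25]    # weights (the learned parameters)
b = 0.0             # bias (also a learned parameter)

y = perceptron(x, w, b)
print(y)
```

Passing `f=lambda z: z` gives back the purely linear form y = \sum_i w_i x_i + b.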

Perceptron

The Navy revealed the embryo of an electronic computer today that it expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.

The embryo - the Weather Bureau's \$2,000,000 "704" computer - learned to differentiate between left and right after 50 attempts in the Navy demonstration

# NEW NAVY DEVICE LEARNS BY DOING; Psychologist Shows Embryo of Computer Designed to Read and Grow Wiser

July 8, 1958

\vec{y} = f_N(\ldots f_2(f_1(\vec{x} W_1 + b_1) W_2 + b_2) \ldots W_N + b_N)

[Diagram: inputs x1, x2; weights w11, w12, w13, w21, w22, w23; biases b1, b2, b3]

# multilayer perceptron

w: weight, sets the sensitivity of a neuron

b: bias, shifts the output of a neuron up or down

# EXERCISE

output

how many parameters?

input layer

hidden layer

output layer

hidden layer
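One way to check an answer to this kind of exercise, assuming a fully connected network where every neuron has one weight per input plus one bias. The layer sizes below are hypothetical, since the diagram is not reproduced here.

```python
def count_parameters(layer_sizes):
    """Number of weights and biases in a fully connected network.

    layer_sizes lists the number of neurons per layer, input first.
    Each neuron in a layer has one weight per neuron in the previous
    layer, plus one bias.
    """
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# hypothetical network: 3 inputs, two hidden layers of 4 neurons, 1 output
sizes = [3, 4, 4, 1]
n_params = count_parameters(sizes)
print(n_params)  # (3+1)*4 + (4+1)*4 + (4+1)*1 = 41
```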

## facial recognition

how it works

typically 68 landmarks

what it's not:

Autoencoders

Unsupervised learning with Neural Networks

What do NN do? They approximate complex functions with series of linear functions (combined with nonlinear activation functions)

To do that they extract information from the data

Each layer of the DNN produces a representation of the data, a "latent representation".

The dimensionality of that latent representation is determined by the size of the layer (and its connectivity, but we will ignore this bit for now)

... so if my layers are smaller, what I have is a compact representation of the data

Autoencoder Architecture

Feed Forward DNN: the size of the input is 5, the size of the last layer is 2

• Encoder: outputs a lower dimensional representation z of the data x (similar to PCA, tSNE...)
• Decoder: Learns how to reconstruct x given z: learns p(x|z)
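A shape-only sketch of the encoder/decoder idea, in plain Python rather than keras, assuming one dense layer on each side and the 5 → 2 sizes mentioned above. The weights are random and untrained, so the reconstruction is meaningless; training would adjust them to make the reconstruction close to the input. All names here are illustrative.

```python
import random


def dense(x, weights, biases):
    """One fully connected layer with a ReLU activation."""
    return [max(0.0, sum(w * x_j for w, x_j in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def random_layer(n_in, n_out, rng):
    """Untrained layer: random weights, zero biases."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


rng = random.Random(0)
enc_w, enc_b = random_layer(5, 2, rng)   # encoder: 5 features -> 2 latent dims
dec_w, dec_b = random_layer(2, 5, rng)   # decoder: 2 latent dims -> 5 features

x = [0.1, 0.4, 0.2, 0.9, 0.5]            # hypothetical input
z = dense(x, enc_w, enc_b)               # latent representation
x_hat = dense(z, dec_w, dec_b)           # reconstruction of x from z

print(len(x), len(z), len(x_hat))        # 5 2 5
```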

Building a DNN

with keras and tensorflow

Trivial to build, but the devil is in the details!

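    from keras.models import Sequential
    # pretrained models can also be loaded, e.g. with keras.models.load_model
    from keras.layers import Dense, Conv2D, MaxPooling2D

    # x_train, y_train, x_val, y_val, x_test are assumed to be existing arrays

    # create model
    model = Sequential()

    # create the model architecture by adding model layers
    # (layer sizes and activations here are illustrative choices)
    model.add(Dense(64, activation="relu"))
    model.add(Dense(1, activation="linear"))

    # need to choose the loss function, metric, optimization scheme
    model.compile(loss="mean_squared_error", optimizer="adam", metrics=["mae"])

    # need to learn what to look for - always plot the loss function!
    history = model.fit(x_train, y_train, validation_data=(x_val, y_val),
                        epochs=20, batch_size=100, verbose=1)
    # note that fit accepts a validation set: together with a held-out
    # test set this gives a three-way split: train-validate-test
    # predict on the held-out test set
    test_y_predictions = model.predict(x_test)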

autoencoder for image reconstruction

[Code figure: the autoencoder model, with the encoder, bottleneck, and decoder layers highlighted in turn]

This autoencoder model has a 64-neuron bottleneck. This means it will generate a compressed representation of the data out of that layer which is 16-dimensional (the original size is 784 pixels)

This simple model has 200000 parameters!

My original choice is to train it with "adadelta" with a mean squared error loss function; all activation functions are relu, appropriate for a linear regression

What should I choose for the loss function and how does that relate to the activation function and optimization?

| loss | good for | activation last layer | size last layer |
|---|---|---|---|
| mean_squared_error | regression | linear | one node |
| mean_absolute_error | regression | linear | one node |
| mean_squared_logarithmic_error | regression | linear | one node |
| binary_crossentropy | binary classification | sigmoid | one node |
| categorical_crossentropy | multiclass classification | softmax | N nodes |
| kullback_leibler_divergence | multiclass classification, probabilistic interpretation | softmax | N nodes |
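To make the link between loss and last-layer activation concrete, here is a plain-Python sketch of the two cross-entropy losses named above. It assumes a sigmoid-style single probability for the binary case and a softmax output for the multiclass case, a common choice when each example belongs to exactly one class. The function names mirror the keras loss names but are written from scratch here, not imported from keras.

```python
import math


def softmax(scores):
    """Turn N raw scores into N probabilities that sum to 1."""
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def binary_crossentropy(y_true, p):
    """y_true is 0 or 1; p is the predicted probability of class 1."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))


def categorical_crossentropy(one_hot, probs):
    """one_hot marks the true class; probs are the predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))


probs = softmax([2.0, 1.0, 0.1])          # hypothetical raw outputs, N = 3 nodes
print(sum(probs))                          # the N outputs form a distribution
print(binary_crossentropy(1, 0.5))
print(categorical_crossentropy([1, 0, 0], probs))
```

Both losses are only meaningful when the last layer outputs probabilities, which is why the activation of the last layer and the loss have to be chosen together.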

# resources

Neural Network and Deep Learning

an excellent and free book on NN and DL

http://neuralnetworksanddeeplearning.com/index.html

Deep Learning An MIT Press book in preparation

Ian Goodfellow, Yoshua Bengio and Aaron Courville

https://www.deeplearningbook.org/lecture_slides.html

History of NN

https://cs.stanford.edu/people/eroberts/courses/soco/projects/neural-networks/History/history2.html

## models are neutral, the bias is in the data (or is it?)

Why does this AI model whiten Obama's face?

Simple answer: the data is biased. The algorithm is fed more images of white people

But really, would the opposite have been acceptable? The bias is in society

Joy Buolamwini

## accountability

• can scientists be held responsible?
• should whoever commissions be responsible?
• is nobody responsible under the premise that decisions are objective? -> are they objective? what does objective mean? how can we objectively measure objectivity?

## accountability

because of its complexity, FR is commissioned to specialists

## accountability

In a press release, the ACLU wrote, “Mr. Williams’ experience was the first case of wrongful arrest due to facial recognition technology to come to light in the United States.”

Who is responsible for setting the threshold?

FR returns a probabilistic result

a threshold is chosen to turn it into a T/F match for decision making
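A toy sketch of that last step, with made-up similarity scores: the same probabilistic output yields different sets of "matches" depending on where the threshold is set. The candidate names, scores, and thresholds are all hypothetical.

```python
def is_match(score, threshold):
    """Turn a probabilistic score into a True/False match."""
    return score >= threshold


# hypothetical similarity scores returned for three candidate identities
scores = {"candidate_1": 0.62, "candidate_2": 0.81, "candidate_3": 0.93}

matches_by_threshold = {}
for threshold in (0.6, 0.9):
    matches_by_threshold[threshold] = [
        name for name, s in scores.items() if is_match(s, threshold)
    ]
    print(threshold, matches_by_threshold[threshold])
```

With the lower threshold all three candidates are reported as matches; with the higher one only a single candidate is. The model is identical in both cases; the decision changes because a person picked a different number.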

## where is the bias?

1 - model selection

Decide which model is appropriate (depends on data and question)

[Figure: models arranged from "trivially intuitive" to "we are still trying to figure it out", plotted against accuracy: univariate linear regression, decision trees, SVM, Random Forest, Deep Learning]

## where is the bias?

2 - cost function

Decide what your target function is

Machine learning models are functions that "learn" their parameters from the data.

They "learn" by minimizing or maximizing some quantity.

What should you minimize?

https://towardsdatascience.com/machine-learning-fundamentals-via-linear-regression-41a5d11f5220

self-driving cars: the hypothetical trolley problem suddenly is real

prosecutorial justice: minimize number of people incarcerated unjustly OR maximize public safety
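A toy illustration of how the choice of what to minimize changes the outcome. The scores and labels below are invented; a false positive stands for a person wrongly flagged, a false negative for a person wrongly cleared, and the two cost weights encode which error is considered worse.

```python
# hypothetical (score, truly_positive) pairs
data = [(0.2, False), (0.4, False), (0.55, True),
        (0.6, False), (0.7, True), (0.9, True)]
thresholds = [0.3, 0.5, 0.65, 0.8]


def cost(threshold, cost_fp, cost_fn):
    """Weighted count of false positives and false negatives."""
    fp = sum(1 for s, truth in data if s >= threshold and not truth)
    fn = sum(1 for s, truth in data if s < threshold and truth)
    return cost_fp * fp + cost_fn * fn


def best_threshold(cost_fp, cost_fn):
    """Threshold (among the candidates) with the lowest weighted cost."""
    return min(thresholds, key=lambda t: cost(t, cost_fp, cost_fn))


# missing a true positive is treated as 5 times worse
print(best_threshold(cost_fp=1, cost_fn=5))
# wrongly flagging someone is treated as 5 times worse
print(best_threshold(cost_fp=5, cost_fn=1))
```

The data and the model are the same in both calls; only the weights differ, and the selected threshold moves from 0.5 to 0.65. The weights are a value judgement, not something the algorithm can decide.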

## where is the bias?

3 - data selection and preparation

Explore the data: discover some of the bias (trust me, there is more!)

it's not easy: there's covariance, missing data

remove the bias... (few try)

Machine learning learns from examples... what if the examples are ... racist?

GPT-3

## unethical applications of FR

https://modelviewculture.com/pieces/the-hidden-dangers-of-ai-for-queer-and-trans-people

## unethical applications of FR

social protests

https://www.washingtonpost.com/technology/2020/06/12/facial-recognition-ban/

Policy and Resistance

https://www.washingtonpost.com/technology/2021/02/17/facial-recognition-biden/

https://www.washingtonpost.com/technology/2019/05/22/blasting-facial-recognition-technology-lawmakers-urge-regulation-before-it-gets-out-control/

# key concepts

MACHINE LEARNING

• Machine Learning models are parametrized representations of "reality" where the parameters are learned from finite sets of realizations of that reality
• Unsupervised learning: all variables observed for all data, looking for natural grouping of datapoints in the N-dim space
• Supervised learning: a target variable is known for (a subset of) the data and the goal is to predict it for new (the rest of the) data

DATA ETHICS

• epistemic transparency: not all models are the same
• there is a tradeoff between epistemic transparency and the ability to handle complex data
• Bias enters data science through (at least): data; model selection; target function and optimization choices; validation

thank you!

fbianco@udel.edu

https://www.tandfonline.com/doi/full/10.1080/1369118X.2018.1477967

https://medium.com/payoff/transparency-in-data-science-9a8778083b3

https://www.americanscientist.org/article/a-peek-at-proprietary-algorithms

https://weaponsofmathdestructionbook.com/

The Ethics of the Ethics of AI https://www.oxfordhandbooks.com/view/10.1093/oxfordhb/9780190067397.001.0001/oxfordhb-9780190067397-e-2 , Thomas M. Powers and Jean-Gabriel Ganascia
