Countering Algorithm Threat: Using curated training data to counteract bias in machine learning

Machine learning algorithms are like children: we have to teach them right from wrong

First: Who am I?

@lizTheDeveloper

(yes, I still get asked if I am a developer)

You might remember me from such startups as...

And now...

Also, I wrote the fraud prevention system that protects your debit card

I'm a teacher 👩‍🏫

And a CTO 👩‍💻

And a mom 👩‍👦

An Aside:

https://slides.com/lizh/countering-algorithm-threat

Beginner-friendly resources are attached to many of these slides

My company is releasing a 🤖 to help you be a better technical mentor!

Wanna beta test?

bit.ly/enkibot

Technical Thesis:

When cross-validation is used on a dataset to train or validate a model, you must also train that model with a reinforcement learning phase to properly calibrate its bias, because models produced by cross-validation alone are simple to perturb offensively.

Use inverse reinforcement learning: show the model strong examples of good and strong examples of bad, so that it develops a deep bias.
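
The "strong examples of good, strong examples of bad" idea can be sketched in a few lines. This is a simplified stand-in, not a full inverse reinforcement learning algorithm: there is no environment or policy here, only a linear reward learned with perceptron-style updates so that every curated good example scores above every curated bad one. The feature names, the example values, and the helper names (score, learn_reward) are all hypothetical.

```python
def score(weights, features):
    """Linear reward: dot product of the weights and a feature vector."""
    return sum(w * f for w, f in zip(weights, features))


def learn_reward(good, bad, epochs=20, lr=0.1, margin=1.0):
    """Perceptron-style updates that push each good example above each bad one."""
    weights = [0.0] * len(good[0])
    for _ in range(epochs):
        for g in good:
            for b in bad:
                # Update only when the good example does not win by the margin.
                if score(weights, g) - score(weights, b) < margin:
                    weights = [w + lr * (gi - bi)
                               for w, gi, bi in zip(weights, g, b)]
    return weights


# Hypothetical features per item: [is_informative, is_abusive, engagement]
good_examples = [[1.0, 0.0, 0.2], [1.0, 0.0, 0.9]]
bad_examples = [[0.0, 1.0, 0.9], [0.0, 1.0, 0.3]]

weights = learn_reward(good_examples, bad_examples)
print(weights)
```

Because the curated examples include both high-engagement and low-engagement items on each side, the learned reward in this toy case leans on the informative and abusive features rather than on engagement alone.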

What's "machine learning"?

Data has Features

What's "learning"?

Apprenticeship Learning

https://dl.acm.org/citation.cfm?id=1015430

http://ai.stanford.edu/~ang/papers/icml00-irl.pdf

We have always curated training data

What do I mean by "Bias"?

I mean...

not...

What's cross-validation used for?

Validating a model that is used for classifying things:

Earthquakes

Plant Species

Genes

Spam

Hate Speech
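
For readers new to the term, here is a minimal sketch of k-fold cross-validation for a classifier, using only the Python standard library. The nearest-centroid classifier, the helper names, and the two-cluster toy data (labelled "spam" and "ham") are assumptions made for illustration, not anything from a real system.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_centroids(X, y):
    """Train a nearest-centroid classifier: one mean vector per label."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the features."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(features, c))
    return min(centroids, key=lambda label: dist(centroids[label]))


def cross_validate(X, y, k=5, seed=0):
    """Train on k-1 folds, test on the held-out fold, and repeat k times."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return scores


# Made-up data: two well-separated clusters.
rng = random.Random(42)
X = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(50)] + \
    [[rng.gauss(5, 0.5), rng.gauss(5, 0.5)] for _ in range(50)]
y = ["ham"] * 50 + ["spam"] * 50

scores = cross_validate(X, y, k=5)
print(scores)  # one held-out accuracy per fold
```

Note what the scores measure: how well the model agrees with held-out samples of the same data. They say nothing about how it behaves on inputs that someone has crafted on purpose.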

Predicting how you might engage with content:

Teach your kids right from wrong

Reward Mechanisms

#NotAllMetrics
(can be used as reward functions)

http://humanetech.com/app-ratings/

Content suggestion trains users as much as it is trained by them

The signals of the world do not average to the truth

https://medium.com/@2702rakesh/how-adversarial-attacks-work-in-machine-learning-cb9901141e20

https://arxiv.org/pdf/1710.08864.pdf

https://blog.openai.com/adversarial-example-research/

Adversarial Data
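
A toy illustration of why small, deliberate changes matter: for a linear classifier, nudging every feature a little in the direction that lowers the correct class's score can flip the prediction. The weights, bias, input, and epsilon below are invented for the example, and the sign-based step is one simple attack among many, not the method of any of the linked articles.

```python
def predict(weights, bias, x):
    """Linear classifier: return 1 if the score is positive, else 0."""
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def perturb(weights, x, label, epsilon):
    """Move each feature by at most epsilon against the true label."""
    direction = -1 if label == 1 else 1
    return [xi + direction * epsilon * sign(w)
            for w, xi in zip(weights, x)]


# Hypothetical trained weights and an input the model gets right.
weights = [0.8, -0.5, 0.3]
bias = -0.1
x = [0.4, 0.2, 0.3]

x_adv = perturb(weights, x, label=1, epsilon=0.2)

print(predict(weights, bias, x))      # class for the original input
print(predict(weights, bias, x_adv))  # class after the small perturbation
```

No feature moved by more than 0.2, yet the predicted class changes. A model that was only ever checked against held-out samples of its own data has never been asked to resist this kind of input.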

Consequences Abound

Offensive Perturbation is what they thought rock music was up to

Raise your kids, teach them right from wrong

or they will only tell us what they think most of us want to hear.

Countering Algorithm Threat: Using curated training data to remove bias in machine learning

By LizTheDeveloper

Lesbians who Tech
