⚠️ WARNING ⚠️

Listening to this lightning talk may result in one or more of the following:

psychological trauma
existential crisis
spectral agony
neuroblasphemy
recalibration of life goals

Infohazards

Risk that arises from the dissemination of true information that may cause harm or enable some agent to cause harm.

Knowing about a murder makes you an accessory to that muuurdaaaah.

Spoiler Alert: Snape kills Dumbledore.

A thought experiment in which a future superintelligent AI would be motivated to torture anyone who didn't help, directly or indirectly, bring it into existence as early as possible.*

* While considering itself to be not evil 😬

Roko's Basilisk

user: Roko, July 2010

Decision Theory

Prisoner's Dilemma

🧁 (🧁)	🎂 (❌)
❌ (🎂)	🍰 (🍰)
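A minimal sketch of the payoff grid above, with numbers swapped in for the cake. The values 5, 3, 1, 0 are assumptions (the usual textbook ordering), not something from the slide, and `best_response` and `dominant_move` are hypothetical helper names. It shows that defecting is the best reply to either move, even though both players cooperating beats both players defecting.

```python
# Hypothetical payoffs as (mine, theirs); higher is better.
# Assumed ordering: temptation 5 > reward 3 > punishment 1 > sucker 0.
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}
MOVES = ("cooperate", "defect")


def best_response(their_move):
    """Return my highest-paying move if their move is held fixed."""
    return max(MOVES, key=lambda mine: PAYOFFS[(mine, their_move)][0])


def dominant_move():
    """Return the move that is the best response to everything, or None."""
    responses = {best_response(theirs) for theirs in MOVES}
    return responses.pop() if len(responses) == 1 else None


if __name__ == "__main__":
    print("Dominant move:", dominant_move())
    print("Both cooperate:", PAYOFFS[("cooperate", "cooperate")])
    print("Both defect:", PAYOFFS[("defect", "defect")])
```

Under these assumed numbers each player is individually pushed toward defecting, and both end up worse off than if they had cooperated.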

⚔️

💖

⚔️

CDT fails when other agents can perceive how you will react and react accordingly. *

Causal Decision Theory, aka 'being rational':

"Do what's best for me with what info I have"

*Which is every real world decision?!? 😬😬😬
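A rough sketch of that failure mode, assuming a made-up opponent who predicts your move perfectly and copies it. The payoff numbers, `mirror_opponent`, and the policy functions are all hypothetical and only there for illustration. The CDT-style agent holds the opponent's move fixed, concludes that defecting is always better, and walks away with less than an agent that simply cooperates.

```python
# My payoff for (my move, their move); hypothetical numbers, higher is better.
MY_PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def cdt_policy():
    """Pick the move that is best against each fixed opponent move."""
    best = {max("CD", key=lambda m: MY_PAYOFF[(m, theirs)]) for theirs in "CD"}
    # With these payoffs the same move ("D") wins against both, so it dominates.
    return best.pop() if len(best) == 1 else "C"


def cooperate_policy():
    """An agent that cooperates regardless."""
    return "C"


def mirror_opponent(my_policy):
    """Assumed opponent: predicts my move perfectly and plays the same move."""
    return my_policy()


def play(my_policy):
    """Return my payoff when facing the mirroring opponent."""
    mine = my_policy()
    theirs = mirror_opponent(my_policy)
    return MY_PAYOFF[(mine, theirs)]


if __name__ == "__main__":
    print("CDT agent gets:", play(cdt_policy))           # 1
    print("Cooperator gets:", play(cooperate_policy))    # 3
```

The dominance argument is valid for any fixed opponent move; it goes wrong here only because the opponent's move is not fixed independently of what you are.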

You

How we are building AIs currently, oof

😬

🏛️

Timeless Decision Theory could solve this.

You fool! You thought that I thought that you thought that I thought that you thought that I thought that you thought....

Newcombian Problems

A situation where a 'rational' agent and a 'human-like' agent pick opposite strategies.

This is bad for humans when trying to make Human-Friendly AIs.
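For a concrete case, here is a sketch of the classic Newcomb's problem that these are named after. The setup is the standard textbook one and is not spelled out in the slides: a predictor fills an opaque box with $1,000,000 only if it predicted you would take that box alone, and a transparent box always holds $1,000. The 99% predictor accuracy and all function names are assumptions for illustration.

```python
BIG, SMALL = 1_000_000, 1_000  # classic Newcomb amounts (assumed here)


def payoff(one_box, box_full):
    """Money received for a given choice and opaque-box state."""
    opaque = BIG if box_full else 0
    return opaque if one_box else opaque + SMALL


def cdt_choice():
    """CDT-style reasoning: box contents are already fixed, so compare per state."""
    two_box_dominates = all(
        payoff(False, full) > payoff(True, full) for full in (True, False)
    )
    return "two-box" if two_box_dominates else "one-box"


def expected_payoff(one_box, accuracy):
    """Expected money if the predictor is right with probability `accuracy`."""
    p_full = accuracy if one_box else 1.0 - accuracy
    return p_full * payoff(one_box, True) + (1.0 - p_full) * payoff(one_box, False)


if __name__ == "__main__":
    accuracy = 0.99  # hypothetical predictor accuracy
    print("CDT picks:", cdt_choice())
    print("One-boxing expects:", expected_payoff(True, accuracy))
    print("Two-boxing expects:", expected_payoff(False, accuracy))
```

With an accurate predictor, the one-boxer expects close to the full million while the two-boxer expects only a little over ten thousand, even though two-boxing looks strictly better once the boxes are treated as already filled.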

(I shared another Newcombian problem in #provoking-thoughts earlier)

We don't know where or how Newcombian Problems might come up.

Now

Eventual Future

CDT

TDT

1. Superintelligence is inevitable.

2. There will only ever be one.

Deeper dive on Roko's Basilisk

I don't usually talk like this, but I'm going to make an exception for this case. Listen to me very closely, you idiot.

YOU DO NOT THINK IN SUFFICIENT DETAIL ABOUT SUPERINTELLIGENCES CONSIDERING WHETHER OR NOT TO BLACKMAIL YOU. THAT IS THE ONLY POSSIBLE THING WHICH GIVES THEM A MOTIVE TO FOLLOW THROUGH ON THE BLACKMAIL.

- Eliezer Yudkowsky

Creator of TDT, Founder of LessWrong and Machine Intelligence Research Institute

Erased & Banned

Again, I deleted those posts not because I had decided that [Roko's Basilisk] presented a real hazard, but because I was afraid some unknown variant of it might, and because it seemed to me what the obvious General-Procedure-For-Handling-Things-That-Might-Be-Infohazards is; Roko recklessly ignored that and promptly posted his idea on the internet.

- Eliezer Yudkowsky

Years later...

If information that is deemed dangerous or taboo is more likely to spread rapidly, how will we protect ourselves from serious infohazards?

end.

sorry, I had no choice.

Information Hazards

By Scott Tolksdorf
