Andrey Sitnik, Evil Martians

Privacy-first architecture

Why and how to care about the privacy of your users?

@andreysitnik

“Let’s focus on tech,
  not politics!”

Hackers, 1995


Section 1: Software industry and principles


Open Source is political

The word “free” in [free software] does not refer to price;
it refers to freedom. […]

The freedom to change a program, so that
you can control it instead of it controlling you.

 

 What is the Free Software Foundation? 1986


Hacking is political

Mistrust authority—promote decentralization

 

 Hacker ethic by Steven Levy, 1984


Cryptography is political

The decisions we make about communication security today will determine the kind of society we live in tomorrow

 

 Whitfield Diffie, 1993
co-creator of public key cryptography


Software development always
has been about principles and politics

Always Has Been meme, unknown author


Lack of principles is new

1990s

2010s

Hackers, 1995

Silicon Valley, 2014


Section 2: Why should I care?


Reason 1: You will live in the world you created

“Just because you do not take an interest in politics
doesn’t mean politics won’t take an interest in you.”

 Write code!

Russian meme by an anonymous author


Reason 2: Principles give life meaning

Work just for money

Making the revolution for fun

DALL-E and Hackers, 1995


But there are many revolutions to make

Adventure Time


Section 3: Why is privacy important?


Mistake 1: It is just for Google to show better ads

😃


FAKE

Blue Coders

Analytics

Data brokers


Fact 1: It is for data brokers to resell

🕵️

Ads

Free
Analytics

Data brokers

Shady clients


Case: X-Mode data broker, 2020

“Over 100 apps that sold location data
to sketchy data broker X-Mode”

Quran app, Muslim dating app, Craigslist app, an app for following storms, and a level app that can be used to help install shelves

“X‑Mode had supplied location data to U.S. military contractors”


Mistake 2: This company doesn’t sell data

We respect your privacy

AFP


FAKE


Fact 2: If data is stored it can be leaked


Case: Yandex Food Delivery data breach, 2022

All deliveries from 2021–2022 were leaked:

— First & last name

— Phone number

— Food delivery address

— Delivery time

There was even a public, easy-to-use map app
where anyone could find your deliveries


Mistake 3: My email is not sensitive data

Windows 11 install wizard


Fact 3: Big data connects different leaks

Quran app

Muslim

Locations

Social app

Locations

E-mail

Old breach

E-mail

Full name


Google Analytics tracks >52.6% of websites

a.com

b.com

c.com

d.com

e.com

f.com

g.com

See click

Referer

Only c.com is invisible to GA

Tracking is connected to your Google account


Mistake 4: I have nothing to hide

Dolores Umbridge from Harry Potter

If you have nothing to hide

You have nothing to fear


Fact 4: Somebody else has something to hide

“… find personal details identifying critics of the Saudi monarchy who had been posting under anonymous Twitter handles”

“[Saudi Prince], who owns
>5% of Twitter”


Fact 4: …and something to fear

“54-year-old teacher, Mohammad bin Nasser al-Ghamdi, received
a death sentence for tweeting mild criticism of the authorities
to his 10 followers on Twitter.”


In the Netherlands too

“After Russia invaded Ukraine in February 2022, authorities began using facial recognition to prevent people from protesting in the first place”

“VisionLabs’ algorithm has been used in Moscow’s facial recognition system”

VisionLabs Global HQ: Johan Cruijff Boulevard 65, Amsterdam


And against EU citizens too

Proton Mail [and Apple]
Disclose User Data Leading to Arrest in Spain

The requests were made under the guise of anti-terrorism laws,
despite the primary activities of the Democratic Tsunami
[Catalan independence organization] involving protests and roadblocks

EP


LLMs with private data can change your beliefs

We find that GPT-4 with personalization has the strongest effect, increasing the odds of higher post-treatment agreement
with opponents by 81.7%.

Without personalization, GPT-4 still outperforms humans,
but the effect is lower (+21.3%).

On the Conversational Persuasiveness of Large Language Models: A Randomized Controlled Trial


What can we do?

The web became an awful place

The New York Times


Isn’t it a dev conference?


We made the web an awful place

The New York Times


Step 1: Remove GDPR popup


But we need popups for GDPR, right?

Fireplugins


There is no “popup” in GDPR law


Why we added GDPR popups

Punish them with popups until they agree to give us personal data

Don’t. Track. Users.

Don’t. Track. Users.

Don’t track users

Friends s10, e13


A consent popup is just a dark design pattern

😈  Popup blocks content

😈  UI is unclear

😈  The biggest button is Allow

Yes

Yes, but on red

We care about your privacy. Can we spy on you?


The real “We care about privacy” way

😻  GDPR-compliant analytics

😻  No popup

😻  Ask users only when you need their data
        (for instance, in the sign-up form)


Analytics without popup

✅  Page views, browsers, countries
✅  Traffic sources
✅  Track website events
✅  Track campaigns from click to conversion (?utm)
⛔  Can’t connect events with session/user ID
⛔  Can’t collect social network ID for ads (Remarketing)

Plausible
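
One way a cookieless tool can count unique visitors without a persistent user ID is to hash request attributes together with a salt that is rotated every day. The Node.js sketch below illustrates that general idea only; `visitorHash`, its parameters, and the sample values are hypothetical and are not taken from any specific product's code.

```javascript
const { createHash, randomBytes } = require('node:crypto')

// Hypothetical helper: derive an anonymous per-day visitor ID.
// The salt is meant to be regenerated daily and the old one deleted,
// so hashes from different days cannot be linked to each other.
function visitorHash(dailySalt, domain, ip, userAgent) {
  return createHash('sha256')
    .update(dailySalt)
    .update('\0' + domain + '\0' + ip + '\0' + userAgent)
    .digest('hex')
}

const saltMonday = randomBytes(16)
const saltTuesday = randomBytes(16)

const a = visitorHash(saltMonday, 'example.com', '203.0.113.7', 'Firefox/128')
const b = visitorHash(saltMonday, 'example.com', '203.0.113.7', 'Firefox/128')
const c = visitorHash(saltTuesday, 'example.com', '203.0.113.7', 'Firefox/128')

console.log(a === b) // same visitor on the same day: counted once
console.log(a === c) // next day: a new ID that cannot be linked to the old one
```

Because the ID changes every day, events cannot be tied to a long-lived session or user ID, which matches the ⛔ limitations listed above.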


There are many Cookieless Tracking tools

Slides


But the marketing manager demands GA

DALL-E


Irrational data collection obsession

Verleih Fair & Ugly Filmproduktion


Irrational vs rational data collection

What decisions have you made in the last year
based on personal data?


You can’t trust data only from opt-in users

All users

Your data

Yes on GDPR popup

No on GDPR popup

32–64% of users press Yes
on GDPR banners, Statista


A popup only for the EU is not an option

GDPR-like laws:

🇧🇷 Brazil: Lei Geral de Proteção de Dados
🇨🇦 Canada: Digital Charter Implementation Act
🇨🇱 Chile: Ley 19.628
🇪🇬 Egypt: Law No. 151
🇮🇳 India: Personal Data Protection Bill
🇿🇦 South Africa: Protection of Personal Information Act
🇺🇸 USA, CA: California Consumer Privacy Act
🇲🇦 Morocco: Law No. 09-08


It is time to change the industry


Remember how we killed IE together

Ex-YouTube developer reveals how they ‘conspired to kill IE6’


Step 2: Reduce the number of private data processors


You are not the only one with access to private data

We Care About Your Privacy

We and our 618 partners store and/or access information on a device, such as unique IDs in cookies to process personal data. You may accept or manage your choices by clicking below or at any time in the privacy policy page. These choices will be signaled to our partners and will not affect browsing data.


Who has access to user data?

😈  Third-party JS scripts (especially from other domains)
         Public CDN for JS libs
         Analytics with JS

😈  Website hosting
😈  CDN (Cloudflare sees 20% of traffic)
😈  All of their other partners
😈  Mail service, support
😈  etc

Load Third-Party JavaScript, web.dev


Fewer services = fewer risks

🧐  Hosting
🧐  CDN
🧐  Ads
🤤  Public JS CDN
🥸  JS script from CDN
🧐  Third-party database

→ Leak

→ Sell data


How to reduce the number of services?

✅  No public CDN for libs (also better performance)

✅  No public CDN for fonts (also better performance)

✅  Self-hosted tools (like analytics)

✅  Combine CDN and cloud
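
Once libraries, fonts, and tools are self-hosted, one possible way to make sure no third-party origin sneaks back in is a Content-Security-Policy header that only allows same-origin resources. The sketch below is an illustration under that assumption; `selfHostedCsp` is a hypothetical helper name, and a real site may need extra directives.

```javascript
// Hypothetical helper: build a Content-Security-Policy value that only
// allows scripts, styles, fonts, and requests from the site's own origin.
function selfHostedCsp() {
  const directives = {
    'default-src': ["'self'"],
    'script-src': ["'self'"],
    'style-src': ["'self'"],
    'font-src': ["'self'"]
  }
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(' ')}`)
    .join('; ')
}

const csp = selfHostedCsp()
// In a Node.js server this could be sent with:
// response.setHeader('Content-Security-Policy', csp)
console.log(csp)
```

With such a policy the browser refuses to load a script or font from a public CDN, so an accidental third-party dependency shows up as an error during development.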


Step 3: Local-First


Advanced


Advanced step: only for new projects


Software before

Local-Only


Then the iPhone was released

Apple


And now everything is in the cloud

Server-First


Local-Only


Local-First as a third way

Server-First

Local-Only

Local-First

Rich client keeps data and processing locally,
the server is just for sync


The idea was presented by Ink & Switch

Seven ideals:

  1. No spinners (local data is fast to change)
  2. Sync between devices
  3. Offline-first
  4. Conflict-free collaboration
  5. The app keeps working if the company closes
  6. Privacy by default
  7. The user owns the data


Notion vs Obsidian

Notion
Server-First

Obsidian
Local-First

🗒️ Local files
notes/Shopping.md
notes/Ideas.md

Sync between 💻 and 📱 via:
Obsidian Sync & Publish
GitHub repo
Any Cloud Sync


Local-First isn’t an on-off switch, but a spectrum

Sync Engine with
partial DB replica
on the client

P2P
End-to-end encryption
No server


Step 1: Storage

All of the client’s data is stored locally

More data on the client (20-500 MB)

We need a proper client-side database:

— Good performance

— Rich query language


Client-side databases

SQLite WASM

import { SQLocal } from 'sqlocal'

const { sql } = new SQLocal('database.sqlite3')
const data = await sql`SELECT * FROM posts`

PGlite (PostgreSQL in WASM)

import { PGlite } from "@electric-sql/pglite"

const db = new PGlite()
await db.query("select * from posts;")


Persistent DB storage for SQLite

  1. OPFS (Origin Private File System)
  2. IndexedDB

Firefox uses SQLite for
IndexedDB & OPFS metadata


Why not IndexedDB?

  1. SQLite works on React Native and desktop
  2. IndexedDB query API is limited
  3. Performance


Step 2: Add log and separate read and write

Log

SQLite

Reactive store

Updates

Initialization

UI
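
The diagram above can be sketched roughly as follows: every change is an action appended to a log, and the UI reads only from a reactive store that is initialized from the log and then updated by new actions. This is a minimal in-memory illustration, not a real sync engine; `createLog`, `createPostsStore`, and the `posts/deleted` action are hypothetical names, and a plain array stands in for the SQLite table.

```javascript
// Write side: an append-only log of actions.
function createLog() {
  const entries = [] // stand-in for a persistent table (for example in SQLite)
  const listeners = new Set()
  return {
    entries,
    add(action) {
      entries.push(action)
      for (const listener of listeners) listener(action)
    },
    on(listener) {
      listeners.add(listener)
      return () => listeners.delete(listener)
    }
  }
}

// Read side: a reactive store that the UI subscribes to.
function createPostsStore(log) {
  const posts = new Map()
  const subscribers = new Set()

  function apply(action) {
    if (action.type === 'posts/created') {
      posts.set(action.post.id, action.post)
    } else if (action.type === 'posts/deleted') {
      posts.delete(action.id)
    }
  }

  // Initialization: replay what is already in the log
  for (const action of log.entries) apply(action)

  // Updates: apply each new action, then notify the UI
  log.on(action => {
    apply(action)
    for (const callback of subscribers) callback([...posts.values()])
  })

  return {
    get: () => [...posts.values()],
    subscribe(callback) {
      subscribers.add(callback)
      return () => subscribers.delete(callback)
    }
  }
}

const log = createLog()
log.add({ type: 'posts/created', post: { id: 'a1', title: 'Hello' } })

const store = createPostsStore(log)
const seen = []
store.subscribe(posts => seen.push(posts.map(post => post.title)))

log.add({ type: 'posts/created', post: { id: 'b2', title: 'World' } })
console.log(store.get().length, seen)
```

Because writes only go through `log.add`, the same log entries can later be sent to a server or to other browser tabs without touching the UI code.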


We need to sync server and browser tabs

Log

SQLite


Step 3: CRDT* to resolve conflicts

One source of truth

Everyone is a “server”

* — simple Map/Set is enough.
        No need for complex Google Docs-like collaboration.

id: nanoid() (a random ID instead of a sequential ID)
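
As an illustration of how simple such a Map-style CRDT can be, here is a minimal last-write-wins map. It is a sketch under stated assumptions, not the talk's implementation: `set`, `merge`, and `isNewer` are hypothetical helpers, the `time` values are plain logical counters, and the keys are hard-coded random-looking IDs of the kind `nanoid()` would produce.

```javascript
// Last-write-wins map: each key stores a value plus the (time, nodeId)
// of the write. Merging keeps the newest write per key, so replicas
// converge regardless of the order in which they exchange changes.
function isNewer(a, b) {
  if (a.time !== b.time) return a.time > b.time
  return a.nodeId > b.nodeId // deterministic tie-breaker
}

function set(map, key, value, time, nodeId) {
  const next = { value, time, nodeId }
  const current = map.get(key)
  if (!current || isNewer(next, current)) map.set(key, next)
}

function merge(target, source) {
  for (const [key, entry] of source) {
    const current = target.get(key)
    if (!current || isNewer(entry, current)) target.set(key, entry)
  }
  return target
}

// Random IDs let every device create records offline without collisions.
const phone = new Map()
const laptop = new Map()
set(phone, 'post:V1StGXR8', 'Draft', 1, 'phone')
set(laptop, 'post:V1StGXR8', 'Final', 2, 'laptop')
set(laptop, 'post:Uakgb_J5', 'Ideas', 3, 'laptop')

merge(phone, laptop)
merge(laptop, phone)
console.log(phone.get('post:V1StGXR8').value) // "Final"
```

Both replicas end up with the same content, and neither of them had to act as the single source of truth.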


Step 4: 2 passwords for end-to-end encryption

1st password to auth on the server

2nd password for end-to-end encryption
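
A minimal Node.js sketch of the second password's role, assuming scrypt for key derivation and AES-256-GCM for encryption (both are choices made for this example, not something the slides specify). `deriveKey`, `encrypt`, and `decrypt` are hypothetical helpers; the point is that the key never leaves the device, so the server can only store ciphertext.

```javascript
const {
  scryptSync, randomBytes, createCipheriv, createDecipheriv
} = require('node:crypto')

// Derive a 256-bit key from the second password. The server never sees it.
function deriveKey(password, salt) {
  return scryptSync(password, salt, 32)
}

function encrypt(key, plaintext) {
  const iv = randomBytes(12) // must be unique per message for AES-GCM
  const cipher = createCipheriv('aes-256-gcm', key, iv)
  const data = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()])
  return { iv, data, tag: cipher.getAuthTag() }
}

function decrypt(key, { iv, data, tag }) {
  const decipher = createDecipheriv('aes-256-gcm', key, iv)
  decipher.setAuthTag(tag)
  return Buffer.concat([decipher.update(data), decipher.final()]).toString('utf8')
}

const salt = randomBytes(16) // not secret; stored next to the encrypted data
const key = deriveKey('second password', salt)
const original = JSON.stringify({ type: 'posts/created' })
const message = encrypt(key, original)

// Only `message` and `salt` would be sent to the sync server.
console.log(decrypt(key, message))
```

The first password is handled by ordinary server authentication and is deliberately absent here: knowing it gives access to the encrypted blobs, but not to their content.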


Unexpected benefits of Local-First

Benefit 1: Very simple server

Sync changes
Auth
Check access for collaboration

All business logic

All data management


Benefit 2: No server at the prototype stage

project/
  client/


Benefit 3: Cheap scale

Server-first:

Local-first:


Benefit 4: No private data → no problem

DALL-E


Benefit 5: Developer productivity

@sitnikcode

// Server-first: the component handles loading, errors, and networking itself
useEffect(() => {
  setLoading(true)
  fetch('/posts')
    .then(response => response.json())
    .then(data => {
      setData(data)
      setLoading(false)
    })
    .catch(() => {
      setError(true)
      setLoading(false)
    })
}, [])

if (loading) return <div>Loading…</div>
if (error) return <div>Error</div>

return (
  <ul>
    {data.map(post => <li>{post.title}</li>)}
  </ul>
)

// Local-first: the data is already local, the sync engine does the rest
const data = use(posts)

return (
  <ul>
    {data.map(post => <li>{post.title}</li>)}
  </ul>
)

State management
Networking code

Cache

Network errors

Optimistic UI

Sync Engine


Benefit 6: Local data = 0 latency


Local-First Startups: Linear


Linear website


Why does Linear feel so fast?
Local first optimistic updates

 

 Jori Lallo at devtools.fm


Local-First Startups: Pitch

Pitch.com

1.7 million teams
using app


Local-First Startups: Goodnotes

goodnotes.com

>24 million
monthly users


There are frameworks for LoFi

Evolu

ElectricSQL

RxDB


Hard part 1: Frameworks are not 100% ready

While none of these projects currently claim “production level stability” for persistence, the current support could already
be sufficient for some projects

 

 The Current State of SQLite Persistence on the Web


Hard part 2: Client-side database migrations

const migrations = {
  1: action => {
    if (action.type === 'posts/created') {
      return { type: 'news/created', news: action.post }
    }
  }
}
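
To show how a migration map like the one above might be applied, here is a small sketch. It assumes that key `1` upgrades actions created by version 1 clients and that a migration returning nothing means "leave this action unchanged"; both are assumptions of this example, and `migrate` is a hypothetical helper.

```javascript
// Same shape as the migration map above: version number -> function
const migrations = {
  1: action => {
    if (action.type === 'posts/created') {
      return { type: 'news/created', news: action.post }
    }
  }
}

// Hypothetical helper: upgrade an action that was created by a client
// running `fromVersion` by applying every later migration in order.
function migrate(action, fromVersion) {
  const versions = Object.keys(migrations).map(Number).sort((a, b) => a - b)
  let result = action
  for (const version of versions) {
    if (version >= fromVersion) {
      // A migration that returns nothing keeps the action as it is
      result = migrations[version](result) ?? result
    }
  }
  return result
}

const oldAction = { type: 'posts/created', post: { id: 'a1', title: 'Hello' } }
console.log(migrate(oldAction, 1))
```

Old clients can stay offline for months, so actions in outdated formats can arrive at any time; migrating actions rather than tables lets each client catch up whenever it reconnects.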


Hard part 3: Per-document access with E2EE

Publish public key

New E2EE key for doc

Encrypted document

Encrypted message with the key
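
One possible reading of the diagram above, sketched in Node.js: each user publishes a public key, every document gets its own random key, and that document key is sent to a collaborator encrypted with their public key. RSA-OAEP and AES-256-GCM are choices made for this example only; the slides do not name specific algorithms, and all variable names are hypothetical.

```javascript
const {
  generateKeyPairSync, publicEncrypt, privateDecrypt,
  randomBytes, createCipheriv, createDecipheriv
} = require('node:crypto')

// 1. Bob publishes a public key and keeps the private key on his device
const bob = generateKeyPairSync('rsa', { modulusLength: 2048 })

// 2. The document owner creates a new random key for this document
const docKey = randomBytes(32)

// 3. The document is encrypted with the document key
const iv = randomBytes(12)
const cipher = createCipheriv('aes-256-gcm', docKey, iv)
const encryptedDoc = Buffer.concat([
  cipher.update('Shopping list', 'utf8'),
  cipher.final()
])
const tag = cipher.getAuthTag()

// 4. The document key is encrypted for Bob with his public key
const keyForBob = publicEncrypt(bob.publicKey, docKey)

// Bob's side: unwrap the document key, then decrypt the document
const unwrapped = privateDecrypt(bob.privateKey, keyForBob)
const decipher = createDecipheriv('aes-256-gcm', unwrapped, iv)
decipher.setAuthTag(tag)
const text = Buffer.concat([
  decipher.update(encryptedDoc),
  decipher.final()
]).toString('utf8')

console.log(text)
```

The server only ever stores `encryptedDoc` and `keyForBob`. Revoking access is the hard part: it means creating a new document key and re-encrypting the document for everyone who remains.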


Hard part 4: E2EE password recovery

With great privacy comes great responsibility

Spider-Man


Again: Local-First is a spectrum

Partial DB replica
on the client

P2P
End-to-end encryption
No server


Read Guides


Step 4: Privacy from a non-US perspective


Advanced


Risks are different in different countries

India

WhiteEmperor420 on Reddit


Advanced step: for big & popular projects


Different privacy risks

🕵️  Government’s Secret Service
🪤  Surveillance of regime critics
📶  Internet provider
☁️  Data brokers
🏬  International companies collecting private data
👮  Phone check by the local police officer
⛪  Local community with ethical standards
👪  Family members


US media focuses mostly on

🕵️  Government’s Secret Service
      Surveillance of regime critics
      Internet provider
☁️  Data brokers
🏬  International companies collecting private data
      Phone check by the local police officer
      Local community with ethical standards
      Family members


Different risks need opposite solutions

RSS Reader privacy risks

🇺🇸 US: local-first
don’t trust the cloud

🇷🇺 Russia: US cloud proxy
to hide you from the Internet provider


Phone check by local police in 🇷🇺 🇧🇾

“Unlock your phone and show Telegram”

Andrey Lukovsky

“I have rights”

1234

🧑‍⚖️

🧑‍🦽

Navalny

Following


Telegram fork by Belarusian Cyber-Partisans

You can have 2 PINs

PIN 1234: Following Navalny

PIN 1984: Following CSS hacks, GitHub trends


Summary


For the next working day


❤️  Find your principles
❤️  Remove the GDPR popup by using cookieless analytics
❤️  Reduce the number of services with access to private data
🌟  Consider Local-First for your next project
🤔  Think about other privacy risks if you make a social tool

Thanks
