INTERNET PRIVACY

Daniel Coloma

"I have nothing to hide... so I don't care about my privacy"

HAVE YOU EVER SAID...

HAVE YOU EVER WONDERED...

Why is elpais.com showing me an ad for the Lego game I wanted to give my son for Christmas?

... I never read any news about Lego on elpais.com and I have already bought it!

... THE ANSWER IS IN YOUR PERSONAL DATA

HAVE YOU EVER WONDERED...

Why is booking.com offering me this rate for this hotel?

... and my friend Peter is getting a cheaper rate for exactly the same hotel, same room, same dates!

... THE ANSWER IS IN YOUR PERSONAL DATA

AN EXAMPLE OF WHAT HAPPENS BEHIND THE CURTAINS

ONLINE ADVERTISING

WHO DECIDES WHICH ADS SHOULD BE SHOWN TO YOU?

THE PUBLISHERS?

NOT REALLY

THE ADS TO BE SHOWN ARE USUALLY DECIDED BY OTHER COMPANIES CALLED AD-EXCHANGE BROKERS

HOW DO THE BROKERS DECIDE WHICH ADS YOU SHOULD BE SHOWN?

AD-EXCHANGE BROKERS RUN AUCTIONS

WHO ARE THE BIDDERS?

THE ADVERTISERS

BUT WHAT IS THE GOOD BEING AUCTIONED?

THE SPACE IN THE NEWS WEB SITE?

THEY BID FOR THE USERS WATCHING THE AD!

 

THEY BID FOR YOU!

BUT IN THE SAME WAY YOU WOULDN'T BID FOR AN ANONYMOUS PAINTING...

... THEY WOULDN'T BID FOR ANONYMOUS USERS

SO THEY NEED TO KNOW WHO YOU ARE

WHO YOU ARE IS NOT YOUR NAME

A BIG SET OF DATA ABOUT YOU, TOGETHER WITH GOOD ALGORITHMS, MAY PROVIDE AN EXCELLENT PICTURE OF YOU

SO THEY PROFILE YOU

THEY DON'T NEED TO KNOW YOUR NAME (MAYBE THEY DO) TO KNOW YOU BETTER THAN SOME OF YOUR FAMILY AND FRIENDS

WHAT ARE WE GOING TO LEARN TODAY?

  1. HOW DO THEY COLLECT INFORMATION FROM YOU?

  2. WHAT TYPE OF INFORMATION IS COLLECTED?

  3. WHO IS COLLECTING THAT INFORMATION?

  4. HOW DOES THAT INFORMATION FLOW?

  5. HOW IS THAT INFORMATION USED... AND HOW DOES IT AFFECT YOU?

 1 - WHICH TRACKING TECHNIQUES ARE USED?

A simple story...

Before creating a Facebook account, Peter wants to check Facebook's privacy policy, so he goes to Google and searches for "facebook data policy".

He opens the first link, https://www.facebook.com/policy.php, and after reading the policy decides NOT TO CREATE an account.

Later on he wants to look up some information about cancer in a health forum, so he opens http://salud.ccm.net/forum/cancer-8

Peter THINKS Facebook doesn't know anything about him... is he right?

No, he is not!

When Peter visited the Facebook policy page, Facebook stored some cookies on his computer

  • A random browser identifier scoped to the Facebook root domain (i.e. the cookie is sent every time a resource is retrieved from facebook.com), the first and last Facebook pages visited, etc.

When Peter visited the health forum, a Facebook plugin was loaded. Because the plugin is hosted on a Facebook domain, the cookies were sent back to Facebook

  • All the information stored so far (browser ID, first/last visited pages, etc.), the referrer (the page he is on) and, if a "Like" button is present, the page he would like if he pressed it
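Peter's story can be played out in a few lines of Python. This is only a toy simulation under stated assumptions: the `Tracker` and `Browser` classes, the `browser_id` cookie name and the `tracker.example` / `forum.example` addresses are all hypothetical and do not describe Facebook's actual implementation.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Tracker:
    """Hypothetical tracker that serves resources from its own domain."""
    visits: dict = field(default_factory=dict)  # browser ID -> list of referrers

    def serve(self, cookies: dict, referrer: str) -> dict:
        # First contact: no cookie yet, so mint a random browser identifier.
        browser_id = cookies.get("browser_id") or uuid.uuid4().hex
        # Every request reveals the page that embedded the resource.
        self.visits.setdefault(browser_id, []).append(referrer)
        # The response (re)sets the cookie, scoped to the tracker's domain.
        return {"browser_id": browser_id}


@dataclass
class Browser:
    """Keeps one cookie jar per domain, as browsers do."""
    jars: dict = field(default_factory=dict)

    def load(self, tracker: Tracker, tracker_domain: str, page_url: str) -> None:
        jar = self.jars.setdefault(tracker_domain, {})
        # The jar for the tracker's domain travels with every request to it.
        jar.update(tracker.serve(dict(jar), referrer=page_url))


tracker = Tracker()
peter = Browser()
# Peter reads the policy page on the tracker's own site...
peter.load(tracker, "tracker.example", "https://tracker.example/policy")
# ...and later opens a forum page that embeds a plugin hosted by the tracker.
peter.load(tracker, "tracker.example", "https://forum.example/cancer")

for browser_id, pages in tracker.visits.items():
    print(browser_id, pages)
```

Both visits end up under the same random identifier, even though Peter never created an account.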

Maria WAS so worried about her privacy that she never visited a Facebook page...

She is pregnant and visited prenatal.com

Facebook is also profiling her!

  • The Prenatal website makes a request to "pixel.facebook.com", which in response sets a cookie on Maria's computer

  • This is done on many websites: Facebook can build her profile even if she has never visited a Facebook site

But I heard that I can opt out

http://www.youronlinechoices.eu/

Select the option to "turn-off" all the companies.

What do you think happens afterwards?

A new (opt-out) cookie is set!

  • FACEBOOK PLACED A COOKIE NAMED “OO” WITH THE VALUE “1”. “OO” PRESUMABLY STANDS FOR “OPT-OUT”.

  • THE OTHER COOKIES WERE NOT REMOVED BY FACEBOOK DURING OR AFTER THE OPT-OUT

  • ALL THE COOKIES ARE SENT BACK TO FACEBOOK ANY TIME A FACEBOOK RESOURCE IS LOADED
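A minimal sketch of why the opt-out changes so little on the browser side. The cookie name "oo" comes from the observation above; the `browser_id` cookie, its value and both helper functions are hypothetical.

```python
def opt_out(jar: dict) -> dict:
    """Model the observed behaviour: add the opt-out flag, delete nothing."""
    updated = dict(jar)
    updated["oo"] = "1"
    return updated


def cookies_sent(jar: dict) -> dict:
    """A browser sends every cookie in the jar for the matching domain."""
    return dict(jar)


# "browser_id" and its value are made up; "oo" is the name reported above.
before = {"browser_id": "a1b2c3"}
after = opt_out(before)
print(cookies_sent(after))  # {'browser_id': 'a1b2c3', 'oo': '1'}
```

The identifier still travels with every request; whether it is ignored depends entirely on what the receiving server chooses to do with the flag.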

REMEMBER: THIS IS JUST FOR NON-FACEBOOK USERS

I DON'T HAVE TIME TO TALK ABOUT WHAT HAPPENS TO FACEBOOK USERS

So... if all this tracking happens via "cookies", what if I disable cookies or remove them?

Very smart...


But do you think you are smarter than the trackers?

When tracking companies noticed that many users were blocking cookies, they came up with alternatives

ALTERNATIVE 1 - "FLASH COOKIES"

  • More storage capacity (100 KB vs 4 KB)
  • No default expiration dates
  • Stored in a different and separate location
  • Not controlled by the browser

A more resilient tracking technology than HTTP cookies, with less user control.

THE "RESPAWNING" IDEA

  1. An exact copy of the browser cookies is kept in sync in the Flash cookies

  2. The user removes the browser cookies

  3. When the cookie removal is detected, the browser cookies are rebuilt from the exact copy available in the Flash cookies

THE PROBLEM WITH FLASH COOKIES

Flash is not available in all browsers (and its usage is decreasing over time)

Adobe improved Flash to prevent the misuse of this technology

ALTERNATIVE 2 - "EVERCOOKIES"

Evercookies make use of every storage technology available in the browser: HTTP cookies, Flash cookies, IndexedDB, Local Storage, ETags, etc.

  1. An exact copy of the browser cookies is kept in sync in different storage locations: Flash cookies, IndexedDB, Local Storage and ETags

  2. The user removes the browser and Flash cookies

  3. The removed cookies are rebuilt from the copies that survive in IndexedDB, Local Storage and ETags

The only way to completely remove an "evercookie" is to delete it from all these places at the same time
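The respawning logic can be sketched with a plain dictionary standing in for the browser's storage locations. This is a simulation only: the store names, `write_everywhere` and `respawn` are hypothetical, and a real evercookie runs as script inside the browser.

```python
from typing import Optional

STORES = ["http_cookie", "flash_cookie", "indexed_db", "local_storage", "etag"]


def write_everywhere(storage: dict, user_id: str) -> None:
    """Keep an identical copy of the identifier in every storage location."""
    for name in STORES:
        storage[name] = user_id


def respawn(storage: dict) -> Optional[str]:
    """Restore the identifier everywhere if at least one copy survived."""
    survivors = [storage[name] for name in STORES if storage.get(name)]
    if not survivors:
        return None  # every copy was removed at once: the ID is gone
    write_everywhere(storage, survivors[0])
    return survivors[0]


storage: dict = {}
write_everywhere(storage, "id-1234")

# The user clears browser and Flash cookies but nothing else.
storage.pop("http_cookie")
storage.pop("flash_cookie")
print(respawn(storage))        # id-1234, recovered from a surviving copy
print(storage["http_cookie"])  # id-1234, the browser cookie is back

# Clearing every location in one go leaves nothing to respawn from.
storage.clear()
print(respawn(storage))        # None
```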

"RESPAWNING" REVISITED

ALL THESE TECHNIQUES HAVE ONE THING IN COMMON

They store information on your computer: they are STATEFUL TECHNIQUES

The trackers' workaround is STATELESS TECHNIQUES ("fingerprinting")

  • They don't require storing anything on your computer

  • They look for ways to uniquely identify your browser

Canvas Fingerprinting

Any Web Site can draw graphics to a custom canvas element in real time via the HTML5 Canvas Feature. Differences in font rendering, smoothing, anti-aliasing, etc. cause different devices/browsers to draw the same image differently. If the image is defined in a smart way, the resulting pixels are unique per device/browser, and hence it allows the device to be fingerprinted.

Font Fingerprinting

The browser font list has long been used as a way to fingerprint devices. However, using it in combination with the canvas feature (displaying up to 50 different fonts and checking how they are rendered in the canvas) has made this identification technique very popular.

Audio Fingerprinting

The latest fingerprinting technique to be discovered is called AudioContext fingerprinting. Conceptually it is similar to canvas fingerprinting: the web site creates an AudioContext and requests the audio processing of a signal. The same signal processed on different machines/browsers may show slight differences due to hardware/software differences, which can again be used to fingerprint the machine/browser.

WebRTC Fingerprinting

WebRTC is a framework to offer Real Time Communications in the browser. To discover the best network path between the communicating parties, the WebRTC API can be used to collect all available candidate addresses (local network IPs, public NAT IPs, etc.). Any Web site could use this API to collect those IP addresses without explicit permission from the user. Obtaining the local IP address via WebRTC has been discovered as a new way used by trackers to fingerprint a device.
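All four techniques feed the same final step: the observed values are combined into one identifier that is recomputed on every visit, so nothing needs to be stored on the device. A rough sketch of that step follows. The attribute names and values are invented, and the collection itself (which happens in browser script) is not shown.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Hash a set of observed browser attributes into a stable identifier."""
    # Sorting the keys makes the result independent of collection order.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# All attribute names and values below are made up for illustration.
laptop = {
    "canvas_hash": "9f2c",          # digest of the pixels a test image produced
    "fonts": ["Arial", "Calibri", "Verdana"],
    "audio_hash": "51ab",           # digest of a processed audio signal
    "local_ips": ["192.168.1.23"],  # candidate addresses seen via WebRTC
}
phone = dict(laptop, canvas_hash="07de", fonts=["Roboto"])

print(fingerprint(laptop))
print(fingerprint(laptop) == fingerprint(dict(laptop)))  # True: same device
print(fingerprint(laptop) == fingerprint(phone))         # False: other device
```

Clearing cookies has no effect here, because the identifier is derived from the device rather than stored on it.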

You are being profiled no matter what!

As a user... if you are going to track me, please do use cookies

... and please don't use anything else.

 

But if you USE OTHER TECHNIQUES, please do let me know!

2 - WHAT INFORMATION IS BEING COLLECTED ABOUT ME?

LOCATION DATA

WiFi
GPS
Carrier
IP

TECHNICAL DATA

Operating System
Web Browser
Screen Resolution
Hardware Manufacturer
IP Address
Installed Plugins

BEHAVIOURAL DATA

Browsing History
Ads Seen / Clicked
Search Queries
Purchasing History
Social Media
Referrals
Browsing Habits

DEMOGRAPHIC DATA

Address
ZIP Code
Name
Age
Gender

BUT THOSE ARE JUST SOME INGREDIENTS

THEY CAN INFER A LOT MORE ABOUT YOU BY COMBINING THEM IN A SMART WAY

THE WHOLE SYSTEM IS DESIGNED SO THAT INFORMATION IS SHARED ACROSS SITES

ALL THE TRACKERS ARE CONTINUOUSLY ENRICHING THEIR PROFILES

3 - WHO IS COLLECTING THAT INFORMATION?

Facebook, Google...?

PUBLISHER

Publishers make their living from selling ad space to advertisers. Examples of publishers are news sites, social media, search engines, etc.

AD EXCHANGE

But publishers don't sell the space directly to advertisers. They sell it in special marketplaces called "ad exchanges", which act as neutral platforms.

EVERY SECOND 1.3 MILLION USERS ARE SOLD IN AD EXCHANGES

DEMAND-SIDE PLATFORM

Demand-side platforms are the ones that bid for users and serve them ads in real time on behalf of advertisers, based on some rules.

ADVERTISER

Advertisers are the ones that want to increase their sales and become more relevant by showing ads to users.

DATA BROKERS

Data brokers are the only ones that don't sell ads or directly help to sell more ads. They make a living selling user profiles and market analysis. They use users' information to put them into categories such as "urban and eco-friendly".

AND THIS IS JUST A SIMPLIFIED VIEW

  • Changing Ecosystem

  • Boundaries are unclear

  • Many companies play in different categories (e.g. Google)

THE MARKETING TECHNOLOGY LANDSCAPE

4 - HOW DOES THE INFORMATION FLOW?

A SIMPLIFIED FLOW OF HOW AN AD IS SHOWN

  1. You visit a news site (www.newspaper.com)

  2. Apart from rendering the website, your browser sends an "ad-tag" to the ad exchange

  3. The ad exchange now knows that there is ad space up for bidding... but most importantly, it can now retrieve its cookies from your browser. The cookies contain the ID the ad exchange assigned to you the first time you "visited" it, plus extra information: your profile

  4. The ad exchange sends an "ad-call" to the demand-side platforms: "You have an opportunity to advertise to a user with this profile and ID"

  5. All candidate demand-side platforms retrieve their cookies from your computer

  6. They request extra information about you from data brokers

  7. They perform cookie matching with all the information they have about you and decide how much to bid (for example $0.10, $0.09 and $0.09)

  8. The ad exchange checks all the offers and assigns the space to the demand-side platform with the highest bid

  9. The winning demand-side platform places an ad from one of its advertisers on www.newspaper.com
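The bidding part of this flow can be sketched as a toy simulation. Everything named here is an assumption made for illustration: the `DemandSidePlatform` class, the IDs, the interest lists and above all the bidding rule. Real exchanges and bidders use far more elaborate pricing logic.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DemandSidePlatform:
    name: str
    match_table: dict = field(default_factory=dict)  # exchange ID -> own user ID
    profiles: dict = field(default_factory=dict)     # own user ID -> interests
    base_bid: float = 0.01

    def bid(self, exchange_user_id: str, broker_data: dict) -> float:
        # Cookie matching: translate the exchange's ID into this DSP's own ID.
        own_id = self.match_table.get(exchange_user_id)
        interests = set(self.profiles.get(own_id, []))
        # Enrich with whatever a data broker knows about the same user.
        interests |= set(broker_data.get(exchange_user_id, []))
        # Hypothetical rule: the more that is known, the higher the bid.
        return round(self.base_bid * (1 + len(interests)), 4)


def run_auction(exchange_user_id: str, dsps: list, broker_data: dict) -> Optional[tuple]:
    """Collect one bid per DSP and give the ad space to the highest one."""
    bids = [(dsp.bid(exchange_user_id, broker_data), dsp.name) for dsp in dsps]
    return max(bids) if bids else None


broker = {"ex-42": ["lego", "parent"]}
dsps = [
    DemandSidePlatform("dsp-a", {"ex-42": "a-7"}, {"a-7": ["toys", "travel"]}),
    DemandSidePlatform("dsp-b", {"ex-42": "b-9"}, {"b-9": ["travel"]}),
    DemandSidePlatform("dsp-c"),  # no cookie match: only broker data available
]
print(run_auction("ex-42", dsps, broker))      # a well-profiled user
print(run_auction("ex-unknown", dsps, broker)) # a user nobody recognises
```

In this sketch the unrecognised user only attracts the base bid from every platform, which is the anonymous painting again: the more complete the profile, the more the impression is worth.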

www.newspaper.com

WHAT THE USER PERCEIVES WHEN VISITING A NEWS WEB SITE

IN 200 MS THE USER GETS THE CONTENT OF THE WEB SITE, WITH SOME ADS MIXED IN

«A site is not one company any more. A site is tens of hundreds of companies all knowing where you are and what you’re looking at.»

RESEARCH CARRIED OUT BY NORWEGIAN DATA PROTECTION AUTHORITY ABOUT THE NUMBER OF THIRD PARTY TRACKERS ON TOP-6 NORWEGIAN NEWS SITES

11 AD-EXCHANGES

12 DEMAND-SIDE-PLATFORMS

12 DATA MANAGEMENT PLATFORMS

8 DATA BROKERS

13 DATA ANALYTICS COMPANIES

COLLECT INFORMATION ABOUT A USER THAT HAS JUST VISITED THOSE 6 NEWS SITES

 5 - HOW IS THAT INFORMATION USED?

THE INFORMATION IS MOSTLY USED TO MAKE DECISIONS

BUT ALL THOSE DECISIONS ARE MADE IN AN OBSCURE WAY: RISKS LINKED TO THE LACK OF TRANSPARENCY

RISK OF WRONG DECISIONS

EXAMPLE: DENY/ALLOW MEDICAL INSURANCE

What if the data you have about me is wrong?

RISK OF MANIPULATION

EXAMPLE: SHOW AN AD

What if the ad not only shows content they think is relevant to me, but also presents it in a way that exploits "my vulnerabilities" (impulsive, cautious, etc.)?

RISK OF HIDDEN DISCRIMINATION

EXAMPLE: CREDIT RATING

Algorithms that make decisions are written and maintained by people and, as such, they can reinforce human prejudices. For instance, it was found that Google displayed ads about high-income jobs to men more often than to women.

RISK OF PRICE DISCRIMINATION

EXAMPLE: QUOTATIONS

Can I be quoted a higher price just because I use a Mac or because my income is higher?

RISK OF FILTER BUBBLE

EXAMPLE: INTERNET SEARCHES

What if the search engine filters out results that are not aligned with my viewpoints? This would isolate me in my ideological bubble.

ARE YOU SURE THEY CAN KNOW THAT MUCH ABOUT ME?

LET'S HAVE A LOOK AT HOW FACEBOOK LETS ADVERTISERS TARGET USERS

HAVE YOU TOLD THIS TO FACEBOOK?

AND THIS?

IS IT SCARY?

IT COULD BE WORSE

CONCLUSIONS

No Industry in the world knows more about you than the ad industry:

 

DATA RACE

ASYMMETRIC RACE

What they know about me

What I know about them

It AFFECTS EVERYONE but very FEW PEOPLE have any INSIGHT about it

As there is no transparency, companies do not need to compete on providing consumers with privacy-friendly services

The LACK OF TRANSPARENCY is extremely DANGEROUS: manipulation, wrong decisions, discrimination, re-identification can be done in the dark

BUT THINGS CAN CHANGE!

BUT FOR THINGS TO CHANGE, YOU NEED TO ACT!

I AM GOING TO ASK YOU THREE THINGS

1 - DO CARE ABOUT YOUR PRIVACY

2 - DEMAND THAT SERVICES BE MORE TRANSPARENT

3 - EDUCATE YOUR FRIENDS, KIDS, ETC. ABOUT WHAT IS GOING ON

THANKS

TEFCONF-v2

By Daniel Coloma
