It's not your parents' HTTP

Dr Gleb Bahmutov PhD

ConFoo.CA

About me

C / C++ / C# / Java / CoffeeScript / JavaScript / Node / Angular / Vue / Cycle.js / functional

EveryScape

virtual tours

MathWorks

MATLAB on the web

Kensho

finance dashboards

VP of Eng at Cypress

Fast, easy and reliable testing for anything that runs in a browser.

Cypress in action https://www.cypress.io/

400 blog posts

I ❤️ sharing knowledge

Science: 📚

or

☠️

12 March

1989

vague but exciting

Edit and view HTML documents, 1991

source: http://digital-archaeology.org/the-nexus-browser/

first website: info.cern.ch

1991

HTTP/0.9

1991

Sir Tim Berners-Lee

Professor of Engineering in the School of Engineering with a joint appointment in the Department of Electrical Engineering and Computer Science at MIT

HTTP/0.9

1991

650 words

3800 characters

HTTP/0.9

1991

HTTP/0.9

1991

HTTP/0.9

If the port number is not specified, 80 is always assumed for HTTP

because ports 79 and 81 were taken (RFC 1060)
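A minimal sketch of the default-port rule, using the WHATWG `URL` class that ships with Node.js and browsers. The helper name `effectivePort` is made up for illustration, and it assumes only `http:` and `https:` addresses.

```javascript
// Hypothetical helper: which port would a client actually connect to?
// Assumes the address uses the http: or https: scheme.
function effectivePort (address) {
  const url = new URL(address)
  // url.port is an empty string when the port is omitted
  // or equals the default for the scheme
  if (url.port !== '') return Number(url.port)
  return url.protocol === 'https:' ? 443 : 80
}

console.log(effectivePort('http://info.cern.ch/'))         // 80
console.log(effectivePort('http://foo.com:8080/bar.html')) // 8080
```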

1991

HTTP/0.9

The TCP-IP connection is broken by the server when the whole document has been transferred

1991

GET /fuzzy_bunnies.txt

HTTP/0.9

1991

The response to a simple GET request is a message in hypertext mark-up language ( HTML ). This is a byte stream of ASCII characters.
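The whole exchange fits in a few lines of Node.js. This is a rough sketch, not a reference client: `simpleRequest` and `fetchSimple` are hypothetical names, `host` and `port` are placeholders, and most modern servers no longer answer HTTP/0.9 requests.

```javascript
const net = require('net')

// An HTTP/0.9 request is a single line: the word GET, a space,
// the document address, then CR LF. No version, no headers.
function simpleRequest (path) {
  return `GET ${path}\r\n`
}

// Send the request and collect bytes until the server closes the
// connection, because closing is the only "end of document" signal.
function fetchSimple (host, port, path, callback) {
  const chunks = []
  const socket = net.connect(port, host, () => {
    socket.write(simpleRequest(path))
  })
  socket.on('data', chunk => chunks.push(chunk))
  socket.on('end', () => {
    callback(null, Buffer.concat(chunks).toString('ascii'))
  })
  socket.on('error', callback)
}
```

Usage would look like `fetchSimple('localhost', 8080, '/fuzzy_bunnies.txt', (err, html) => { ... })` against a server that still speaks the simple protocol.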

HTTP/0.9

1991

No error response codes

HTTP/0.9

1991

Only GET

No cookies / sessions / server state

HTTP/0.9

Y - you

A - aren't

G - gonna

N - need

I - it

What Problem Does HTTP/0.9 Solve?

<html>
  <p>More info
    <a href="http://foo.com/bar.html">here</a>
  </p>
</html>

I want to read this HTML document

GET /bar.html

<html>
  ...
</html>

HTTP relies on TCP to ask for and receive a document

TCP

IP

TCP

IP

how?

where?

HTTP

HTTP performance is tied to TCP properties

1. Guaranteed data delivery

2. Guaranteed packet order

1991

Future HTTP protocols will be back-compatible with this protocol.

HTTP/0.9

1991 was

a few years ago ...

World Wide Web went from zero to "everywhere" in about two years

Then:

Year   # of web sites
'93    130
'94    2,700
'95    23,500
'96    100,000

<html>
  <p>See diagram
    <img src="http://foo.com/bar.jpg" />
  </p>
</html>

I want to see this JPEG image

Non-academic users:

HTTP/1.0

We Are Gonna Need It (WAGNI)

HTTP/1.0

1996

RFC 1945

60 pages

GET /mypage.html HTTP/1.0
User-Agent: NCSA_Mosaic/2.0 (Windows 3.1)

HTTP/1.0 200 OK
Date: Tue, 15 Nov 1994 08:12:31 GMT
Server: CERN/3.0 libwww/2.17
Content-Type: text/html

<HTML>
A page with an image
  <IMG src="/myimage.gif">
</HTML>
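To see what the new status line and headers buy the client, here is a small parsing sketch. It assumes a response whose status line follows RFC 1945 (`HTTP/1.0 200 OK`) and whose headers are separated from the body by an empty line. `parseResponse` is a hypothetical helper, not part of any library, and it ignores edge cases such as folded headers.

```javascript
// Split a raw HTTP/1.0 response into status line, headers and body.
function parseResponse (raw) {
  const separator = raw.indexOf('\r\n\r\n')
  const head = separator === -1 ? raw : raw.slice(0, separator)
  const body = separator === -1 ? '' : raw.slice(separator + 4)
  const [statusLine, ...headerLines] = head.split('\r\n')
  const [version, code, ...reason] = statusLine.split(' ')
  const headers = {}
  for (const line of headerLines) {
    // split on the first colon only; values such as dates contain colons
    const colon = line.indexOf(':')
    headers[line.slice(0, colon).toLowerCase()] = line.slice(colon + 1).trim()
  }
  return { version, status: Number(code), reason: reason.join(' '), headers, body }
}

const sample = [
  'HTTP/1.0 200 OK',
  'Date: Tue, 15 Nov 1994 08:12:31 GMT',
  'Server: CERN/3.0 libwww/2.17',
  'Content-Type: text/html',
  '',
  '<HTML>A page with an image <IMG src="/myimage.gif"></HTML>'
].join('\r\n')

const parsed = parseResponse(sample)
console.log(parsed.status, parsed.headers['content-type']) // 200 text/html
```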

HTTP/1.0

  • Non-HTML data (images!)

  • Methods (HEAD & POST)

  • Status Codes

  • User preferences (User-Agent)

  • Format negotiation (Content-Type)

1996

RFC 1945

HTTP/1.1

Formalize best practices

HTTP/1.1

1999

RFC 2616

27 drafts, latest 2014

The Hypertext Transfer Protocol (HTTP) is a stateless application-level protocol for distributed, collaborative, hypertext information systems.

HTTP/1.1

  • Performance (Caching, connections)

  • Security (HTTPS, 3rd party cookies)

  • Usability (IP scarcity, error codes)

1999

RFC 2616

HTTP/RFC

IETF

The Internet Engineering Task Force (IETF)

make the Internet work better from an engineering point of view

The IETF's official products are documents called RFCs (Requests for Comments)

GET /bar.html

<html>
  title: bar
</html>

*** **

bar.html

** ***

index.html

title: bar

*** **

bar.html

Give me the page ...

HTTP/1.1 is still /0.9

request

response

* I am really ignoring all server-to-server communication happening over HTTP


HTTP/1.1 is uniform and everywhere

Reality: multiple versions, implementations, extensions, working drafts

"The Tangled Web: A Guide to Securing Modern Web Applications"

by Michal Zalewski

ISBN-13: 978-1593273880

👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍

What if we send an HTTP request to an SMTP server?

GET /<html><body><h1>Hi! HTTP/1.1 
Host: example.com:25
220 example.com ESMTP
500 5.5.1 Invalid command: "GET /<html><body><h1>Hi! HTTP/1.1" 
500 5.1.1 Invalid command: "Host: example.com:25"
...
421 4.4.1 Timeout

If the browser blindly treats the returned text as HTML, this becomes a reflected XSS attack
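Browsers mitigate this particular trick by refusing to connect to ports such as 25. The general rule behind it still applies to any server that echoes input back: escape reflected text before it can be treated as HTML. A minimal sketch, with `escapeHtml` as a hypothetical helper:

```javascript
// Replace the characters that let text break out into HTML markup.
// The ampersand must be handled first so later entities are not re-escaped.
function escapeHtml (text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;')
}

// The line the SMTP server echoed back in the transcript above
const echoed = '500 5.5.1 Invalid command: "GET /<html><body><h1>Hi! HTTP/1.1"'
const safe = escapeHtml(echoed)
console.log(safe)
```

After escaping, the echoed `<h1>` shows up as literal text instead of being parsed as an element.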

Are we talking about the same thing?

When I open a browser ...

HTTP Request and Response

State (cookies)

Cache control

Security, security, security

tracking pixels / cookies

58 requests to load

this page!

Related: Website Obesity Problem

Average page size: 2MB !

images, fonts, styles, scripts, ads

AJAX:

From websites

to web apps

i.e. "avoiding page reloads"

GET /index.html

*** **

** ***

<iframe src="bar.html">

index.html

GET /bar.html

1996: iframe in Internet Explorer

XMLHttpRequest

Lets a client-side script get data from the server

1999 Internet Explorer plugin

2004 Google Gmail, Kayak.com

2006 first spec draft

2014 latest spec draft

HTTP/1.1: Today

IP

TCP

HTTP

REST

GraphQL

2000

2012

Client - Server communications

<div> ... </div>

<script>

....

</script>

index.html

<div> ... </div>

<script>

....

</script>

index.html

HTTP/1 => HTTP/2  or QUIC

WebSockets

WebRTC

ServiceWorker

browser

browser

HTTP is dead

Long live HTTPS

The browser APIs worth discussing all require secure connections

http://localhost or https://...

Usually a single command!

... 250k new certificates per day ...

Browser

Server

HTTPS

connections are expensive because

TCP + TLS handshakes!
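A back-of-the-envelope sketch of that cost. It assumes the usual figures: the TCP handshake takes one round trip, a full TLS 1.2 handshake takes two more, and TLS 1.3 needs one. The 100 ms round-trip time is an arbitrary example value and `setupDelayMs` is a made-up name.

```javascript
// Time spent on handshakes before the first byte of the HTTP request leaves.
function setupDelayMs (rttMs, tlsRoundTrips) {
  const tcpRoundTrips = 1 // SYN, SYN-ACK, then the client may send
  return (tcpRoundTrips + tlsRoundTrips) * rttMs
}

console.log(setupDelayMs(100, 2)) // 300 (TCP + full TLS 1.2 handshake)
console.log(setupDelayMs(100, 1)) // 200 (TCP + TLS 1.3 handshake)
```

Every new connection pays this before any data moves, which is why reusing one connection for many requests matters so much.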

data

ServiceWorker

Blurring the line between client and server

If a tree falls while you are in the forest ...

<html manifest="example.appcache">
  ...
</html>

Application Cache

CACHE MANIFEST
# v1 2011-08-14
index.html
style.css
image1.png
# Use from network if available
NETWORK:
network.html
# Fallback content
FALLBACK:
/ fallback.html

declarative list

Application Cache

Turns out declaring a caching strategy is hard.

ServiceWorker

Server

browser

Web Workers

ServiceWorker

Transforms

the response

Transforms

the request

Smart caching

OFFLINE SUPPORT

Image / video transcoding

Background data sync

Load ServiceWorker

navigator.serviceWorker.register(
    'app/my-service-worker.js')

Chrome, Opera, Firefox

Must be https

Inside ServiceWorker

self.addEventListener('install', ...)
self.addEventListener('activate', ...)
self.addEventListener('message', ...)
self.addEventListener('push', ...)
self.addEventListener('fetch', function (event) {
  console.log(event.request.url)
  event.respondWith(...)
})
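One way the `fetch` handler and the Cache API might fit together is a cache-first strategy. This is a sketch under assumptions, not the only way to do it: `cacheFirst` is a hypothetical helper, the cache name `demo-v1` is invented, and the dependencies are passed in so the logic can also be exercised outside a browser.

```javascript
// Answer from the cache when possible, otherwise go to the network
// and remember the response for next time.
async function cacheFirst (request, cache, fetchFn) {
  const cached = await cache.match(request)
  if (cached) return cached
  const response = await fetchFn(request)
  // a Response body can be read only once, so store a clone
  await cache.put(request, response.clone())
  return response
}

// Wire it up only when this file really runs as a ServiceWorker.
if (typeof ServiceWorkerGlobalScope !== 'undefined') {
  self.addEventListener('fetch', event => {
    event.respondWith(
      caches.open('demo-v1') // 'demo-v1' is an assumed cache name
        .then(cache => cacheFirst(event.request, cache, req => fetch(req)))
    )
  })
}
```

A production worker would usually also restrict this to GET requests and decide how stale entries get refreshed.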
// Cache API

ServiceWorker (and more)

  • WebWorkers

  • ServiceWorker

What if an attacker can load a malicious ServiceWorker script?

A malicious ServiceWorker injected via XSS can be really hard to get rid of

Please protect yourself from XSS

Web Today

Ajax

ServiceWorker

HTTP/2

QUIC

HTTP/2

Making the web faster


HTTP/1.1

  • How to quickly load 200 resources?
  • How to load some resources first?
  • How to avoid duplicate data overhead?

Source: Ilya Grigorik https://bit.ly/http2-opt

2009 - Google starts SPDY

Need for Speed 🚤

2015 - RFC 7540

binary, multiplexed protocol - 55% speed up on top sites!

H2: Binary Framing Layer
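To make "binary framing" concrete: every HTTP/2 frame starts with a fixed 9-byte header (RFC 7540, section 4.1) holding a 24-bit payload length, an 8-bit type, 8 bits of flags, one reserved bit and a 31-bit stream identifier. The sketch below only encodes and decodes that header; the function names are invented for illustration.

```javascript
// Frame type and flag values as defined in RFC 7540
const TYPE_HEADERS = 0x1
const FLAG_END_HEADERS = 0x4

function encodeFrameHeader ({ length, type, flags, streamId }) {
  const header = Buffer.alloc(9)
  header.writeUIntBE(length, 0, 3)                // 24-bit payload length
  header.writeUInt8(type, 3)                      // frame type
  header.writeUInt8(flags, 4)                     // flags
  header.writeUInt32BE(streamId & 0x7fffffff, 5)  // reserved bit cleared
  return header
}

function decodeFrameHeader (header) {
  return {
    length: header.readUIntBE(0, 3),
    type: header.readUInt8(3),
    flags: header.readUInt8(4),
    streamId: header.readUInt32BE(5) & 0x7fffffff
  }
}

const bytes = encodeFrameHeader({
  length: 16, type: TYPE_HEADERS, flags: FLAG_END_HEADERS, streamId: 1
})
console.log(bytes.toString('hex')) // 000010010400000001
```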

H2: Multiplexing
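A toy model of multiplexing, assuming simple round-robin scheduling (real implementations weigh streams by priority). Several responses are cut into frames tagged with a stream id, interleaved on one connection, and reassembled on the other side. `multiplex` and `demultiplex` are illustrative names, and `chunkSize` must be a positive number.

```javascript
// streams: object mapping stream id -> full payload string
function multiplex (streams, chunkSize) {
  const offsets = {}
  const frames = []
  let pending = true
  while (pending) {
    pending = false
    for (const [id, payload] of Object.entries(streams)) {
      const start = offsets[id] || 0
      if (start >= payload.length) continue // this stream is finished
      frames.push({
        streamId: Number(id),
        data: payload.slice(start, start + chunkSize)
      })
      offsets[id] = start + chunkSize
      pending = true
    }
  }
  return frames
}

// The receiver puts each stream back together using the stream id.
function demultiplex (frames) {
  const streams = {}
  for (const frame of frames) {
    streams[frame.streamId] = (streams[frame.streamId] || '') + frame.data
  }
  return streams
}

const frames = multiplex({ 1: 'aaaa', 3: 'bb' }, 2)
console.log(frames.map(f => f.streamId)) // [ 1, 3, 1 ]
```

The short response on stream 3 finishes without waiting for the longer one on stream 1, all over a single TCP connection.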

H2: Stream Prioritization

3:1

then

then

then

3:1

$ curl -I https://github.com
HTTP/1.1 200 OK
Server: GitHub.com
Date: Sat, 03 Jun 2017 03:08:50 GMT
Content-Type: text/html; charset=utf-8
Status: 200 OK
Cache-Control: no-cache
Vary: X-PJAX
X-UA-Compatible: IE=Edge,chrome=1
Set-Cookie: logged_in=no; domain=.github.com; path=/; 
expires=Wed, 03 Jun 2037 03:08:50 -0000; secure; HttpOnly
Set-Cookie: _gh_sess=eyJzZXNzaW9uX2lkIjoiYzFlYTc0YTQ0OTNhZTdlNzI4MTgwNzI5N2QyNTlkNWMiLCJfY
3NyZl90b2tlbiI6IngzaHhRREMxNEpLUFNid1ZmMkc4d0N1OG1xdjZ3MkdmWmh4YkRFazNYQkU9In0%3D--b7171bd148da0b79b248fb561b8bfd4aadf16ff5; path=/; secure; HttpOnly
X-Request-Id: 75c9420698c53b1a707b10b2a7a510cc
X-Runtime: 0.056786
Content-Security-Policy: default-src 'none'; base-uri 'self'; block-all-mixed-content; 
child-src render.githubusercontent.com; connect-src 'self' uploads.github.com 
status.github.com collector.githubapp.com api.github.com www.google-analytics.com 
github-cloud.s3.amazonaws.com github-production-repository-file-5c1aeb.s3.amazonaws.com 
github-production-user-asset-6210df.s3.amazonaws.com wss://live.github.com; font-src 
assets-cdn.github.com; form-action 'self' github.com gist.github.com; frame-ancestors 
'none'; img-src 'self' data: assets-cdn.github.com identicons.github.com 
collector.githubapp.com github-cloud.s3.amazonaws.com *.githubusercontent.com; 
media-src 'none'; script-src assets-cdn.github.com; style-src 'unsafe-inline' 
assets-cdn.github.com
Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
Public-Key-Pins: max-age=5184000; pin-sha256="WoiWRyIOVNa9ihaBciRSC7XHjliYS9VwUGOIud4PB18="; pin-sha256="RRM1dGqnDFsCJXBTHky16vi1obOlCgFFn/yOhI/y+ho="; pin-sha256="k2v657xBsOVe1PQRwOsHsw3bsGT2VzIqz5K+59sNQws="; pin-sha256="K87oWBWM9UZfyddvDfoxL+8lpNyoUB2ptGtn0fv6G2Q="; pin-sha256="IQBnNBEiFuhj+8x6X8XLgh01V9Ic5/V3IRQLNFFc7v4="; pin-sha256="iie1VXtL7HzAMF+/PVPR9xzT80kQxdZeJ+zduCB3uj0="; pin-sha256="LvRiGEjRqfzurezaWuj8Wie2gyHMrW5Q06LspMnox7A="; includeSubDomains
X-Content-Type-Options: nosniff
X-Frame-Options: deny
X-XSS-Protection: 1; mode=block
X-Runtime-rack: 0.060928
Vary: Accept-Encoding
X-Served-By: e878d09eac725c89f5f15204c1326660
X-GitHub-Request-Id: F1E1:2F2B:6F2906B:A48C07A:59322842

2100 characters

~55% of HTTP/0.9 spec!

H2: Header Compression

pseudo headers

H2: Header Compression

Common headers static table on client and server
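A sketch of why the static table saves bytes. Both sides know the same table, so a common header such as `:method: GET` can be sent as a single byte: the high bit set plus the table index. Only the first eight of the 61 entries from RFC 7541 are listed here, and `encodeIndexed` is a hypothetical helper that handles just this fully-indexed case.

```javascript
// First entries of the HPACK static table (RFC 7541, Appendix A).
// Indexes start at 1, so slot 0 is unused.
const STATIC_TABLE = [
  null,
  [':authority', ''],
  [':method', 'GET'],
  [':method', 'POST'],
  [':path', '/'],
  [':path', '/index.html'],
  [':scheme', 'http'],
  [':scheme', 'https'],
  [':status', '200']
]

// Returns a one-byte Buffer when name and value are both in the table,
// or null when a literal encoding (not shown here) would be needed.
function encodeIndexed (name, value) {
  const index = STATIC_TABLE.findIndex(
    entry => entry && entry[0] === name && entry[1] === value
  )
  return index === -1 ? null : Buffer.from([0x80 | index])
}

console.log(encodeIndexed(':method', 'GET'))  // <Buffer 82>
console.log(encodeIndexed('x-runtime', '0.05')) // null
```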

14% of all websites (Jun '17)

HTTP/2 

was 8% Jun '16

SPDY is at 8%

“One great metric around that which I enjoy is the fraction of connections created that carry just a single HTTP transaction (and thus make that transaction bear all the overhead). For HTTP/1 74% of our active connections carry just a single transaction – persistent connections just aren’t as helpful as we all want. But in HTTP/2 that number plummets to 25%. That’s a huge win for overhead reduction.” — Patrick McManus, Mozilla.

H2 is succeeding in minimizing delays

HTTP/2 Information

HTTP/2 Server Push

I know what you want

<html>
<img src="photo.jpg"/>
</html>
...
<img src="photo.jpg"/>
...

GET index.html

Server

GET photo.jpg

Client

?

photo.jpg

index.html

time

Server Push

<html>
<img src="photo.jpg"/>
</html>

Server

GET index.html

Client

photo.jpg

index.html

...
<img src="photo.jpg"/>
...

GET photo.jpg

Already have it!

time

Server Push

<html>
<link rel="stylesheet" 
  href="app.css">
<script src="app.js">
</script>
</html>

Server

GET index.html

Client

app.css

index.html

app.js

Great way to replace HTTP/1.1 practice of inlining small styles, scripts, images in the page

time

const fs = require('fs')
const server = require('spdy')
const express = require('express')
const app = express()
// serve static files from the "public" folder
app.use(express.static('public'))

// browsers only speak HTTP/2 over TLS, so load a key and a certificate
const tlsOptions = {
  key: fs.readFileSync('./server.key'),
  cert: fs.readFileSync('./server.crt')
}
const port = 5000
server.createServer(tlsOptions, app)
  .listen(port, err => { ... })

Server Push in NodeJS

const image = fs.readFileSync('./image.jpg')
const options = {
  request: {accept: 'image/*'},
  response: {'content-type': 'image/jpeg'}
}
function serveHome (req, res) {
  // res.push is only defined when the connection supports Server Push
  if (res.push) {
    // res.push(path, options) returns a stream for the pushed resource
    const stream = res.push('/image.jpg', options)
    stream.end(image)
  }
}
app.get('/', serveHome)

Server Push in NodeJS

Push resource if client/server/proxy supports it

No NGINX support as of March '17, use https://github.com/indutny/bud


Server Push always pushes the resources, even if they are already in the browser's cache.

Server Push Behavior

"Being Pushy" by Yoav Weiss

"HTTP/2 push is tougher than I thought" by Jake Archibald

"cache-digest"

<html>
<link rel="stylesheet" 
  href="app.css">
<script src="app.js">
</script>
</html>

Server

GET index.html

Client

index.html

I have "app.css" <hash>,

"app.js" <hash>

Not going to push app.css and app.js

time
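The idea in the diagram can be sketched in a few lines. The real cache-digest proposal encodes the digest far more compactly; here a plain `Set` of SHA-256 hashes stands in for it, and `fingerprint` and `resourcesToPush` are hypothetical names.

```javascript
const crypto = require('crypto')

// Hash of URL plus content, so a changed file gets a new fingerprint.
function fingerprint (url, content) {
  return crypto.createHash('sha256')
    .update(url)
    .update('\0')
    .update(content)
    .digest('hex')
}

// Push only what the client did not list in its digest.
function resourcesToPush (candidates, clientDigest) {
  return candidates
    .filter(r => !clientDigest.has(fingerprint(r.url, r.content)))
    .map(r => r.url)
}

const candidates = [
  { url: '/app.css', content: 'body { margin: 0 }' },
  { url: '/app.js', content: 'console.log("hi")' }
]
// Simulated client that already holds the current app.css
const clientDigest = new Set([fingerprint('/app.css', 'body { margin: 0 }')])

console.log(resourcesToPush(candidates, clientDigest)) // [ '/app.js' ]
```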

My opinion:

Wait for Server Push to be refined

HTTP/2 is changing the best performance practices

H2 changes some performance best practices

Source: Ilya Grigorik https://bit.ly/http2-opt

DO LESS WORK

H2: 4 main features 

Multiplexing

Multiple streams over same TCP connection

Compression

Headers and binary frames

H2: 4 main features 

Flow control

Dependency mechanism among resources

Resource push

Server preemptively pushes resources to the client

Web Today

Ajax

ServiceWorker

HTTP/2

QUIC

HTTP/2 Achilles' heel

tip: how to fill the feedback form

H2: If a packet is lost

TCP: all have to wait!
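A toy simulation of that stall, assuming numbered packets and a receiver that may only hand data to the application in order, as TCP does. `deliverInOrder` is an invented name for illustration.

```javascript
// arrived: sequence numbers that reached the receiver (any order)
// nextExpected: the next sequence number the application is waiting for
function deliverInOrder (arrived, nextExpected) {
  const waiting = new Set(arrived)
  const delivered = []
  while (waiting.has(nextExpected)) {
    delivered.push(nextExpected)
    waiting.delete(nextExpected)
    nextExpected += 1
  }
  // everything left is buffered until the gap is retransmitted
  return { delivered, stalled: [...waiting].sort((a, b) => a - b) }
}

// Packet 3 was lost: 4, 5 and 6 arrived but cannot be handed over yet,
// even if they belong to completely different HTTP/2 streams.
const result = deliverInOrder([1, 2, 4, 5, 6], 1)
console.log(result.delivered, result.stalled)
```

With six HTTP/1.1 connections a lost packet stalls one of them; with a single multiplexed HTTP/2 connection it stalls every stream at once.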

Hooman Beheshti - HTTP/2: What no one is telling you

PLR - Packet Loss Rate

IP

TCP

HTTP/1.1

WebSockets

HTTP/2

TLS

Order of packets guarantee

TCP

HTTP/1.1

WebSockets

HTTP/2

TLS

UDP

IP

QUIC

QUIC

like SPDY but over UDP

QUIC works hard to minimize latency, which other protocols (like SCTP over DTLS over TCP/UDP) were not designed to do

'99

2009

2015

HTTP/1.1

SPDY

HTTP/2

QUIC

Secret: many Google properties communicate with Chrome using QUIC

HTTP/0.9

'91

HTTP/1 - SPDY - HTTP/2 - QUIC timeline

2013

(deprecated)

Enable QUIC: chrome://flags/#enable-quic

chrome://net-internals/#quic

Visit Google property

HTTP/3 Explained

Daniel Stenberg

Nov 2018

HTTP over QUIC becomes HTTP/3

Spec ready by mid-2019?

No plans from Apache or Nginx

Honorable mention: WebRTC data channel

Server

Content is coming from the server

As more users connect ...

Server

We need a bigger server 💸

Server

We need a bigger server 💸

🔥🔥🔥

HBO's Silicon Valley "Two Days of the Condor", season 2 finale

P2P using data channel in WebRTC

Server

offline != server is down

Check out:

A peer-to-peer hypermedia protocol
to make the web faster, safer, and more open.

Real-Time Communications capabilities via simple APIs

Dat is the distributed data sharing tool

a streaming torrent client for the web browser and the desktop

HTTP: Conclusions

Every version of HTTP

and related protocols was a solution to a real-world problem

HTTP: Conclusions

Most of the time the problem was not apparent until real-world adoption

HTTP/0.9 - loading remote HTML documents

HTTP/1.x - popularity and growth of World Wide Web

Ajax - loading data without full page reload

WebSockets - realtime communication with the server

ServiceWorker - scriptable caching

HTTP/2 - performance for loading modern websites

QUIC & HTTP/3 - an alternative take on performance for loading modern websites

Only failed products stay unchanged

Tomorrow a new protocol will appear to solve problems that HTTP/2 and HTTP/3 will exhibit

It's not your parents' HTTP

ConFoo.CA

Gleb Bahmutov @bahmutov

Happy Birthday, WWW

It's not your parents' HTTP - Confoo.CA

By Gleb Bahmutov

The Internet of simple textual requests and responses is done. Finished. Obsolete. The modern web is a web of binary, persistent connections like WebSockets, WebRTC, HTTP/2 and QUIC. Today's Internet is a strange place where things are received before they are requested (Server Push) and a web application works without the Web (offline support with Service Worker). This presentation is going to be your map to this new terrain.
