Gleb Bahmutov
JavaScript ninja, image processing expert, software quality fanatic
C / C++ / C# / Java / CoffeeScript / JavaScript / Node / Angular / Vue / Cycle.js / functional
EveryScape
virtual tours
MathWorks
MATLAB on the web
Kensho
finance dashboards
Cypress in action https://www.cypress.io/
The word engineer (Latin ingeniator) is derived from the Latin words ingeniare ("to contrive, devise") and ingenium ("cleverness")
1989
Edit and view HTML documents, 1991
Nexus Browser, 1994
first website: info.cern.ch
Sir Tim Berners-Lee
Professor of Engineering in the School of Engineering with a joint appointment in the Department of Electrical Engineering and Computer Science at MIT
650 words
3800 characters
If the port number is not specified, 80 is always assumed for HTTP
because port 79 and 81 were taken (RFC 1060)
The TCP-IP connection is broken by the server when the whole document has been transferred
GET /fuzzy_bunnies.txt
The response to a simple GET request is a message in hypertext mark-up language ( HTML ). This is a byte stream of ASCII characters.
No error response codes
Only GET
No cookies / sessions / server state
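The rules above fit in a few lines of code. This is only a sketch of the client side: the helper names `parseTarget` and `buildSimpleRequest` are made up for illustration and are not part of any spec or library.

```javascript
// Hypothetical helpers illustrating the HTTP/0.9 rules quoted above.

// Pull host, port and path out of a URL; the port defaults to 80.
function parseTarget (url) {
  const { hostname, port, pathname } = new URL(url)
  return {
    host: hostname,
    port: port === '' ? 80 : Number(port),
    path: pathname
  }
}

// An HTTP/0.9 request is a single line: the word GET, a space and the path.
// There is no version, there are no headers and there is no body.
function buildSimpleRequest (path) {
  return `GET ${path}\r\n`
}

console.log(parseTarget('http://foo.com/bar.html'))
console.log(JSON.stringify(buildSimpleRequest('/fuzzy_bunnies.txt')))
```

The server answers with the HTML bytes and closes the connection, so the client simply reads until the socket ends.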
Sept 14, 2017
<html>
<p>More info
<a href="http://foo.com/bar.html">here</a>
</p>
</html>
I want to read this HTML document
GET /bar.html
<html>
...
</html>
HTTP relies on TCP to ask for and receive documents
TCP
IP
TCP
IP
how?
where?
HTTP
HTTP performance is tied to TCP properties
Future HTTP protocols will be back-compatible with this protocol.
World Wide Web went from zero to "everywhere" in about two years
| Date | # of web sites |
|---|---|
| '93 | 130 |
| '94 | 2,700 |
| '95 | 23,500 |
| '96 | 100,000 |
<html>
<p>See diagram
<img src="http://foo.com/bar.jpg" />
</p>
</html>
I want to see this JPEG image
RFC 1945
60 pages
GET /mypage.html HTTP/1.0
User-Agent: NCSA_Mosaic/2.0 (Windows 3.1)
200 OK
Date: Tue, 15 Nov 1994 08:12:31 GMT
Server: CERN/3.0 libwww/2.17
Content-Type: text/html
<HTML>
A page with an image
<IMG src="/myimage.gif">
</HTML>
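HTTP/1.0 added a status line and headers in front of the document. The slide abbreviates the status line to "200 OK"; on the wire, RFC 1945 puts the protocol version first ("HTTP/1.0 200 OK"). Below is a rough sketch of splitting such a response into its parts. `parseResponse` is an illustrative name, and the parser ignores edge cases such as repeated or folded headers.

```javascript
// Sketch: split a raw HTTP/1.0 response into status, headers and body.
function parseResponse (raw) {
  // A blank line separates the head from the body
  const separator = raw.indexOf('\r\n\r\n')
  const head = separator === -1 ? raw : raw.slice(0, separator)
  const body = separator === -1 ? '' : raw.slice(separator + 4)

  const [statusLine, ...headerLines] = head.split('\r\n')
  // Status line looks like "HTTP/1.0 200 OK"
  const [version, code, ...reason] = statusLine.split(' ')

  const headers = {}
  for (const line of headerLines) {
    const colon = line.indexOf(':')
    headers[line.slice(0, colon).toLowerCase()] = line.slice(colon + 1).trim()
  }
  return { version, status: Number(code), reason: reason.join(' '), headers, body }
}

const raw = [
  'HTTP/1.0 200 OK',
  'Date: Tue, 15 Nov 1994 08:12:31 GMT',
  'Content-Type: text/html',
  '',
  '<HTML></HTML>'
].join('\r\n')
console.log(parseResponse(raw))
```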
RFC 1945
RFC 2616
27 drafts, latest 2014
The Hypertext Transfer Protocol (HTTP) is a stateless application-level protocol for distributed, collaborative, hypertext information systems.
RFC 2616
make the Internet work better from an engineering point of view
The IETF's official products are documents called RFCs (Requests for Comments)
"The Tangled Web: A Guide to Securing Modern Web Applications"
by Michal Zalewski
ISBN-13: 978-1593273880
👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍👍
What if we send HTTP request to SMTP?
GET /<html><body><h1>Hi! HTTP/1.1
Host: example.com:25
220 example.com ESMTP
500 5.5.1 Invalid command: "GET /<html><body><h1>Hi! HTTP/1.1"
500 5.1.1 Invalid command: "Host: example.com:25"
...
421 4.4.1 Timeout
If the browser blindly trusts the returned text to be HTML, this is a reflected XSS attack
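The SMTP server above echoes the attacker's text back verbatim. One common mitigation for any service that reflects input is to escape it first. This is a minimal sketch; `escapeHtml` is a hand-rolled helper for illustration, not a library API.

```javascript
// Replace the characters that let text break out into HTML markup.
function escapeHtml (text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' }
  return String(text).replace(/[&<>"']/g, ch => entities[ch])
}

// The echoed command is now inert text instead of markup
console.log(escapeHtml('GET /<html><body><h1>Hi! HTTP/1.1'))
```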
Are we talking about the same thing?
GET /bar.html
<html>
title: bar
</html>
*** **
bar.html
** ***
index.html
title: bar
*** **
bar.html
Give me the page ...
request
response
* I am really ignoring all server-to-server communication happening over HTTP
*
tracking pixels / cookies
58 requests to load
this page!
Average page size: 2MB !
images, fonts, styles, scripts, ads
i.e. "avoiding page reloads"
GET /index.html
*** **
** ***
<iframe src="bar.html">
index.html
GET /bar.html
Gets the data from the server by a client side script
1999 Internet Explorer plugin
2004 Google Gmail, Kayak.com
2006 first spec draft
2014 latest spec draft
<button id="ajaxButton" type="button">Request</button>
<script>
var httpRequest;
document.getElementById("ajaxButton")
.addEventListener('click', makeRequest);
function makeRequest() {
httpRequest = new XMLHttpRequest();
httpRequest.onreadystatechange = alertContents;
httpRequest.open('GET', 'test.html');
httpRequest.send();
}
function alertContents() {
if (httpRequest.readyState === XMLHttpRequest.DONE) {
if (httpRequest.status === 200) {
alert(httpRequest.responseText);
} else {
alert('There was a problem with the request.');
}
}
}
</script>
declarative HTML
imperative JavaScript
<button id="ajaxButton" type="button">Request</button>
<script>
document.getElementById("ajaxButton")
.addEventListener('click', makeRequest);
function makeRequest() {
fetch('<some url>')
.then(r => r.text())
.then(alert, console.error)
}
</script>
2000
2012
<div> ... </div>
<script>
....
</script>
index.html
<div> ... </div>
<script>
....
</script>
index.html
HTTP/1 => HTTP/2 or QUIC
WebSockets
WebRTC
ServiceWorker
browser
browser
http://localhost or https://...
Usually a single command!
... 250k new certificates per day ...
Browser
Server
HTTPS
connections are expensive because
TCP + TLS handshakes!
data
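A back-of-the-envelope way to see the cost: count the round trips spent before the first request byte can leave. The numbers below are assumptions for illustration (1 round trip for the TCP handshake, 2 for a full TLS 1.2 handshake); resumed sessions and newer TLS versions need fewer.

```javascript
// Assumed round trips per handshake (illustrative values, see note above)
const HANDSHAKE_ROUND_TRIPS = { tcp: 1, tls: 2 }

// Time spent on handshakes before any HTTP data is exchanged
function connectionSetupMs (rttMs, { secure = true } = {}) {
  const roundTrips = HANDSHAKE_ROUND_TRIPS.tcp +
    (secure ? HANDSHAKE_ROUND_TRIPS.tls : 0)
  return roundTrips * rttMs
}

console.log(connectionSetupMs(100)) // HTTPS over a 100 ms round trip
console.log(connectionSetupMs(100, { secure: false })) // plain HTTP
```

Every new connection pays this again, which is why reusing one connection for many requests matters.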
If a tree falls while you are in the forest ...
<html manifest="example.appcache">
...
</html>
CACHE MANIFEST
# v1 2011-08-14
index.html
style.css
image1.png
# Use from network if available
NETWORK:
network.html
# Fallback content
FALLBACK:
/ fallback.html
declarative list
Turns out declaring caching strategy is hard.
ServiceWorker
Server
browser
Web Workers
ServiceWorker
Transforms
the response
Transforms
the request
navigator.serviceWorker.register(
'app/my-service-worker.js')
Chrome, Opera, Firefox
Must be https
self.addEventListener('install', ...)
self.addEventListener('activate', ...)
self.addEventListener('message', ...)
self.addEventListener('push', ...)
self.addEventListener('fetch', function (event) {
console.log(event.request.url)
event.respondWith(...)
})
// Cache API
self.addEventListener('install', e => {
const urls = ['/', 'app.css', 'app.js']
e.waitUntil(
caches.open('my-app')
.then(cache =>
cache.addAll(urls))
)
})
Cache resources on SW install
self.addEventListener('fetch', e => {
e.respondWith(
caches.open('my-app')
.then(cache =>
cache.match(e.request))
)
})
Return cached resource
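Returning only what is in the cache leaves uncached requests without a response, so a common variation is "cache first, then network". Here is a sketch of that strategy written as a plain function so it can be tried outside a ServiceWorker. `cacheFirst` is an illustrative name; `cache` is anything with an async `match(request)` method (such as a Cache API cache) and `fetchFn` is the network function (such as `fetch`).

```javascript
// Cache-first lookup with a network fallback.
function cacheFirst (request, cache, fetchFn) {
  return Promise.resolve(cache.match(request))
    .then(cached => cached || fetchFn(request))
}

// Inside a ServiceWorker this could be wired up roughly as:
// self.addEventListener('fetch', e => {
//   e.respondWith(
//     caches.open('my-app')
//       .then(cache => cacheFirst(e.request, cache, fetch))
//   )
// })

// Trying it with stand-ins for the cache and the network
const fakeCache = {
  match: req => Promise.resolve(req === '/app.css' ? 'cached css' : undefined)
}
const fakeFetch = req => Promise.resolve('network ' + req)
cacheFirst('/app.css', fakeCache, fakeFetch).then(console.log)
cacheFirst('/other', fakeCache, fakeFetch).then(console.log)
```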
Offline support using ServiceWorker
browser
ServiceWorker
Request
Response
Server
Server
browser
ServiceWorker
Request
Response
express.js
http.ClientRequest
JavaScript
http.ServerResponse
JavaScript
browser
ServiceWorker
Server
express.js
Server
browser
ServiceWorker
express.js
http.ClientRequest(Request)
http.ServerResponse(Response)
Server inside ServiceWorker
Video: https://youtu.be/4axZ3D75Llg?t=31m30s
The library: github.com/bahmutov/express-service
Blog post with examples: Run Express server in your browser
// returned JavaScript file
// from same domain
myFunc()
navigator.serviceWorker.register(
'jsonp?callback=myFunc'
)
Example: unfiltered JSONP endpoint
dynamic text to code
navigator.serviceWorker.register will try to load this code as SW
// returned SW script
my malicious SW JS
Example: XSS + unfiltered JSONP endpoint
navigator.serviceWorker.register(
'jsonp?callback="my malicious SW JS"'
)
navigator.serviceWorker.register will try to load this code as SW :(
Malicious ServiceWorker injected via XSS can be really hard to get rid of
Please protect yourself from XSS
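For the JSONP case specifically, one defensive step is to accept only plain identifier names as the callback, so the endpoint cannot be used to turn arbitrary text into code. This is a sketch of a server-side check; the function names, the regular expression and the 64-character limit are assumptions for illustration, not a complete XSS defense.

```javascript
// Allow only dotted JavaScript identifiers such as "myFunc" or "app.onData"
const SAFE_CALLBACK = /^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*$/

function isSafeCallbackName (name) {
  return typeof name === 'string' &&
    name.length <= 64 &&
    SAFE_CALLBACK.test(name)
}

// Build the JSONP response body, refusing suspicious callback names
function jsonpBody (callback, data) {
  if (!isSafeCallbackName(callback)) {
    throw new Error('Invalid JSONP callback name')
  }
  return `${callback}(${JSON.stringify(data)})`
}

console.log(jsonpBody('myFunc', { a: 1 }))
console.log(isSafeCallbackName('"my malicious SW JS"'))
```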
Web Today
Source: Ilya Grigorik https://bit.ly/http2-opt
binary, multiplexed protocol - 55% speed up on top sites!
3:1
then
then
then
3:1
$ curl -I https://github.com
HTTP/1.1 200 OK
Server: GitHub.com
Date: Sat, 03 Jun 2017 03:08:50 GMT
Content-Type: text/html; charset=utf-8
Status: 200 OK
Cache-Control: no-cache
Vary: X-PJAX
X-UA-Compatible: IE=Edge,chrome=1
Set-Cookie: logged_in=no; domain=.github.com; path=/;
expires=Wed, 03 Jun 2037 03:08:50 -0000; secure; HttpOnly
Set-Cookie: _gh_sess=eyJzZXNzaW9uX2lkIjoiYzFlYTc0YTQ0OTNhZTdlNzI4MTgwNzI5N2QyNTlkNWMiLCJfY
3NyZl90b2tlbiI6IngzaHhRREMxNEpLUFNid1ZmMkc4d0N1OG1xdjZ3MkdmWmh4YkRFazNYQkU9In0%3D--b7171bd148da0b79b248fb561b8bfd4aadf16ff5; path=/; secure; HttpOnly
X-Request-Id: 75c9420698c53b1a707b10b2a7a510cc
X-Runtime: 0.056786
Content-Security-Policy: default-src 'none'; base-uri 'self'; block-all-mixed-content;
child-src render.githubusercontent.com; connect-src 'self' uploads.github.com
status.github.com collector.githubapp.com api.github.com www.google-analytics.com
github-cloud.s3.amazonaws.com github-production-repository-file-5c1aeb.s3.amazonaws.com
github-production-user-asset-6210df.s3.amazonaws.com wss://live.github.com; font-src
assets-cdn.github.com; form-action 'self' github.com gist.github.com; frame-ancestors
'none'; img-src 'self' data: assets-cdn.github.com identicons.github.com
collector.githubapp.com github-cloud.s3.amazonaws.com *.githubusercontent.com;
media-src 'none'; script-src assets-cdn.github.com; style-src 'unsafe-inline'
assets-cdn.github.com
Strict-Transport-Security: max-age=31536000; includeSubdomains; preload
Public-Key-Pins: max-age=5184000; pin-sha256="WoiWRyIOVNa9ihaBciRSC7XHjliYS9VwUGOIud4PB18="; pin-sha256="RRM1dGqnDFsCJXBTHky16vi1obOlCgFFn/yOhI/y+ho="; pin-sha256="k2v657xBsOVe1PQRwOsHsw3bsGT2VzIqz5K+59sNQws="; pin-sha256="K87oWBWM9UZfyddvDfoxL+8lpNyoUB2ptGtn0fv6G2Q="; pin-sha256="IQBnNBEiFuhj+8x6X8XLgh01V9Ic5/V3IRQLNFFc7v4="; pin-sha256="iie1VXtL7HzAMF+/PVPR9xzT80kQxdZeJ+zduCB3uj0="; pin-sha256="LvRiGEjRqfzurezaWuj8Wie2gyHMrW5Q06LspMnox7A="; includeSubDomains
X-Content-Type-Options: nosniff
X-Frame-Options: deny
X-XSS-Protection: 1; mode=block
X-Runtime-rack: 0.060928
Vary: Accept-Encoding
X-Served-By: e878d09eac725c89f5f15204c1326660
X-GitHub-Request-Id: F1E1:2F2B:6F2906B:A48C07A:59322842
2100 characters
~55% of HTTP/0.9 spec!
pseudo headers
Common headers static table on client and server
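To make the static table idea concrete: both sides know the same numbered list of common headers, so a header that matches an entry exactly can be sent as a single byte. The sketch below copies only the first eight entries of the table from RFC 7541 (Appendix A). `encodeIndexed` is an illustrative helper, and it skips the dynamic table and Huffman coding that the full HPACK format also uses.

```javascript
// First entries of the HPACK static table (RFC 7541, Appendix A)
const STATIC_TABLE = [
  null, // index 0 is not used
  [':authority', ''],
  [':method', 'GET'],
  [':method', 'POST'],
  [':path', '/'],
  [':path', '/index.html'],
  [':scheme', 'http'],
  [':scheme', 'https'],
  [':status', '200']
]

// An exact match is sent as one byte: a leading 1 bit plus the 7-bit index.
// Returns null when the header is not in this (partial) table.
function encodeIndexed (name, value) {
  const index = STATIC_TABLE.findIndex(
    entry => entry && entry[0] === name && entry[1] === value)
  return index === -1 ? null : Buffer.from([0x80 | index])
}

console.log(encodeIndexed(':method', 'GET'))
console.log(encodeIndexed(':scheme', 'https'))
```

Compare one byte per header with the 2100 characters of text headers in the GitHub response above.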
was 8% Jun '16
SPDY is at 8%
“One great metric around that which I enjoy is the fraction of connections created that carry just a single HTTP transaction (and thus make that transaction bear all the overhead). For HTTP/1 74% of our active connections carry just a single transaction – persistent connections just aren’t as helpful as we all want. But in HTTP/2 that number plummets to 25%. That’s a huge win for overhead reduction.” — Patrick McManus, Mozilla.
H2 is succeeding in minimizing delays
chrome://net-internals https://www.youtube.com/watch?v=bG2GhHkPP4Q
<html>
<img src="photo.jpg"/>
</html>
...
<img src="photo.jpg"/>
...
GET index.html
Server
GET photo.jpg
Client
?
photo.jpg
index.html
time
<html>
<img src="photo.jpg"/>
</html>
Server
GET index.html
Client
photo.jpg
index.html
...
<img src="photo.jpg"/>
...
GET photo.jpg
Already have it!
time
<html>
<link rel="stylesheet"
href="app.css">
<script src="app.js">
</script>
</html>
Server
GET index.html
Client
app.css
index.html
app.js
Great way to replace HTTP/1.1 practice of inlining small styles, scripts, images in the page
time
const fs = require('fs')
const server = require('spdy')
const express = require('express')
const app = express()
app.use(express.static('public'))
const tlsOptions = {
key: fs.readFileSync('./server.key'),
cert: fs.readFileSync('./server.crt')
}
const port = 5000
server.createServer(tlsOptions, app)
.listen(port, err => { ... })
Make server using spdy
const image = fs.readFileSync('./image.jpg')
const options = {
request: {accept: 'image/*'},
response: {'content-type': 'image/jpeg'}
}
function serveHome (req, res) {
if (res.push) {
const stream = res.push('/image.jpg', options)
stream.end(image)
}
}
app.get('/', serveHome)
Push resource if client/server/proxy supports it
No NGINX support as of March '17, use https://github.com/indutny/bud
*
Server Push always pushes the resources, even if they are already in the browser's cache.
"Being Pushy" by Yoav Weiss
"HTTP/2 push is tougher than I thought" by Jake Archibald
<html>
<link rel="stylesheet"
href="app.css">
<script src="app.js">
</script>
</html>
Server
GET index.html
Client
index.html
I have "app.css" <hash>,
"app.js" <hash>
Not going to push app.css and app.js
time
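The exchange in this diagram can be sketched as a server-side decision: compare what the client says it has against what the server would push. Everything here is an assumption for illustration: the digest is a plain `{ path: hash }` object and the hash is SHA-256, which is simpler than any real cache digest proposal.

```javascript
const { createHash } = require('crypto')

function contentHash (body) {
  return createHash('sha256').update(body).digest('hex')
}

// candidates: [{ path, body }] the server would like to push
// clientDigest: { path: hash } reported by the client (assumed format)
function resourcesToPush (candidates, clientDigest) {
  return candidates
    .filter(({ path, body }) => clientDigest[path] !== contentHash(body))
    .map(({ path }) => path)
}

const candidates = [
  { path: '/app.css', body: 'body{}' },
  { path: '/app.js', body: 'go()' }
]
// Client already has the current app.css but a stale app.js
const digest = { '/app.css': contentHash('body{}'), '/app.js': 'stale' }
console.log(resourcesToPush(candidates, digest))
```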
H2 changes some performance best practices
Source: Ilya Grigorik https://bit.ly/http2-opt
Multiple streams over same TCP connection
Headers and binary frames
Dependency mechanism among resources
Server preemptively pushes resources to the client
multiplexing
compression
flow control
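Multiplexing works because every binary frame carries the id of the stream it belongs to, so frames from different streams can be interleaved on one connection. Below is a sketch of the fixed 9-byte frame header described in RFC 7540 (section 4.1). The helper names are illustrative, and only two of the frame types are listed.

```javascript
// Two of the frame types defined by RFC 7540
const FRAME_TYPES = { DATA: 0x0, HEADERS: 0x1 }

// 24-bit payload length, 8-bit type, 8-bit flags,
// then 1 reserved bit followed by the 31-bit stream id
function encodeFrameHeader ({ length, type, flags, streamId }) {
  const header = Buffer.alloc(9)
  header.writeUIntBE(length, 0, 3)
  header.writeUInt8(type, 3)
  header.writeUInt8(flags, 4)
  header.writeUInt32BE(streamId & 0x7fffffff, 5)
  return header
}

function decodeFrameHeader (header) {
  return {
    length: header.readUIntBE(0, 3),
    type: header.readUInt8(3),
    flags: header.readUInt8(4),
    streamId: header.readUInt32BE(5) & 0x7fffffff
  }
}

// A 5-byte DATA frame on stream 3 with the END_STREAM flag (0x1) set
const header = encodeFrameHeader({
  length: 5, type: FRAME_TYPES.DATA, flags: 0x1, streamId: 3
})
console.log(header.toString('hex'))
console.log(decodeFrameHeader(header))
```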
IIS10 - asking for the same stream twice crashes the server
Slow Read: 1 TCP connection, asking for lots of resources with small window forces the server thread pool to grow (all servers)
dependency cycle DoS (Apache, nghttpd)
Set max user agent, then send requests - the memory blows up
Web Today
TCP: all have to wait!
video: https://vimeo.com/190932569
PLR - Packet Loss Rate
Order of packets guarantee
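Why everyone has to wait: TCP hands data to the application strictly in order, so one lost packet holds back everything behind it, including packets that belong to other HTTP/2 streams. This toy model shows the effect. `deliverInOrder` is an illustrative function and whole-number sequence numbers stand in for TCP's byte offsets.

```javascript
// Given the sequence numbers that have arrived, return those that can be
// handed to the application in order, starting from `next`.
function deliverInOrder (arrived, next = 1) {
  const received = new Set(arrived)
  const delivered = []
  while (received.has(next)) {
    delivered.push(next)
    next += 1
  }
  return delivered
}

// Packet 2 was lost: packets 3, 4 and 5 are stuck behind it
console.log(deliverInOrder([1, 3, 4, 5]))
// After the retransmission of packet 2 arrives, everything is released
console.log(deliverInOrder([1, 3, 4, 5, 2]))
```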
'99
2009
2015
HTTP/1.1
SPDY
HTTP/2
QUIC
Secret: many Google properties communicate with Chrome using QUIC
HTTP/0.9
'91
'92
HTTP/1 - SPDY - HTTP/2 - QUIC timeline
2013
(deprecated)
Enable QUIC: chrome://flags/#enable-quic
chrome://net-internals/#quic
Visit Google property
"How Secure and Quick is QUIC? Provable Security and Performance Analyses" 2015 IEEE Symposium on Security and Privacy,
S&P 2015 https://eprint.iacr.org/2015/582.pdf
We find that attacking QUIC is not easier than TCP and TLS.
The attacks are mostly DoS that prevent 0-RTT connections
Server
Content is coming from the server
As more users connect ...
Server
We need a bigger server 💸
Server
We need a bigger server 💸
🔥🔥🔥
HBO's Silicon Valley "Two Days of the Condor", season 2 finale
P2P using data channel in WebRTC
Server
offline != server is down
A peer-to-peer hypermedia protocol
to make the web faster, safer, and more open.
Real-Time Communications capabilities via simple APIs
Dat is the distributed data sharing tool
a streaming torrent client for the web browser and the desktop
Every version of HTTP
and related protocols was a solution to a real-world problem
Most of the time the problem was not apparent until real world adoption
Tomorrow a new protocol will appear that will try to solve HTTP/2's and QUIC's problems
By Gleb Bahmutov
Let's walk through the history of HTTP, its original design goals and how they shape the modern Internet. We will look at the current standard, and what is coming tomorrow. We will also learn and refresh a whole bunch of acronyms: TCP, HTTP, RFC, IETF, UDP, QUIC. Every developer working on or even using the WWW will benefit from this presentation.