Be proud of being lazy!
July 27th 2016 – viscaweb.com/meetings
First part:
#BehindAReverseProxy
HTTP functions as a request–response protocol in the client–server computing model (as said on Wikipedia).
Request Type
GET
HEAD
POST
OPTIONS
CONNECT
TRACE
PUT
PATCH
DELETE
GET
POST
Response Codes
418 I'm a teapot
Request Headers
Accept-Charset
Accept-Encoding
Accept-Language
Authorization
Cookie
Referer
User-Agent
X-Forwarded-For
etc.
Response Headers
Cache-Control
Content-Encoding
ETag
Expires
Location
Refresh
Set-Cookie
Status
Vary
etc.
A Reverse Proxy sits between your web server and the user.
It provides extra features: caching, security, load balancing, etc.
Second part:
We'll review how the headers describe a caching strategy.
This is JUST about HTTP.
The Reverse Proxy understands those headers... as your web browser does.
Public
<?php
header("Cache-Control: public");
Public means that any cache, including a shared reverse proxy, is allowed to store this page.
Private
<?php
header("Cache-Control: private");
A Private response is meant for a single user: the browser may store it, but no (decent) shared proxy will cache it.
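As a minimal sketch of the public/private choice, the decision could be wrapped in a small helper. The function name `cacheControlFor` and its `$isPersonalized` parameter are assumptions made for illustration; they do not come from the slides.

```php
<?php
// Hypothetical helper: returns the Cache-Control value for a page.
// Personalized pages (e.g. a member area) must not be stored by shared caches.
function cacheControlFor(bool $isPersonalized): string
{
    return $isPersonalized ? 'private' : 'public';
}

// Build the header line; in a web context it would be passed to header().
$headerLine = 'Cache-Control: ' . cacheControlFor(false);
echo $headerLine, PHP_EOL;
```

Keeping the decision in one place makes it harder to mark a personalized page as public by accident.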
[Diagram: User, Reverse Proxy, Web Server (Generate Hash); page: Liga, 04/12, 8PM]
$ curl -i 'http://marcadores.com/futbol/spain/liga/fc-barcelona-madrid'
GET /futbol/spain/liga/fc-barcelona-madrid HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
HTTP/1.1 200 OK
Cache-Control: public
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
Date: Mon, 25 Jul 2016 19:31:04 GMT
Expires: Sun, 04 Dec 2016 19:00:00 GMT
Server: Apache/2.2.31 (Unix) DAV/2 PHP/5.6.18 mod_ssl/2.2.31 OpenSSL/1.0.2h
X-Powered-By: PHP/5.6.18
<?php
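header("Cache-Control: public");
// Kick-off time in Madrid (CET, UTC+1 in December).
$expires = new \DateTime('2016-12-04 20:00:00', new \DateTimeZone('Europe/Madrid'));
// HTTP dates must be expressed in GMT, so convert before formatting.
$expires->setTimezone(new \DateTimeZone('UTC'));
header("Expires: " . $expires->format('D, d M Y H:i:s') . ' GMT');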
[Diagram: User, Reverse Proxy, Web Server (Generate Hash); page: Liga, 04/12, 8PM]
✔ Save this page with the given expiration date: 2016-12-04 19:00:00 UTC
[Diagram: User, Reverse Proxy, Web Server (Generate Hash); page: Liga, 04/12, 8PM]
[Diagram: User, Reverse Proxy, Web Server (Generate Hash)]
$ curl -i 'http://marcadores.com/'
GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
HTTP/1.1 200 OK
Cache-Control: public, max-age=60
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
Date: Mon, 25 Jul 2016 19:31:04 GMT
Server: Apache/2.2.31 (Unix) DAV/2 PHP/5.6.18 mod_ssl/2.2.31 OpenSSL/1.0.2h
X-Powered-By: PHP/5.6.18
<?php
header("Cache-Control: public, max-age=60");
[Diagram: User, Reverse Proxy, Web Server (Generate Hash)]
✔ Save this page with the given max-age: 60s
✔ Cached for 60s
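The freshness decision behind max-age can be sketched in a few lines. This is a simplified illustration, not the code of any particular proxy; the function name `isFresh` and its parameters are assumptions.

```php
<?php
// Hypothetical sketch of the freshness check a cache performs.
// $storedAt and $now are Unix timestamps; $maxAge is the max-age value in seconds.
function isFresh(int $storedAt, int $maxAge, int $now): bool
{
    $age = $now - $storedAt;   // how long the copy has been in the cache
    return $age < $maxAge;     // fresh while its age is below max-age
}

$storedAt = strtotime('Mon, 25 Jul 2016 19:31:04 GMT');
var_dump(isFresh($storedAt, 60, $storedAt + 30)); // 30 seconds later: still fresh
var_dump(isFresh($storedAt, 60, $storedAt + 90)); // 90 seconds later: stale
```

While the copy is fresh the proxy answers by itself; once it is stale, the next request goes back to the web server.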
[Diagram: User, Reverse Proxy, Web Server (Generate Hash)]
$ curl -i 'http://marcadores.com/'
GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
HTTP/1.1 200 OK
Cache-Control: public
ETag: "version1"
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
Date: Mon, 25 Jul 2016 19:31:04 GMT
Server: Apache/2.2.31 (Unix) DAV/2 PHP/5.6.18 mod_ssl/2.2.31 OpenSSL/1.0.2h
X-Powered-By: PHP/5.6.18
<?php
header("Cache-Control: public");
$version = '1';
// An ETag value is a quoted string.
header("ETag: \"version{$version}\"");
[Diagram: User, Reverse Proxy, Web Server (Generate Hash)]
➜ Does the request carry an ETag (in its If-None-Match header)?
➜ Is this ETag different from the one we know?
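The two questions above can be sketched as a small function. It is a simplified illustration that ignores weak validators (the `W/` prefix); the name `statusFor` and its parameters are assumptions.

```php
<?php
// Hypothetical sketch: choose between 200 and 304 from the validator sent by the client.
// $ifNoneMatch is the raw If-None-Match request header (null when absent),
// $currentEtag is the ETag of the current version, including its quotes.
function statusFor(?string $ifNoneMatch, string $currentEtag): int
{
    if ($ifNoneMatch === null) {
        return 200; // no validator: send the full response
    }
    // The header may list several ETags separated by commas.
    $candidates = array_map('trim', explode(',', $ifNoneMatch));
    if (in_array('*', $candidates, true) || in_array($currentEtag, $candidates, true)) {
        return 304; // same version: no body needed
    }
    return 200;     // different version: send the new content
}

echo statusFor('"version1"', '"version1"'), PHP_EOL; // known version
echo statusFor('"version1"', '"version2"'), PHP_EOL; // content has changed
```

A 304 answer carries no body, so the page does not have to be sent again.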
Reverse Proxy
What's invalidation? It asks your reverse proxy to remove one or many objects (responses) from its storage.
How? By sending a request with a specific method: PURGE, DELETE, BAN.
$ curl -X PURGE http://www.marcadores.com/liga
✔ Purged
Who? With Varnish, you usually allow only a given list of IPs. Each kind of server has its own implementation; hosted services such as CloudFlare go through classic authentication in their SDK.
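To make the "what, how, who" above concrete, here is a toy in-memory cache with a PURGE guarded by an IP allowlist. The class `TinyCache`, its methods, and the IP addresses are assumptions for illustration; a real proxy implements this in its own configuration. The 405 answer for a refused purge mirrors a common Varnish convention.

```php
<?php
// Hypothetical in-memory cache keyed by URL, with a PURGE guarded by an IP allowlist.
final class TinyCache
{
    /** @var array<string, string> URL => stored response body */
    private array $store = [];
    /** @var string[] client IPs allowed to purge (assumed values) */
    private array $purgers;

    public function __construct(array $purgers) { $this->purgers = $purgers; }

    public function save(string $url, string $body): void { $this->store[$url] = $body; }

    public function has(string $url): bool { return isset($this->store[$url]); }

    // Returns the HTTP status the proxy would answer to a PURGE request.
    public function purge(string $url, string $clientIp): int
    {
        if (!in_array($clientIp, $this->purgers, true)) {
            return 405; // this client is not allowed to purge
        }
        unset($this->store[$url]);
        return 200;
    }
}

$cache = new TinyCache(['127.0.0.1']);
$cache->save('/liga', '<html>...</html>');
echo $cache->purge('/liga', '10.0.0.9'), PHP_EOL;  // refused, the page stays cached
echo $cache->purge('/liga', '127.0.0.1'), PHP_EOL; // accepted, the page is removed
```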
How does it work?
$ curl -X BAN http://www.marcadores.com/ -H 'X-Url: /futbol/2016-03-07/*'
✔ Banned
Fully supported by:
Supported except regular expressions:
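A ban removes every cached object whose URL matches a pattern. The sketch below only illustrates the matching step on a plain list of URLs; the function name `ban` and the sample URLs are assumptions, and a real proxy may evaluate bans quite differently.

```php
<?php
// Hypothetical sketch of a ban: drop every cached URL matching a wildcard pattern.
function ban(array $cachedUrls, string $pattern): array
{
    // Turn the wildcard pattern into an anchored regular expression ("*" matches anything).
    $regex = '#^' . str_replace('\*', '.*', preg_quote($pattern, '#')) . '$#';

    // Keep only the URLs that do NOT match the banned pattern.
    return array_values(array_filter(
        $cachedUrls,
        fn (string $url): bool => preg_match($regex, $url) !== 1
    ));
}

$remaining = ban(
    ['/futbol/2016-03-07/liga', '/futbol/2016-03-08/liga', '/'],
    '/futbol/2016-03-07/*'
);
print_r($remaining);
```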
Let's imagine:
➜ You're managing the caching strategy for GitHub.com
➜ Each block is called a widget
➜ Each widget must be cached and always up to date
➜ A user logs in and changes his name...
You must invalidate the following URL: github.com/jonashaouzi
And... all the repository graphs related to this user:
How to solve this problem?
1. Tag your pages
Imagine we have 3 pages related to this user:
1. His profile page (github.com/jonashaouzi)
2. His first repository (github.com/orga/repo)
3. His second repository (github.com/orga/repo2)
Simply tag them as: user123
<?php
header('Cache-Tag: user_page, user123, charts');
2. Invalidating looks way easier now:
$ curl -X BAN http://www.marcadores.com \
    -H 'Cache-Tag: user123'
✔ Banned
Fully supported by:
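The tagging idea can be sketched with a small index that remembers which cached URL carries which tag. The class `TaggedCache`, its methods, and the tags `repo_page` and `user456` are assumptions for illustration only.

```php
<?php
// Hypothetical tag index: remembers which cached URLs carry which tags.
final class TaggedCache
{
    /** @var array<string, string[]> URL => list of tags */
    private array $tagsByUrl = [];

    // $cacheTagHeader is the raw value of the Cache-Tag response header.
    public function store(string $url, string $cacheTagHeader): void
    {
        $this->tagsByUrl[$url] = array_map('trim', explode(',', $cacheTagHeader));
    }

    // Removes every URL tagged with $tag and returns the removed URLs.
    public function banTag(string $tag): array
    {
        $removed = [];
        foreach ($this->tagsByUrl as $url => $tags) {
            if (in_array($tag, $tags, true)) {
                $removed[] = $url;
                unset($this->tagsByUrl[$url]);
            }
        }
        return $removed;
    }

    public function urls(): array { return array_keys($this->tagsByUrl); }
}

$cache = new TaggedCache();
$cache->store('/jonashaouzi', 'user_page, user123, charts');
$cache->store('/orga/repo', 'repo_page, user123');
$cache->store('/orga/other', 'repo_page, user456');
$removed = $cache->banTag('user123');
print_r($removed);
```

One ban on the tag removes the profile page and the repository page together, without listing their URLs.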
[Diagram: User, Reverse Proxy, Web Server]
Third part:
This engine is used by many of the world's top websites:
... and by 1,330 of the world's top 10,000 websites.
Varnish is an HTTP accelerator:
➜ Written in C, which makes it fast
➜ Has its own configuration language (VCL)
➜ Made for scaling
➜ Supports load balancing
➜ Supports compression
➜ Supports all kinds of invalidation
vcl 4.0;
import std;        # provides std.querysort()
import directors;  # assumes a director named vdir is created in vcl_init

sub vcl_recv {
  # Called at the beginning of a request, after the complete request has been received and parsed.
  # Its purpose is to decide whether or not to serve the request, how to do it, and, if applicable,
  # which backend to use. It is also used to modify the request.
  set req.backend_hint = vdir.backend(); # send all traffic to the vdir director
  # Normalize the header, remove the port (in case you're testing this on various TCP ports)
  set req.http.Host = regsub(req.http.Host, ":[0-9]+", "");
  # Remove the proxy header (see https://httpoxy.org/#mitigate-varnish)
  unset req.http.proxy;
  # Normalize the query arguments
  set req.url = std.querysort(req.url);
}
Nginx is both an HTTP server and a reverse proxy server
➜ Written in C, which makes it fast
➜ Known to be more efficient (less memory, faster) than Apache
➜ Supports load balancing
➜ Supports compression
➜ Found to be the second most widely used web server across all "active" sites (October 2015)
➜ Used by 3,779 of the top 10,000 websites
➜ Written in PHP
➜ Not recommended for high traffic websites (as the main RP)
➜ Useful for local testing
<?php
// web/app.php
use Symfony\Component\HttpFoundation\Request;
// ...
$kernel = new AppKernel('prod', false);
$kernel->loadClassCache();
// wrap the default AppKernel with the AppCache one
$kernel = new AppCache($kernel);
$request = Request::createFromGlobals();
$response = $kernel->handle($request);
$response->send();
$kernel->terminate($request, $response);
Good to know... many CDNs allow you to override caching rules.
Fourth part:
Symfony can easily render the content of a given Controller into any part of the page. This is called a SubRequest.
By default, Symfony renders this SubRequest transparently and inside the same PHP process.
# app/config/config.yml
framework:
    esi: { enabled: true }
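With ESI enabled, a sub-request can be left in the page as an `<esi:include>` tag for the reverse proxy to resolve. The sketch below imitates that resolution step in plain PHP; the function `resolveEsi`, the `/_widget/user` URL, and the fragment content are assumptions, not Symfony or Varnish code.

```php
<?php
// Hypothetical sketch of what an ESI-aware proxy does with a page:
// every <esi:include src="..."/> tag is replaced by the fragment fetched for that URL.
function resolveEsi(string $html, callable $fetchFragment): string
{
    return preg_replace_callback(
        '#<esi:include\s+src="([^"]+)"\s*/>#',
        fn (array $m): string => $fetchFragment($m[1]),
        $html
    );
}

// Stand-in for the sub-request sent to the web server (assumed fragment content).
$fragments = ['/_widget/user' => '<span>jonashaouzi</span>'];
$page = '<h1>Profile</h1><esi:include src="/_widget/user" />';
$resolved = resolveEsi($page, fn (string $url): string => $fragments[$url] ?? '');
echo $resolved, PHP_EOL;
```

Because each fragment is fetched on its own, the page and its widgets can follow different caching rules.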
Provides easy PHP integration with Symfony of:
➜ Global caching rules
➜ Invalidation
➜ Tagging
➜ User Context
➜ Testing
The Vary header can be understood as a "context".
Let's imagine this scenario:
➜ Your website offers a full, enriched mobile version and a desktop version, both on the exact same URL.
➜ So far, this works perfectly. You detect the user agent and display the mobile or desktop version accordingly.
You decide to add a caching system on top of the website.
➜ A complaint lands in your mailbox: a user on a phone got the desktop version. How is that possible?
➜ Your reverse proxy has cached the page, and because the first visitor was on a computer, everyone now gets the desktop version, whether they're on mobile or not.
if (req.http.User-Agent ~ "(Mobile|Android|iPhone|iPad)") {
set req.http.User-Agent = "mobile";
} else {
set req.http.User-Agent = "desktop";
}
Works for:
➜ AB Testing
➜ User Agent
➜ Encoding
➜ Timezone
➜ etc.
The Vary header can be understood as a "context".
$ curl -v 'http://www.marcadores.com'
[...]
Vary: User-Agent
[...]
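One way to picture the effect of Vary is that the proxy adds the listed request headers to its cache key. This is a simplified sketch under that assumption; the function `cacheKey` and the key format are hypothetical.

```php
<?php
// Hypothetical sketch: build the cache key from the URL plus the request headers named in Vary.
// $requestHeaders is expected to use lower-case header names as keys.
function cacheKey(string $url, string $varyHeader, array $requestHeaders): string
{
    $parts = [$url];
    foreach (array_map('trim', explode(',', $varyHeader)) as $name) {
        // Header names are case-insensitive, so they are normalized to lower case.
        $name = strtolower($name);
        $parts[] = $name . '=' . ($requestHeaders[$name] ?? '');
    }
    return implode('|', $parts);
}

$mobile  = cacheKey('/', 'User-Agent', ['user-agent' => 'mobile']);
$desktop = cacheKey('/', 'User-Agent', ['user-agent' => 'desktop']);
echo $mobile, PHP_EOL, $desktop, PHP_EOL;
```

Normalizing the User-Agent to "mobile" or "desktop" first keeps the number of variants at two instead of one per browser string.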
[Diagram: Users, Reverse Proxy, Web Server]
A normal web server structure: 2 RP, 2 Web Servers
[Diagram: Users, Reverse Proxy, Web Server]
A normal web server structure: 8 RP, 2 Web Servers
Works only if the hit ratio is high
Costs much less than scaling the web servers
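Why the hit ratio matters can be shown with simple arithmetic: only cache misses reach the web servers. The traffic figure of 1000 requests per second and the function name `backendLoad` are assumptions for illustration.

```php
<?php
// Requests per second that still reach the web servers for a given hit ratio (in percent).
function backendLoad(int $requestsPerSecond, int $hitPercent): int
{
    return intdiv($requestsPerSecond * (100 - $hitPercent), 100);
}

echo backendLoad(1000, 95), PHP_EOL; // 95% hit ratio
echo backendLoad(1000, 50), PHP_EOL; // 50% hit ratio
```

At a 95% hit ratio the web servers see a tenth of the traffic they would see at 50%, which is why adding reverse proxies only pays off when most requests are cache hits.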
[Diagram: Users, Reverse Proxy, Web Server]
A Reverse Proxy can filter the request
# Varnish 4 syntax (Varnish 3 used: error 403 "Forbidden"; and return (lookup);)
acl forbidden {
  "192.168.168.0"/24;
  "10.10.10.0"/24;
}
sub vcl_recv {
  # Block access from these hosts
  if (client.ip ~ forbidden) {
    return (synth(403, "Forbidden"));
  }
  return (hash);
}
php-fig.org/psr/psr-7
New interfaces:
➜ MessageInterface
➜ RequestInterface extends MessageInterface
➜ ResponseInterface extends MessageInterface
➜ ServerRequestInterface extends RequestInterface
➜ StreamInterface
➜ UploadedFileInterface
➜ UriInterface
✔ Accepted as a PSR in May 2015
✔ Available in Symfony >= 2.7
To learn more about this topic and practice, I suggest...
Katas:
➜ Create your own reverse proxy
➜ Implement a complex caching strategy
➜ Test invalidation and cache warm-up
Meetings:
➜ Varnish 4 - what's new?
➜ Caching and member area
HTTP by Wikipedia
https://es.wikipedia.org/wiki/Hypertext_Transfer_Protocol
HTTP headers list
https://es.wikipedia.org/wiki/List_of_HTTP_header_fields
HTTP codes list
https://es.wikipedia.org/wiki/List_of_HTTP_status_codes
A reverse proxy by Wikipedia
https://es.wikipedia.org/wiki/Reverse_proxy
How do the HTTP caching headers work?
https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/http-caching
Varnish
https://varnish-cache.org/intro/index.html#intro
Nginx
https://www.nginx.com/
Symfony Reverse Proxy
http://symfony.com/doc/current/book/http_cache.html#symfony-reverse-proxy
Symfony SubRequests
http://symfony.com/doc/current/components/http_kernel/introduction.html
FOS HTTP Bundle
https://github.com/FriendsOfSymfony/FOSHttpCacheBundle
Understanding the Vary header
https://www.fastly.com/blog/best-practices-for-using-the-vary-header
Stats about Varnish usage
http://trends.builtwith.com/Web-Server/Varnish
Stats about Nginx usage
http://trends.builtwith.com/Web-Server/nginx
July 27th 2016 – viscaweb.com/meetings
by
Be proud of being lazy!