#BehindAReverseProxy

Be proud of being lazy!

​July 27th 2016 – viscaweb.com/meetings

First part:

What's HTTP?

HTTP functions as a request–response protocol in the client–server computing model (as said on Wikipedia).

Request Methods

GET
HEAD
POST
OPTIONS
CONNECT
TRACE
PUT
PATCH
DELETE

GET

POST

Response Codes

2xx
200 OK 🙏

3xx
301 Moved Permanently (used for SEO)
304 Not Modified

4xx
403 Forbidden
404 Not Found
418 I'm a teapot

5xx
500 Internal Server Error 😔

etc.

Request Headers

Accept-Charset
Accept-Encoding
Accept-Language
Authorization
Cookie
Referer
User-Agent
X-Forwarded-For
etc.

Response Headers

Cache-Control
Content-Encoding
ETag
Expires
Location
Refresh
Set-Cookie
Status
Vary
etc.
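
To make the status line and the headers concrete, here is a minimal sketch of how a raw response head could be split into its parts. The function name parseResponseHead is hypothetical and the parsing is simplified (no folded headers, no repeated header names).

```php
<?php
// Hypothetical helper, for illustration only: splits a raw response head
// into status code, reason phrase and a lower-cased header map.
function parseResponseHead(string $raw): array
{
    $lines = preg_split('/\r\n|\n/', trim($raw));
    $statusLine = array_shift($lines);

    if (!preg_match('#^HTTP/\d(?:\.\d)?\s+(\d{3})\s*(.*)$#', $statusLine, $m)) {
        throw new InvalidArgumentException('Not an HTTP status line: ' . $statusLine);
    }

    $headers = [];
    foreach ($lines as $line) {
        if ($line === '') {
            break; // blank line: end of the headers, the body follows
        }
        // "+ [1 => '']" keeps the destructuring safe for a line without ':'
        [$name, $value] = explode(':', $line, 2) + [1 => ''];
        $headers[strtolower(trim($name))] = trim($value);
    }

    return ['status' => (int) $m[1], 'reason' => $m[2], 'headers' => $headers];
}
```

Header names are lower-cased because HTTP header names are case-insensitive.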

Where does a reverse proxy stand?

A reverse proxy stands between the user and your web server.

It provides extra functionality:
caching, security, load balancing, etc.

Second part:

Caching - how does it work?

Preamble

We'll review how the headers describe
a caching strategy.
This is JUST about HTTP.

The Reverse Proxy understands those headers...
as your web browser does.

Playing with Privacy

Public

<?php
header("Cache-Control: public");

Public means that any cache, including a shared one such as a reverse proxy, may store this page.

Private

<?php
header("Cache-Control: private");

A Private response is meant for a single user: the browser may keep it, but it will not be cached by any (decent) proxy.
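
Seen from the proxy side, the rule above could be sketched as follows. This is a simplified illustration, not how any particular proxy implements it: the function name sharedCacheMayStore is hypothetical, and only the private and no-store directives are checked.

```php
<?php
// Hypothetical helper: may a shared cache (a reverse proxy) store a
// response carrying this Cache-Control value?
function sharedCacheMayStore(string $cacheControl): bool
{
    $directives = array_map('trim', explode(',', strtolower($cacheControl)));

    foreach ($directives as $directive) {
        // Keep only the directive name, dropping any "=value" part.
        $name = explode('=', $directive, 2)[0];
        if ($name === 'private' || $name === 'no-store') {
            return false;
        }
    }

    return true; // simplification: everything else is treated as storable
}
```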

Playing with Expiration

[Diagram: User → Reverse Proxy → Web Server (Generate Hash); match: Liga, 04/12, 8PM]

$ curl -I 'http://marcadores.com/futbol/spain/liga/fc-barcelona-madrid'

GET /futbol/spain/liga/fc-barcelona-madrid HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive

HTTP/1.1 200 OK
Cache-Control: public
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
Date: Mon, 25 Jul 2016 19:31:04 GMT
Expires: Sun, 04 Dec 2016 19:00:00 GMT
Server: Apache/2.2.31 (Unix) DAV/2 PHP/5.6.18 mod_ssl/2.2.31 OpenSSL/1.0.2h
X-Powered-By: PHP/5.6.18

<?php
header("Cache-Control: public");

// The match starts at 20:00 Madrid time (UTC+1 in December).
// The Expires header must be an HTTP date expressed in GMT.
$expires = new \DateTime('2016-12-04 20:00:00', new \DateTimeZone('Europe/Madrid'));
$expires->setTimezone(new \DateTimeZone('UTC'));
header("Expires: " . $expires->format('D, d M Y H:i:s') . ' GMT');

The reverse proxy saves this page with the given expiration date: 2016-12-04 19:00:00 UTC
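
A minimal sketch of the freshness check a cache could run against a stored Expires value. The function name isFreshByExpires is hypothetical, and only the standard GMT date format is accepted here.

```php
<?php
// Hypothetical helper: is a stored response still fresh at time $now,
// judging only by its Expires header?
function isFreshByExpires(string $expiresHeader, DateTimeImmutable $now): bool
{
    // "!" resets every field that the format does not set explicitly.
    $expires = DateTimeImmutable::createFromFormat(
        '!D, d M Y H:i:s \G\M\T',
        $expiresHeader,
        new DateTimeZone('UTC')
    );

    if ($expires === false) {
        return false; // an unreadable date is treated as already expired
    }

    return $now < $expires;
}
```

Before the expiration date the proxy can answer from its storage; after it, the request goes to the web server again.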

Playing with Max-Age

[Diagram: User → Reverse Proxy → Web Server (Generate Hash)]

$ curl -I 'http://marcadores.com/'

GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive

HTTP/1.1 200 OK
Cache-Control: public, max-age=60
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
Date: Mon, 25 Jul 2016 19:31:04 GMT
Server: Apache/2.2.31 (Unix) DAV/2 PHP/5.6.18 mod_ssl/2.2.31 OpenSSL/1.0.2h
X-Powered-By: PHP/5.6.18

<?php
header("Cache-Control: public, max-age=60");

The reverse proxy saves this page with the given max-age: 60s

✔ Cached for 60s
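
The max-age rule comes down to a simple age comparison. Here is a sketch, assuming the cache remembers the Unix timestamp at which it stored the response; the function name isFreshByMaxAge is hypothetical.

```php
<?php
// Hypothetical helper: is a response stored at $storedAt (Unix timestamp)
// still fresh at $now, judging only by max-age?
function isFreshByMaxAge(string $cacheControl, int $storedAt, int $now): bool
{
    if (!preg_match('/(?:^|,)\s*max-age=(\d+)/i', $cacheControl, $m)) {
        return false; // no max-age directive: this sketch never calls it fresh
    }

    $age = $now - $storedAt;

    return $age < (int) $m[1];
}
```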

Playing with ETag

[Diagram: User → Reverse Proxy → Web Server (Generate Hash)]

$ curl -I 'http://marcadores.com/'

GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive

HTTP/1.1 200 OK
Cache-Control: public
ETag: "version1"
Connection: Keep-Alive
Content-Type: text/html; charset=UTF-8
Date: Mon, 25 Jul 2016 19:31:04 GMT
Server: Apache/2.2.31 (Unix) DAV/2 PHP/5.6.18 mod_ssl/2.2.31 OpenSSL/1.0.2h
X-Powered-By: PHP/5.6.18

<?php
header("Cache-Control: public");

// An ETag value is a quoted string.
$version = '1';
header("ETag: \"version{$version}\"");

On a later request, two questions decide the answer:
➜ Is there an ETag in the request (sent in the If-None-Match header)?
➜ Is this ETag different from the one we know?

If the ETag is unchanged, the answer is a 304 Not Modified with no body.
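
The two questions above can be sketched as a small decision function. It is a simplified illustration: the function name respondWithEtag and the returned array shape are hypothetical, and only a single ETag in If-None-Match is handled.

```php
<?php
// Hypothetical helper: decides between a full 200 response and a 304,
// given the If-None-Match request header (or null when it is absent).
function respondWithEtag(?string $ifNoneMatch, string $currentEtag, string $body): array
{
    // Question 1: is there an ETag in the request?
    // Question 2: is it the one we know?
    if ($ifNoneMatch !== null && trim($ifNoneMatch) === $currentEtag) {
        return ['status' => 304, 'headers' => ['ETag' => $currentEtag], 'body' => ''];
    }

    return [
        'status'  => 200,
        'headers' => ['ETag' => $currentEtag, 'Cache-Control' => 'public'],
        'body'    => $body,
    ];
}
```

The 304 answer carries no body, so an unchanged page costs almost no bandwidth.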

A flexible app. with invalidation

(the concept)

Reverse Proxy

What's invalidation? It asks your reverse proxy to remove one or more objects (responses) from its storage.

How? By sending a request with a special HTTP method: PURGE, DELETE or BAN.

$ curl -X PURGE http://www.marcadores.com/liga
✔ Purged

Who? With Varnish, you usually allow only a given list of IPs. Each kind of server has its own implementation. Hosted services (CloudFlare, for example) go through classic authentication in their SDK.
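
As an illustration of the concept, here is a tiny in-memory cache with a purge operation guarded by an IP allow list. The class TinyCache and its methods are hypothetical and not modelled on any real proxy; the 405 answer for a refused purge is an assumption of this sketch.

```php
<?php
// Hypothetical in-memory cache, for illustration only.
final class TinyCache
{
    /** @var array<string, string> cached responses, keyed by URL */
    private $entries = [];

    /** @var string[] client IPs that are allowed to purge */
    private $allowedIps;

    public function __construct(array $allowedIps)
    {
        $this->allowedIps = $allowedIps;
    }

    public function store(string $url, string $response): void
    {
        $this->entries[$url] = $response;
    }

    public function has(string $url): bool
    {
        return isset($this->entries[$url]);
    }

    /** Handles "PURGE $url" and returns the HTTP status code to answer with. */
    public function purge(string $url, string $clientIp): int
    {
        if (!in_array($clientIp, $this->allowedIps, true)) {
            return 405; // this caller is not allowed to invalidate
        }

        unset($this->entries[$url]);

        return 200;
    }
}
```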

A flexible app. with invalidation

(using patterns)

How does it work?

$ curl -X BAN http://www.marcadores.com/ -H 'X-Url: /futbol/2016-03-07/*'
✔ Banned
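
A sketch of what banning by pattern amounts to, assuming the cache is a plain array keyed by URL and that * is the only wildcard. The function name banByPattern is hypothetical.

```php
<?php
// Hypothetical helper: returns the cache without the entries whose URL
// matches the pattern ("*" stands for any run of characters).
function banByPattern(array $cache, string $pattern): array
{
    // Escape the pattern for use in a regex, then turn "\*" back into ".*".
    $regex = '#^' . str_replace('\*', '.*', preg_quote($pattern, '#')) . '$#';

    return array_filter(
        $cache,
        function (string $url) use ($regex): bool {
            return preg_match($regex, $url) !== 1; // keep the non-matching URLs
        },
        ARRAY_FILTER_USE_KEY
    );
}
```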

Fully supported by:

Supported except regular expressions:

A flexible app. with invalidation

(using tagging)

Let's imagine:
➜ You're managing the caching strategy for GitHub.com
➜ Each block is called a widget
➜ Each widget must be cached and always up to date
➜ A user logs in and changes his name...

You must invalidate the following URL: github.com/jonashaouzi

And... all the repository graphs related to this user:

How to solve this problem?

1. Tag your pages

Imagine we have 3 pages related to this user:
1. His profile page (github.com/jonashaouzi)
2. His first repository (github.com/orga/repo)
3. His second repository (github.com/orga/repo2)

Simply tag them as: user123

<?php
header('Cache-Tag: user_page, user123, charts');

2. Invalidating looks way easier now:

$ curl -X BAN http://www.marcadores.com \
    -H 'Cache-Tag: user123'
✔ Banned
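
Behind the scenes, tagging needs little more than an index from tag to URLs. Here is a minimal in-memory sketch; the class TaggedCache and its methods are hypothetical and not taken from any real proxy.

```php
<?php
// Hypothetical in-memory cache with tag-based invalidation.
final class TaggedCache
{
    /** @var array<string, string> url => cached body */
    private $responses = [];

    /** @var array<string, array<string, bool>> tag => set of URLs */
    private $tags = [];

    /** Stores a response together with the value of its Cache-Tag header. */
    public function store(string $url, string $body, string $cacheTagHeader): void
    {
        $this->responses[$url] = $body;

        foreach (array_filter(array_map('trim', explode(',', $cacheTagHeader))) as $tag) {
            $this->tags[$tag][$url] = true;
        }
    }

    public function has(string $url): bool
    {
        return isset($this->responses[$url]);
    }

    /** Removes every response carrying the tag; returns how many were removed. */
    public function banTag(string $tag): int
    {
        $removed = 0;

        foreach (array_keys($this->tags[$tag] ?? []) as $url) {
            if (isset($this->responses[$url])) {
                unset($this->responses[$url]);
                $removed++;
            }
        }
        unset($this->tags[$tag]);

        return $removed;
    }
}
```

One ban on user123 then clears the profile page and both repository pages, without the application having to list their URLs.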

Fully supported by:

Don't let them pay for it!

[Diagram: User → Reverse Proxy → Web Server]

Third part:

Reverse Proxies

This engine (Varnish) is used by many of the world's top websites:

[Logos of sites ranked #8, #121, #155 and #3]

... and 1,330 of the world's top 10,000 websites.

Varnish is an HTTP accelerator:

➜ Written in C, which makes it fast
➜ Own language (VCL)
➜ Made for scaling
➜ Supports load balancing
➜ Supports compression
➜ Supports all kinds of invalidation

import std; # provides std.querysort()

sub vcl_recv {
  # Called at the beginning of a request, after the complete request has been received and parsed.
  # Its purpose is to decide whether or not to serve the request, how to do it, and, if applicable,
  # which backend to use.
  # It is also used to modify the request.

  # "vdir" is a director that must be declared in vcl_init
  set req.backend_hint = vdir.backend(); # send all traffic to the vdir director

  # Normalize the header, remove the port (in case you're testing this on various TCP ports)
  set req.http.Host = regsub(req.http.Host, ":[0-9]+", "");

  # Remove the proxy header (see https://httpoxy.org/#mitigate-varnish)
  unset req.http.proxy;

  # Normalize the query arguments
  set req.url = std.querysort(req.url);
}

Nginx is both an HTTP server and a reverse proxy server

➜ Written in C, which makes it fast
➜ Known to be more efficient (less memory, faster) than Apache
➜ Supports load balancing
➜ Supports compression

Nginx was found to be the second most widely used web server across all "active" sites (October 2015)

3,779 of the top 10,000 websites use it

 

Internal Reverse Proxy (Symfony)

➜ Written in PHP
➜ Not recommended for high-traffic websites (as the main reverse proxy)
➜ Useful for local testing

<?php
// web/app.php
use Symfony\Component\HttpFoundation\Request;

// ...
$kernel = new AppKernel('prod', false);
$kernel->loadClassCache();
// wrap the default AppKernel with the AppCache one
$kernel = new AppCache($kernel);

$request = Request::createFromGlobals();

$response = $kernel->handle($request);
$response->send();

$kernel->terminate($request, $response);

CDNs

Good to know... many CDNs allow you to override caching rules.

Fourth part:

Thoughts and notes

Symfony SubRequests

Symfony can easily render the content of a given Controller into any part of the page. This is called a SubRequest.

By default, Symfony renders this SubRequest transparently, inside the same PHP process.

With ESI enabled, a SubRequest can instead be emitted as an <esi:include> tag, which the reverse proxy resolves and caches separately.

# app/config/config.yml
framework:
    esi: { enabled: true }
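
To picture what a reverse proxy does with ESI, here is a minimal sketch that replaces each <esi:include src="..." /> tag with the fragment fetched for it. The function name resolveEsi and the $fetchFragment callback are hypothetical; a real proxy would also cache each fragment with its own rules.

```php
<?php
// Hypothetical helper: resolves ESI include tags in a page.
// $fetchFragment receives the src URL and returns the fragment's HTML.
function resolveEsi(string $html, callable $fetchFragment): string
{
    return preg_replace_callback(
        '#<esi:include\s+src="([^"]+)"\s*/>#',
        function (array $match) use ($fetchFragment): string {
            return $fetchFragment($match[1]);
        },
        $html
    );
}
```

The page shell and each fragment can then have different lifetimes, for example a long-lived layout around a short-lived live-score widget.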

FOS HTTP Cache Bundle

Provides easy integration with Symfony for:

➜ Global caching rules
➜ Invalidation
➜ Tagging
➜ User Context
➜ Testing

Vary

The Vary header can be understood as a "context".

Let's imagine this scenario:
➜ Your website offers a full and enriched mobile version, and a desktop version, using the exact same URL.

➜ So far, this is working perfectly. You detect the user agent and display the mobile or desktop version accordingly.

➜ You decide to integrate a caching system on top of the website.

➜ Feedback lands in your mailbox: a user on their phone got the desktop version. How is that possible?

➜ Your reverse proxy has cached the page, and because the first visitor to the page was on a computer, everybody now gets the desktop version, whether they're on mobile or not.

if (req.http.User-Agent ~ "(Mobile|Android|iPhone|iPad)") {
  set req.http.User-Agent = "mobile";
} else {
  set req.http.User-Agent = "desktop";
}

Works for:

➜ A/B testing
➜ User agent
➜ Encoding
➜ Timezone
➜ etc.

$ curl -v 'http://www.marcadores.com'
[...]
Vary: User-Agent
[...]
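
One way to picture Vary is as extra ingredients in the cache key. Here is a sketch, assuming the request headers arrive as a name => value array; the function name varyCacheKey is hypothetical.

```php
<?php
// Hypothetical helper: builds a cache key from the URL plus the value of
// every request header named in the response's Vary header.
function varyCacheKey(string $url, string $varyHeader, array $requestHeaders): string
{
    // Header names are case-insensitive, so compare them lower-cased.
    $normalized = array_change_key_case($requestHeaders, CASE_LOWER);
    $parts = [$url];

    foreach (array_filter(array_map('trim', explode(',', strtolower($varyHeader)))) as $name) {
        $parts[] = $name . '=' . ($normalized[$name] ?? '');
    }

    return sha1(implode("\n", $parts));
}
```

This is also why the User-Agent is normalized to "mobile" or "desktop" first: with the raw header, nearly every browser would get its own cache entry.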

Scalability

[Diagram: Users → Reverse Proxies → Web Servers]

A normal web server structure: 2 reverse proxies, 2 web servers

A scaled-up structure: 8 reverse proxies, 2 web servers

➜ Works only if the hit ratio is high
➜ Costs much less than scaling the web servers
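
Why the hit ratio matters can be shown with a little arithmetic: only cache misses reach the web servers. The function name backendRequestsPerSecond is hypothetical, and any traffic figures fed to it are made up for illustration.

```php
<?php
// Hypothetical helper: how many requests per second still reach the web
// servers for a given total traffic and cache hit ratio (0.0 to 1.0)?
function backendRequestsPerSecond(float $totalRps, float $hitRatio): float
{
    if ($hitRatio < 0.0 || $hitRatio > 1.0) {
        throw new InvalidArgumentException('The hit ratio must be between 0 and 1.');
    }

    // Only the misses are forwarded to the web servers.
    return $totalRps * (1.0 - $hitRatio);
}
```

With a hit ratio of 0.75, 1000 requests per second at the proxies become 250 at the web servers; at 0.5 it is still 500, which is why adding proxies only pays off when the ratio is high.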

Security

[Diagram: Users → Reverse Proxy → Web Server]

A Reverse Proxy can filter the request

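acl forbidden {
    "192.168.168.0"/24;
    "10.10.10.0"/24;
}

sub vcl_recv {
    # Block access from these hosts
    if (client.ip ~ forbidden) {
        # Varnish 4 syntax; Varnish 3 used: error 403 "Forbidden";
        return (synth(403, "Forbidden"));
    }

    # Varnish 4 syntax; Varnish 3 used: return (lookup);
    return (hash);
}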
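
The ACL test itself (client.ip ~ forbidden) is a subnet match. Here is a sketch of that check for IPv4, assuming a 64-bit PHP build; the function name ipInCidr is hypothetical.

```php
<?php
// Hypothetical helper: does an IPv4 address belong to a CIDR block
// such as "192.168.168.0/24"? Assumes 64-bit integers.
function ipInCidr(string $ip, string $cidr): bool
{
    if (strpos($cidr, '/') === false) {
        return false;
    }

    [$subnet, $bits] = explode('/', $cidr, 2);
    $bits = (int) $bits;
    $ipLong = ip2long($ip);
    $subnetLong = ip2long($subnet);

    if ($ipLong === false || $subnetLong === false || $bits < 0 || $bits > 32) {
        return false;
    }

    // Build a mask with the first $bits bits set, e.g. /24 gives 0xFFFFFF00.
    $mask = $bits === 0 ? 0 : ((~0 << (32 - $bits)) & 0xFFFFFFFF);

    return ($ipLong & $mask) === ($subnetLong & $mask);
}
```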

PSR-7

php-fig.org/psr/psr-7

New interfaces:

MessageInterface
RequestInterface extends MessageInterface
ResponseInterface extends MessageInterface
ServerRequestInterface extends RequestInterface
StreamInterface
UploadedFileInterface
UriInterface

 

18th May 2015: PSR-7 accepted
✔ Available in Symfony >= 2.7

And more...

To learn more about this topic and practice, I suggest...

Katas:

➜ Create your own reverse proxy
➜ Implement a complex caching strategy
➜ Test invalidation and cache warm-up

Meetings:

➜ Varnish 4 - what's new?
➜ Caching and member area

Links (1/3)

HTTP by Wikipedia
https://es.wikipedia.org/wiki/Hypertext_Transfer_Protocol

HTTP headers list
https://es.wikipedia.org/wiki/List_of_HTTP_header_fields

 

HTTP codes list
https://es.wikipedia.org/wiki/List_of_HTTP_status_codes

 

A reverse proxy by Wikipedia
https://es.wikipedia.org/wiki/Reverse_proxy

Links (2/3)

How do the HTTP caching headers work?
https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/http-caching
 

Varnish
https://varnish-cache.org/intro/index.html#intro
 

Nginx
https://www.nginx.com/
 

Symfony Reverse Proxy
http://symfony.com/doc/current/book/http_cache.html#symfony-reverse-proxy

Links (3/3)

Symfony SubRequests
http://symfony.com/doc/current/components/http_kernel/introduction.html
 

FOS HTTP Bundle
https://github.com/FriendsOfSymfony/FOSHttpCacheBundle
 

Understanding the Vary header
https://www.fastly.com/blog/best-practices-for-using-the-vary-header
 

Stats about Varnish usage
http://trends.builtwith.com/Web-Server/Varnish
 

Stats about Nginx usage
http://trends.builtwith.com/Web-Server/nginx

​May 27th 2016 – viscaweb.com/meetings
