Kirk Haines
khaines@engineyard.com
RubyKaigi 2016
Web Server Concurrency
https://github.com/engineyard/rubykaigi2016-concurrency
Kirk Haines
• Rubyist since 2001
• First professional Ruby web app in 2002
• Dozens of sites and apps, and several web servers since
• Engine Yard since 2008
• Former Ruby 1.8.6 maintainer (EOL announced at RubyKaigi 2011!!!)
khaines@engineyard.com
@wyhaines
What Is a Web Server?
A web server is an information technology that processes requests via HTTP, the basic network protocol used to distribute information on the World Wide Web.
https://en.wikipedia.org/wiki/Web_server
A web server is just a server that accepts HTTP requests, and returns HTTP responses.
"A Shell Server for HTTP"
(slightly modified from the original)
http://info.cern.ch/hypertext/WWW/Provider/ShellScript.html
#! /bin/sh
read get docid junk
cat `echo "$docid" | \
ruby -p -e '$_.gsub!(/^\/ [\r\n]/,"").chomp'`
That's Terrifying
i.e. Don't Actually Do This
(super_simple.sh)
A (scary, limited, blocking, single threaded) web server!
#! /bin/sh
read get docid junk
cat `echo "$docid" | \
ruby -p -e '$_.gsub!(/^\/ [\r\n]/,"").chomp'`
netcat -l -p 5000 -e ./super_simple.sh
If you know the perl language, then that is a powerful (if otherwise incomprehensible) language with which to hack together a server.
-- http://info.cern.ch/hypertext/WWW/Provider/ShellScript.html
And if that isn't endorsement enough....
Web Server Architecture
There is nothing special about web server architecture.
It is just server architecture.
A web server is nothing more than a server that receives HTTP requests and returns HTTP responses, typically through standard socket communications.
require 'socket'

def run
  simple_server = TCPServer.new("0.0.0.0", 8080)
  loop do
    connection = simple_server.accept
    handle connection
  end
end

def get_request connection
  connection.gets
end

def process request
  "OK"
end

def handle connection
  request = get_request connection
  response = process request
  connection.puts response
  connection.close
end

run
Basic TCP Server
Short Tangent
Simplest Ruby Web Server....
ruby -run -e httpd -- -p 8080 .
OK....That's kind of cheating.
# frozen_string_literal: false
#
# = un.rb
#
# Copyright (c) 2003 WATANABE Hirofumi <eban@ruby-lang.org>
#
# This program is free software.
# You can distribute/modify this program under the same terms of Ruby.
#
# == Utilities to replace common UNIX commands in Makefiles etc
#
# == SYNOPSIS
#
# ruby -run -e cp -- [OPTION] SOURCE DEST
# ruby -run -e ln -- [OPTION] TARGET LINK_NAME
# ruby -run -e mv -- [OPTION] SOURCE DEST
# ruby -run -e rm -- [OPTION] FILE
# ruby -run -e mkdir -- [OPTION] DIRS
# ruby -run -e rmdir -- [OPTION] DIRS
# ruby -run -e install -- [OPTION] SOURCE DEST
# ruby -run -e chmod -- [OPTION] OCTAL-MODE FILE
# ruby -run -e touch -- [OPTION] FILE
# ruby -run -e wait_writable -- [OPTION] FILE
# ruby -run -e mkmf -- [OPTION] EXTNAME [OPTION]
# ruby -run -e httpd -- [OPTION] DocumentRoot
# ruby -run -e help [COMMAND]
Short Tangent
def httpd
  setup("", "BindAddress=ADDR", "Port=PORT", "MaxClients=NUM", "TempDir=DIR",
        "DoNotReverseLookup", "RequestTimeout=SECOND", "HTTPVersion=VERSION") do
    |argv, options|
    require 'webrick'
    opt = options[:RequestTimeout] and options[:RequestTimeout] = opt.to_i
    [:Port, :MaxClients].each do |name|
      opt = options[name] and (options[name] = Integer(opt)) rescue nil
    end
    options[:Port] ||= 8080 # HTTP Alternate
    options[:DocumentRoot] = argv.shift || '.'
    s = WEBrick::HTTPServer.new(options)
    shut = proc {s.shutdown}
    siglist = %w"TERM QUIT"
    siglist.concat(%w"HUP INT") if STDIN.tty?
    siglist &= Signal.list.keys
    siglist.each do |sig|
      Signal.trap(sig, shut)
    end
    s.start
  end
end
OK.... Server Architecture
• Setup
• Command line arguments
• Configuration
• Listen on socket(s)
• Enter main loop
• Accept socket connection
• Handle connection
Listen on Sockets
AKA Network Communications
A server needs a way to receive requests and to return responses.
The two most common options:
- Native Ruby Networking Support (TCPServer and friends)
- EventMachine
Listen on Sockets
Native Ruby Networking Support
Ruby has a rich set of networking libraries, making it easy to write TCP clients and servers.
require 'socket'

class SimpleServer < TCPServer
  def initialize( address, port )
    super( address, port )
  end

  def run( address = '127.0.0.1', port = 80 )
    server = TCPServer.new(address, port)
    loop do
      socket = server.accept
      handle_request( socket.gets )
    end
  end

  def handle_request( req )
    # Do Stuff with req
  end
end
A Simple Ruby Web Server
Scrawls
# gem install scrawls
( Scrawls is currently woefully incomplete; send me PRs! )
http://github.com/wyhaines/scrawls
Scrawls
Pluggable IO engines
Pluggable HTTP parsing
A shared core makes it easy to see the impact of different concurrency options -- only the concurrency implementation changes between runs.
Scrawls
scrawls [OPTIONS]
scrawls is a simple ruby web server.
-h, --help:
Show this help.
-d DIR, --docroot DIR:
Provide a specific directory for the docroot for this server.
-a APP, --app APP:
Ruby file containing a rack app to use.
-i IO_ENGINE, --ioengine IO_ENGINE:
Tell the webserver which concurrency engine to use.
Installed IO Engines:
multiprocess
multithread
simplereactor
single
-h HTTP_ENGINE, --httpengine HTTP_ENGINE:
Tell the webserver which HTTP parsing engine to use.
Installed HTTP Engines:
httprecognizer
-p PORT, --port PORT:
The port for the web server to listen on. If this flag is not used, the web
server defaults to port 80.
-b HOSTNAME, --bind HOSTNAME:
The hostname/IP to bind to. This defaults to 127.0.0.1 if it is not provided.
Scrawls
Pluggable IO engines
- Single Threaded
- Multiprocess
- Multithreaded (+ Multiprocess)
- Evented (pure Ruby implementation)
Main Loop
i.e. Concurrency Options
How a server handles concurrency is fundamental to its design.
require 'socket'

class SimpleServer < TCPServer
  def initialize( address, port )
    super( address, port )
  end

  def run( address = '127.0.0.1', port = 80 )
    server = TCPServer.new(address, port)
    loop do
      socket = server.accept
      handle_request( socket.gets )
    end
  end

  def handle_request( req )
    # Do Stuff with req
  end
end
The simplest approach is this:
- Single threaded
- Blocking -- server waits until a request is handled before accepting and handling a new one
Let's Try Some Things
Test server is an 8 core VM running Ubuntu 16.04
All examples ran with Ruby 2.3.1
Apache Bench (ab) used as a simple benchmarking tool
Single Threaded Example
scrawls --ioengine single
Single threaded, blocking IO
ab -n 100000 -c 1 http://127.0.0.1:8080/test.txt
Document Path: /test.txt
Document Length: 1078 bytes
Concurrency Level: 1
Time taken for tests: 27.673 seconds
Complete requests: 100000
Failed requests: 0
Total transferred: 126000000 bytes
HTML transferred: 107800000 bytes
Requests per second: 3613.62 [#/sec] (mean)
Time per request: 0.277 [ms] (mean)
Time per request: 0.277 [ms] (mean, across all concurrent requests)
Transfer rate: 4446.44 [Kbytes/sec] received
Single Threaded Example
scrawls --ioengine single
Single threaded, blocking IO
3614 requests/second?
That's not too bad. Right?
Single Threaded Example
scrawls --ioengine single
Single threaded, blocking IO
The devil is in the details.
- Fast response (a small text file)
- Little network latency
Single Threaded Example
The real world is slow, though.
- Query databases
- Interact with microservices
- Generate content
- Assemble it all
- Return it over slow networks
This all takes time.
Web servers spend a lot of time waiting on other things.
Single Threaded Example
scrawls --ioengine single --app slow.rb
Single threaded blocking IO, with an app that takes one second to generate a response
ab -n 20 -c 1 http://127.0.0.1:8080/slowtest.txt
Document Path: /slowtest.txt
Document Length: 16 bytes
Concurrency Level: 1
Time taken for tests: 20.016 seconds
Complete requests: 20
Failed requests: 0
Non-2xx responses: 20
Total transferred: 880 bytes
HTML transferred: 320 bytes
Requests per second: 1.00 [#/sec] (mean)
Time per request: 1000.817 [ms] (mean)
Time per request: 1000.817 [ms] (mean, across all concurrent requests)
Transfer rate: 0.04 [Kbytes/sec] received
Main Loop
Blocking Single Threaded Server
How To Address This?
Concurrency!
Multiprocessing
Multithreading
Event Based
How To Address This?
Concurrency!
Concurrency: decomposability of a problem into order independent or partially ordered units.
i.e. chunks of work can happen independently of each other.
Where possible, we also like them to happen at the same time, or to at least look like they happen at the same time.
Main Loop
Multiprocessing
Just run a bunch of blocking servers, and have something else distribute and balance the load to them.
Load Balancer
Blocking Server
Blocking Server
Blocking Server
CONCURRENCY! WINNING!
Main Loop
Multiprocessing
Just run a bunch of blocking servers, and have something else distribute and balance the load to them.
Pros
- Simple to implement.
- Performance can still be quite good.
Cons
- Managing processes can be complex.
- Limited sharing of resources can be expensive.
Main Loop
--- kiss_slow.rb 2016-05-01 16:08:31.422044736 -0400
+++ kiss_multiprocessing.rb 2016-05-01 16:45:13.238012734 -0400
@@ -11,6 +11,8 @@
 def run( host = '0.0.0.0', port = '8080' )
   server = TCPServer.new( host, port )
 
+  fork_it
+
   while connection = server.accept
     request = get_request connection
     response = handle request
@@ -20,6 +22,18 @@
   end
 end
 
+def fork_it( process_count = 9 )
+  pid = nil
+  process_count.times do
+    if pid = fork
+      Process.detach( pid )
+    else
+      break
+    end
+  end
+
+end
+
 def get_request connection
   r = ''
   while line = connection.gets
Listen on a port, then fork.
Child processes share the open listening port, and the OS load balances connections across them.
YMMV depending on OS.
Multiprocessing Simple Blocking Server
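Boiled down to its essentials, the listen-then-fork approach looks something like the sketch below. This is a hypothetical minimal example, not the Scrawls multiprocess engine; the worker count, port, and canned response are placeholders.

```ruby
require 'socket'

server = TCPServer.new('0.0.0.0', 8080)   # the parent opens the listening socket once

pids = 4.times.map do
  fork do
    loop do
      connection = server.accept           # every child blocks in accept on the shared socket
      connection.gets                      # read (and ignore) the request line
      connection.write "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nOK"
      connection.close
    end
  end
end

pids.each { |pid| Process.detach(pid) }
sleep                                      # the parent idles; the kernel balances accepts across children
```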
Main Loop
Multiprocessing Simple Blocking Server
scrawls --ioengine multiprocess --processes 8
Single threaded blocking IO, across 8 processes
ab -n 100000 -c 100 http://127.0.0.1:8080/test.txt
Document Path: /test.txt
Document Length: 1078 bytes
Concurrency Level: 100
Time taken for tests: 5.463 seconds
Complete requests: 100000
Failed requests: 0
Total transferred: 126000000 bytes
HTML transferred: 107800000 bytes
Requests per second: 18305.76 [#/sec] (mean)
Time per request: 5.463 [ms] (mean)
Time per request: 0.055 [ms] (mean, across all concurrent requests)
Transfer rate: 22524.67 [Kbytes/sec] received
Main Loop
Multiprocessing Simple Blocking Server
scrawls --ioengine multiprocess --processes 8 --app slow.rb
Single threaded blocking IO, across 8 processes, slow app
ab -n 80 -c 8 http://127.0.0.1:8080/slowtest.txt
Document Path: /slowtest.txt
Document Length: 16 bytes
Concurrency Level: 8
Time taken for tests: 10.017 seconds
Complete requests: 80
Failed requests: 0
Non-2xx responses: 80
Total transferred: 3520 bytes
HTML transferred: 1280 bytes
Requests per second: 7.99 [#/sec] (mean)
Time per request: 1001.689 [ms] (mean)
Time per request: 125.211 [ms] (mean, across all concurrent requests)
Transfer rate: 0.34 [Kbytes/sec] received
Main Loop
Multiprocessing Simple Blocking Server
scrawls --ioengine multiprocess --processes 32 --app slow.rb
Single threaded blocking IO, across 32 processes, slow app
ab -n 320 -c 32 http://127.0.0.1:8080/slowtest.txt
Document Path: /slowtest.txt
Document Length: 16 bytes
Concurrency Level: 32
Time taken for tests: 10.042 seconds
Complete requests: 320
Failed requests: 0
Non-2xx responses: 320
Total transferred: 14080 bytes
HTML transferred: 5120 bytes
Requests per second: 31.86 [#/sec] (mean)
Time per request: 1004.248 [ms] (mean)
Time per request: 31.383 [ms] (mean, across all concurrent requests)
Transfer rate: 1.37 [Kbytes/sec] received
Main Loop
Multiprocessing Simple Blocking Server
Just like shopping: more checkout lines means everyone gets through more quickly.
Main Loop
Multiprocessing Simple Blocking Server
But....it uses a lot of resources:
root 14101 0.1 0.1 2162040 20124 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14103 0.0 0.1 66256 16460 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14107 0.0 0.1 133848 16492 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14111 0.0 0.1 201440 16516 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14115 0.0 0.1 269032 16548 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14119 0.0 0.1 336748 16560 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14123 0.0 0.1 404348 16584 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14127 0.0 0.1 471944 16608 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14131 0.0 0.1 539544 16672 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14135 0.0 0.1 607136 16664 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14139 0.0 0.1 609192 16724 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14143 0.0 0.1 742320 16724 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14147 0.0 0.1 809912 16756 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14151 0.0 0.1 877504 16784 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14155 0.0 0.1 945228 16812 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14159 0.0 0.1 1012824 16844 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14163 0.0 0.1 1080420 16788 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14167 0.0 0.1 1148012 16876 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14171 0.0 0.1 1215604 16932 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14175 0.0 0.1 1283196 17084 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14179 0.0 0.1 1350788 16988 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14183 0.0 0.1 1418380 16984 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14187 0.0 0.1 1485972 17044 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14191 0.0 0.1 1553564 17040 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14195 0.0 0.1 1621156 17068 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14199 0.0 0.1 1688748 17096 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14203 0.0 0.1 1756340 17124 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14207 0.0 0.1 1824080 17184 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14211 0.0 0.1 1891672 17180 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14215 0.0 0.1 1959264 17208 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14219 0.0 0.1 2026856 17236 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14223 0.0 0.1 2094448 17296 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
Main Loop
Multiprocessing Simple Blocking Server
With slow requests, multiprocessing with blocking servers can still feel like waiting in one long, slow line.
Main Loop
Multiprocessing Simple Blocking Server
For most requests, each process still spends most of its time waiting on something else to happen.
This can use a lot of RAM while still not being able to handle many slow requests at once because each slow request still blocks an entire process.
Fortunately.....
Main Loop
Ruby pre 2.0 was very copy-on-write unfriendly.
Multiprocessing consumed large amounts of RAM.
Modern Rubies are more resource friendly when forking.
\_ ruby bin/scrawls --http 2162040 20124
\_ ruby bin/scrawls -- 66256 16460
\_ ruby bin/scrawls -- 133848 16492
\_ ruby bin/scrawls -- 201440 16516
\_ ruby bin/scrawls -- 269032 16548
\_ ruby bin/scrawls -- 336748 16560
\_ ruby bin/scrawls -- 404348 16584
\_ ruby bin/scrawls -- 471944 16608
\_ ruby bin/scrawls -- 539544 16672
\_ ruby bin/scrawls -- 607136 16664
\_ ruby bin/scrawls -- 609192 16724
\_ ruby bin/scrawls -- 742320 16724
\_ ruby bin/scrawls -- 809912 16756
\_ ruby bin/scrawls -- 877504 16784
\_ ruby bin/scrawls -- 945228 16812
\_ ruby bin/scrawls -- 1012824 16844
\_ ruby bin/scrawls -- 1080420 16788
\_ ruby bin/scrawls -- 1148012 16876
\_ ruby bin/scrawls -- 1215604 16932
\_ ruby bin/scrawls -- 1283196 17084
\_ ruby bin/scrawls -- 1350788 16988
\_ ruby bin/scrawls -- 1418380 16984
\_ ruby bin/scrawls -- 1485972 17044
\_ ruby bin/scrawls -- 1553564 17040
\_ ruby bin/scrawls -- 1621156 17068
\_ ruby bin/scrawls -- 1688748 17096
\_ ruby bin/scrawls -- 1756340 17124
\_ ruby bin/scrawls -- 1824080 17184
\_ ruby bin/scrawls -- 1891672 17180
\_ ruby bin/scrawls -- 1959264 17208
\_ ruby bin/scrawls -- 2026856 17236
\_ ruby bin/scrawls -- 2094448 17296
Multiprocessing
Simple Blocking Server
Copy On Write
The Abbreviated Version
When a process is forked, the OS shares memory pages between the parent and the child.
If the child doesn't change anything, those shared pages never have to be duplicated.
Pre-2.x MRI Rubies touched every object during the mark phase of garbage collection, forcing the OS to make private copies of those pages. As a result, forking was very expensive. Modern MRI Rubies (2.0+, with bitmap marking GC) behave much better.
Main Loop
Multithreading
A thread is the smallest sequence of instructions that can be managed independently by the scheduler. Multiple threads will share one process's memory.
Pros
- Easier to manage/load-balance in a single piece of software.
- Threads are lightweight, so resource usage is generally better.
- Can be very performant.
Cons
- Threading implementations vary a lot across Ruby implementations.
- Locking issues on shared resources can be complicated. i.e. Threads are difficult and it's easy to make mistakes!
Main Loop
Multithreaded Server
Programming with threads can easily be a talk all by itself. A few quick guides and tutorials:
Main Loop
Multithreaded Server
--- server_slow.rb 2016-05-01 16:08:31.422044736 -0400
+++ server_multithreaded.rb 2016-05-01 21:56:47.997815573 -0400
@@ -11,12 +11,14 @@
 def run( host = '0.0.0.0', port = '8080' )
   server = TCPServer.new( host, port )
 
-  while connection = server.accept
-    request = get_request connection
-    response = handle request
+  while con = server.accept
+    Thread.new( con ) do |connection|
+      request = get_request connection
+      response = handle request
 
-    connection.write response
-    connection.close
+      connection.write response
+      connection.close
+    end
   end
 end
A simple, naive implementation: a new thread for every request, assuming everything else just works.
Main Loop
Multithreaded Server
scrawls --ioengine multithread
Multithreaded IO, single process
ab -n 100000 -c 100 http://127.0.0.1:8080/test.txt
Document Path: /test.txt
Document Length: 1078 bytes
Concurrency Level: 100
Time taken for tests: 33.832 seconds
Complete requests: 100000
Failed requests: 0
Total transferred: 126000000 bytes
HTML transferred: 107800000 bytes
Requests per second: 2955.78 [#/sec] (mean)
Time per request: 33.832 [ms] (mean)
Time per request: 0.338 [ms] (mean, across all concurrent requests)
Transfer rate: 3637.00 [Kbytes/sec] received
Main Loop
Multithreaded Server
scrawls --ioengine multithread
Multithreaded IO, single process
ab -n 100000 -c 100 http://127.0.0.1:8080/test.txt
Requests per second: 2955.78
For fast requests, the overhead of a thread per request hurts speed on Ruby 2.3.1.
A more sophisticated implementation would use a fixed size pool of threads. The scrawls-ioengine-multithread gem doesn't support thread pools yet, though.
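For illustration, a fixed-size pool could look something like the hypothetical sketch below; this is not the scrawls-ioengine-multithread code, and the pool size, port, and canned response are made up.

```ruby
require 'socket'

POOL_SIZE = 16
jobs   = Queue.new                        # thread-safe queue of accepted connections
server = TCPServer.new('0.0.0.0', 8080)

POOL_SIZE.times do
  Thread.new do
    loop do
      connection = jobs.pop               # block until the acceptor hands over a connection
      connection.gets                     # read (and ignore) the request line
      connection.write "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nOK"
      connection.close
    end
  end
end

loop { jobs << server.accept }            # the main thread only accepts; workers do the rest
```

Because the threads are created once, fast requests no longer pay the per-request thread spawn cost.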
Main Loop
Multithreaded Server
scrawls --ioengine multithread --app slow.rb
Multithreaded IO, single process, slow app
ab -n 1000 -c 100 http://127.0.0.1:8080/slowtest.txt
Document Path: /slowtest.txt
Document Length: 16 bytes
Concurrency Level: 100
Time taken for tests: 10.073 seconds
Complete requests: 1000
Failed requests: 0
Non-2xx responses: 1000
Total transferred: 44000 bytes
HTML transferred: 16000 bytes
Requests per second: 99.28 [#/sec] (mean)
Time per request: 1007.302 [ms] (mean)
Time per request: 10.073 [ms] (mean, across all concurrent requests)
Transfer rate: 4.27 [Kbytes/sec] received
Main Loop
Multithreaded Server
Concurrent Requests | Requests per Second |
---|---|
10 | 9.98 |
20 | 19.95 |
50 | 49.79 |
100 | 99.28 |
200 | 196.65 |
1000 | 778.03 |
Slow requests scale pretty well with threads. There are diminishing returns as the thread count gets high without a thread pool, but it's not bad for such a trivial implementation.
Main Loop
Multithreaded Server
scrawls --ioengine multithread --processes 8
Multithreaded IO, multiple processes
ab -n 100000 -c 100 http://127.0.0.1:8080/test.txt
Document Path: /test.txt
Document Length: 1078 bytes
Concurrency Level: 100
Time taken for tests: 10.495 seconds
Complete requests: 100000
Failed requests: 0
Total transferred: 126000000 bytes
HTML transferred: 107800000 bytes
Requests per second: 9528.76 [#/sec] (mean)
Time per request: 10.495 [ms] (mean)
Time per request: 0.105 [ms] (mean, across all concurrent requests)
Transfer rate: 11724.84 [Kbytes/sec] received
Main Loop
Multithreaded Server
scrawls --ioengine multithread --processes 8
Multithreaded IO, multiple processes
ab -n 100000 -c 100 http://127.0.0.1:8080/test.txt
Spreading requests across multiple processes, each of which is multithreading, mitigates some of the losses from the Global Interpreter Lock within a single process on Ruby 2.3.1.
Main Loop
Multithreaded Server
scrawls --ioengine multithread --processes 8 --app slow.rb
Multithreaded IO, multiple processes, slow app
ab -n 10000 -c 1000 http://127.0.0.1:8080/slowtest.txt
Document Path: /slowtest.txt
Document Length: 16 bytes
Concurrency Level: 1000
Time taken for tests: 11.199 seconds
Complete requests: 10000
Failed requests: 0
Non-2xx responses: 10000
Total transferred: 440000 bytes
HTML transferred: 160000 bytes
Requests per second: 892.95 [#/sec] (mean)
Time per request: 1119.888 [ms] (mean)
Time per request: 1.120 [ms] (mean, across all concurrent requests)
Transfer rate: 38.37 [Kbytes/sec] received
Main Loop
Event Driven Server
"Event Driven" is a vague label, encompassing numerous patterns and feature sets. One of the most common of these patterns is the Reactor pattern.
The Reactor pattern describes a system that handles asynchronous events, but that does so with synchronous event callbacks.
Main Loop
Event Driven Server
Client/Server interactions are often slow, but most of that time is spent waiting on latencies. CPUs are fast. The rest of the world is pretty slow.
Main Loop
Event Driven Server
An event reactor just spins in a loop, waiting for something to happen - such as a network connection, or data to read or to write.
When it does, an event is triggered to deal with it.
Events block the reactor.
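In pure Ruby, the core of such a reactor can be sketched with IO.select. This is a toy illustration of the pattern under those assumptions, not SimpleReactor or EventMachine; real reactors add epoll/kqueue, buffering, and timers.

```ruby
require 'socket'

server  = TCPServer.new('0.0.0.0', 8080)
clients = []

loop do
  # Wait until the listening socket or any client has something for us.
  readable, = IO.select([server] + clients)
  readable.each do |io|
    if io == server
      clients << server.accept                 # "connection" event
    else
      request = io.gets                        # "readable" event; the callback runs synchronously
      io.write "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nOK" if request
      io.close
      clients.delete(io)
    end
  end
end
```

Everything inside the loop runs synchronously, which is exactly why a slow callback stalls every other connection.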
Main Loop
Event Driven Server
Pros
- Can be very fast and resource friendly.
- With an appropriate underlying event notification facility, can scale to thousands of simultaneous connections.
Cons
- Slow callbacks block the reactor.
- Callback structures often used with evented programming can be confusing and difficult to test and debug.
Main Loop
Event Driven Server
Like Threading, this could easily be a talk all by itself. A few resources for further reading:
Main Loop
Event Driven Server
Many ways to do it, including EventMachine, Celluloid::IO, or even a simple pure Ruby event framework (SimpleReactor).
Currently, there is a simple event-based IO Engine for Scrawls, which is based on SimpleReactor, a pure Ruby Reactor implementation.
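For comparison, the same "canned response" server written against EventMachine looks roughly like this sketch; the handler module name and response are placeholders, not Scrawls code.

```ruby
require 'eventmachine'

module CannedResponse
  # Called by the reactor whenever bytes arrive on this connection.
  # A real server would feed 'data' to an HTTP parser and respond only
  # once a complete request has been received.
  def receive_data(data)
    send_data "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nOK"
    close_connection_after_writing
  end
end

EventMachine.run do
  EventMachine.start_server('0.0.0.0', 8080, CannedResponse)
end
```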
Main Loop
Event Driven Server
scrawls --ioengine simplereactor
Evented IO, single process
ab -n 100000 -c 100 http://127.0.0.1:8080/test.txt
Document Path: /test.txt
Document Length: 1078 bytes
Concurrency Level: 100
Time taken for tests: 23.264 seconds
Complete requests: 100000
Failed requests: 0
Total transferred: 126000000 bytes
HTML transferred: 107800000 bytes
Requests per second: 4298.41 [#/sec] (mean)
Time per request: 23.264 [ms] (mean)
Time per request: 0.233 [ms] (mean, across all concurrent requests)
Transfer rate: 5289.06 [Kbytes/sec] received
Main Loop
Event Driven Server
scrawls --ioengine simplereactor --app slow.rb
Evented IO, single process, slow app
ab -n 20 -c 10 http://127.0.0.1:8080/slowtest.txt
Document Path: /slowtest.txt
Document Length: 16 bytes
Concurrency Level: 10
Time taken for tests: 20.017 seconds
Complete requests: 20
Failed requests: 0
Non-2xx responses: 20
Total transferred: 880 bytes
HTML transferred: 320 bytes
Requests per second: 1.00 [#/sec] (mean)
Time per request: 10008.391 [ms] (mean)
Time per request: 1000.839 [ms] (mean, across all concurrent requests)
Transfer rate: 0.04 [Kbytes/sec] received
Main Loop
Event Driven Server
It looks a lot like a blocking single threaded server.
However, where events shine is in dealing with high communications latencies.
They don't block on IO, so an efficient reactor can service large numbers of high latency connections efficiently.
Main Loop
Event Driven Server
Evented IO handling keeps things happy at high concurrencies, even across the country. For example, this is from an Engine Yard AWS Oregon instance talking to a non-AWS VM on the east coast.
Document Path: /server.rb
Document Length: 1844 bytes
Concurrency Level: 1000
Time taken for tests: 3.394 seconds
Complete requests: 10000
Failed requests: 0
Write errors: 0
Total transferred: 19907832 bytes
HTML transferred: 18465816 bytes
Requests per second: 2946.48 [#/sec] (mean)
Time per request: 339.387 [ms] (mean)
Time per request: 0.339 [ms] (mean, across all concurrent requests)
Transfer rate: 5728.33 [Kbytes/sec] received
Main Loop
Event Driven Server
Concurrent Requests | Requests per Second |
---|---|
10 | 3888 |
50 | 3814 |
250 | 3651 |
1000 | 3609 |
2000 | 3261 |
10000 | 2740 |
And if your event notification framework is up to it, concurrency can scale nicely.
Fast Responses
Main Loop
Event Driven Server - Hybridization!
Main Loop
Event Driven Server - Hybridization!
Evented IO can also mix well with threading, letting the slow stuff be slow, while not blocking the reactor from spinning and servicing things that are ready to be serviced.
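As a rough sketch of that hybrid (hypothetical, not the Scrawls engine): the select loop keeps accepting and reading, while slow work is handed to a small pool of threads that write the responses when they finish.

```ruby
require 'socket'

server  = TCPServer.new('0.0.0.0', 8080)
slow    = Queue.new
clients = []

# A small pool of worker threads for the slow stuff.
4.times do
  Thread.new do
    loop do
      connection, _request = slow.pop
      sleep 1                                  # stand-in for a slow app call
      connection.write "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nOK"
      connection.close
    end
  end
end

# The reactor never does the slow work itself, so it keeps spinning.
loop do
  readable, = IO.select([server] + clients)
  readable.each do |io|
    if io == server
      clients << server.accept
    else
      clients.delete(io)
      slow << [io, io.gets]                    # hand the connection and request off to the pool
    end
  end
end
```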
Main Loop
Event Driven Server
Concurrent Requests | Requests per Second |
---|---|
10 | 8 |
50 | 30 |
250 | 181 |
500 | 340 |
1000 | 511 |
2000 | 854 |
10000 | 713 |
Slow (1 second delayed) Responses
Main Loop
Event Driven Server
Reactor/Event IO combined with threading is a great combination, if you are willing to deal with the complexity of implementation.
Even a pure Ruby reactor on Ruby 2.3 is pretty fast.
Parsing HTTP - A Quick Note
The very simple examples so far have cheated with this.
HTTPRecognizer is being used to handle HTTP. It uses regular expressions. Going too far down that path will drive you to madness and sorrow.
Parsing HTTP
You can't parse [X]HTML with regex. Because HTML can't be parsed by regex. Regex is not a tool that can be used to correctly parse HTML. As I have answered in HTML-and-regex questions here so many times before, the use of regex will not allow you to consume HTML. Regular expressions are a tool that is insufficiently sophisticated to understand the constructs employed by HTML. HTML is not a regular language and hence cannot be parsed by regular expressions. Regex queries are not equipped to break down HTML into its meaningful parts. so many times but it is not getting to me. Even enhanced irregular regular expressions as used by Perl are not up to the task of parsing HTML. You will never make me crack. HTML is a language of sufficient complexity that it cannot be parsed by regular expressions. Even Jon Skeet cannot parse HTML using regular expressions. Every time you attempt to parse HTML with regular expressions, the unholy child weeps the blood of virgins, and Russian hackers pwn your webapp. Parsing HTML with regex summons tainted souls into the realm of the living. HTML and regex go together like love, marriage, and ritual infanticide. The <center> cannot hold it is too late. The force of regex and HTML together in the same conceptual space will destroy your mind like so much watery putty. If you parse HTML with regex you are giving in to Them and their blasphemous ways which doom us all to inhuman toil for the One whose Name cannot be expressed in the Basic Multilingual Plane, he comes. HTML-plus-regexp will liquify the nerves of the sentient whilst you observe, your psyche withering in the onslaught of horror. Rege̿̔̉x-based HTML parsers are the cancer that is killing StackOverflow it is too late it is too late we cannot be saved the trangession of a chi͡ld ensures regex will consume all living tissue (except for HTML which it cannot, as previously prophesied) dear lord help us how can anyone survive this scourge using regex to parse HTML has doomed humanity to an eternity of dread torture and security holes using regex as a tool to process HTML establishes a breach between this world and the dread realm of c͒ͪo͛ͫrrupt entities (like SGML entities, but more corrupt) a mere glimpse of the world of regex parsers for HTML will instantly transport a programmer's consciousness into a world of ceaseless screaming, he comes, the pestilent slithy regex-infection will devour your HTML parser, application and existence for all time like Visual Basic only worse he comes he comes do not fight he com̡e̶s, ̕h̵is un̨ho͞ly radiańcé destro҉ying all enli̍̈́̂̈́ghtenment, HTML tags lea͠ki̧n͘g fr̶ǫm ̡yo͟ur eye͢s̸ ̛l̕ik͏e liquid pain, the song of re̸gular expression parsing will extinguish the voices of mortal man from the sphere I can see it can you see ̲͚̖͔̙î̩́t̲͎̩̱͔́̋̀ it is beautiful the final snuf
fing of the lies of Man ALL IS LOŚ͖̩͇̗̪̏̈́T ALL IS LOST the pon̷y he comes he c̶̮omes he comes the ichor permeates all MY FACE MY FACE ᵒh god no NO NOO̼OO NΘ stop the an*̶͑̾̾̅ͫ͏̙̤g͇̫͛͆̾ͫ̑͆l͖͉̗̩̳̟̍ͫͥͨe̠̅s ͎a̧͈͖r̽̾̈́͒͑e
not rè̑ͧ̌aͨl̘̝̙̃ͤ͂̾̆ ZA̡͊͠͝LGΌ ISͮ̂҉̯͈͕̹̘̱ TO͇̹̺ͅƝ̴ȳ̳ TH̘Ë͖́̉ ͠P̯͍̭O̚N̐Y̡ H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ
With Regular Expressions
Parsing HTTP
With Regular Expressions
For limited, specific pieces of information, you can get away with it. People may hate you, but you can do it. Future sanity is not guaranteed, however.
HTTP is so complicated that this approach is madness for most purposes. Use a real parser.
Scrawls doesn't have an HTTP engine built on a real parser yet. Sorry.
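As a small example of leaning on a real parser instead of regexes, Ruby's bundled WEBrick can parse a raw request for you. This is just an illustration of the idea; the raw request string is made up, and Scrawls itself does not ship such an engine yet.

```ruby
require 'webrick'
require 'stringio'

raw = "GET /test.txt HTTP/1.1\r\nHost: example.com\r\nAccept: */*\r\n\r\n"

request = WEBrick::HTTPRequest.new(WEBrick::Config::HTTP)
request.parse(StringIO.new(raw))   # an actual socket works here, too

request.request_method   # => "GET"
request.path             # => "/test.txt"
request['Host']          # => "example.com"
```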
TL;DR
Blocking Single Threaded
++ Simple to implement.
-- Slow CPU or IO blocks the server.
-- Potentially underutilizes resources.

Event Driven
++ The reactor doesn't block on IO.
++ An event framework like EventMachine can permit very high concurrent connection counts.
-- CPU bound tasks block the reactor, leading to results similar to a simple blocking single threaded server.

Multiprocess
++ Easy to fork to expand capacity.
++ With sufficient resources, can be fast enough, even for slow, blocking tasks.
-- Increased RAM footprint for multiple processes.
-- Can be a hassle to manage pools of processes.

Multithreaded
++ Easy to spawn new threads.
++ Can be fast, particularly for slow actions.
-- Potential for performance losses for fast requests.
-- With complex actions, threading can be difficult to get right when shared resources are involved.
Ruby Web Servers
A Quick Survey
- WEBrick
- Mongrel
- Thin
- Puma
- Passenger
- Unicorn
- ServerEngine
- Rainbows!
- Yahns
- Goliath
- Swiftiply
WEBrick
• Pure Ruby
• Thread based design
• Written in 2000 by Masayoshi Takahashi and Yuuzou Gotou.
• Ubiquitous, as it is bundled with Ruby itself.
• Flexible, fairly featureful, and easy to use.
• Fairly well documented.
• Fairly slow, but maybe fast enough...
ab -n 10000 -c 4 http://127.0.0.1:8080/test.txt
Requests per second: 450.30 [#/sec] (mean)
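For reference, standing up WEBrick directly (rather than through `ruby -run -e httpd`) takes only a few lines; the port and document root below are arbitrary choices for the sketch.

```ruby
require 'webrick'

server = WEBrick::HTTPServer.new(Port: 8080, DocumentRoot: '.')
trap('INT') { server.shutdown }   # Ctrl-C shuts the server down cleanly
server.start
```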
Mongrel
• Zed Shaw, 2006
• Ruby plus a C extension for parsing HTTP, built with Ragel
• Thread based
• Moderately fast for its age
• EOL at 1.1.5 in a version that doesn't work with modern Rubies
• There is a 1.2.x version (gem install --pre mongrel) that does work with modern Rubies
• Completely unmaintained, but it is interesting code to look at and learn from. It inspired several other Ruby web servers.
Swiftiply
• My baby. May 2007
• Event Driven via EventMachine
• Originally used regex for HTTP parsing....
• Structurally has support for real parsing, but needs work
• Intended to be a load balancing reverse proxy with a twist, with light web serving capabilities (static files, mostly)
• Very fast
• AFAIK, at least 100 production sites still use it
• I take pull requests. :)
Thin
• Marc-André Cournoyer, 2008
• Mongrel HTTP Parser
• Event Driven via EventMachine
• Rack interface
• Pretty fast. Still pretty commonly used
Goliath
• PostRank Labs, 2011
• Event Driven via EventMachine
• Completely asynchronous design, leveraging Ruby fibers to unwind callback complexity
• Performance and resource focused
• Niche usage; interesting use of Fibers
Puma
• Evan Phoenix, 2011
• Built on the bones of Mongrel
• Built from the ground up with concurrency in mind
-- This means threaded with a thread pool
• Rack
• Runs on all major Ruby implementations (MRI, JRuby, Rubinius)
• If you use Thin, take a look at Puma
Passenger
• Phusion Passenger
• Heavy Rails world usage
• Directly integrates into Apache or Nginx
• New version includes a fast purpose built web server, Raptor
• Rack
• Commercial version has more multithreading/concurrency options than the open source version does.
-- Open source is single threaded multiprocess only
-- Commercial version can be configured for multithreaded multiprocess concurrency.
Unicorn
• Built on Mongrel's bones by Eric Wong circa 2009
• fork() (i.e. built in multiprocessing) oriented
• Not the fastest, but it deals well with slow/variable requests
• Very mature, and very heavy utilization in the Rails world
• With modern copy-on-write friendly Rubies, can have reasonably nice resource utilization
Rainbows!
• Unicorn specifically tuned for those very big, very slow requests
Yahns
• Another in the Unicorn family
• Tuned for apps that receive very little traffic
• When idle, it is truly idle
• Very sensitive to app failures
ServerEngine
• Everything said about Unicorn is basically applicable here
• Works on Windows and JRuby, whereas Unicorn does not
• Multiprocess design
• Ritta Narita (@narittan) gave a nice talk about it yesterday
http://rubykaigi.org/2016/presentations/narittan.html
Thank You!
Let's build some stuff. Ask me questions. Tell me what you want to know. If you ask me a question, you get an Engine Yard T-Shirt!
Thanks to Engine Yard
Your Trusted DevOps.
Over 1 Billion AWS Hours.
With Engine Yard Cloud, do what you do best—developing Ruby on Rails, Node.js, or PHP apps—while we do what we do best—ensuring your environment runs smoothly. You can be as hands on or hands off with AWS as you want.
Start developing your apps on our AWS account or yours for free today.
We’ve deployed it all....