Kirk Haines
khaines@engineyard.com
RubyKaigi 2016
Web Server Concurrency
https://github.com/engineyard/rubykaigi2016-concurrency
Kirk Haines
• Rubyist since 2001
• First professional Ruby web app in 2002
• Dozens of sites and apps, and several web servers since
• Engine Yard since 2008
• Former Ruby 1.8.6 maintainer (EOL announced at RubyKaigi 2011!!!)
khaines@engineyard.com
@wyhaines
What Is a Web Server?
A web server is an information technology that processes requests via HTTP, the basic network protocol used to distribute information on the World Wide Web.
https://en.wikipedia.org/wiki/Web_server
A web server is just a server that accepts HTTP requests, and returns HTTP responses.
"A Shell Server for HTTP"
(slightly modified from the original)
http://info.cern.ch/hypertext/WWW/Provider/ShellScript.html
#! /bin/sh
read get docid junk
cat `echo "$docid" | \
ruby -p -e '$_.gsub!(/^\/ [\r\n]/,"").chomp'`
That's Terrifying
i.e. Don't Actually Do This
(super_simple.sh)
A (scary, limited, blocking, single threaded) web server!
#! /bin/sh
read get docid junk
cat `echo "$docid" | \
ruby -p -e '$_.gsub!(/^\/ [\r\n]/,"").chomp'`
netcat -l -p 5000 -e ./super_simple.sh
If you know the perl language, then that is a powerful (if otherwise incomprehensible) language with which to hack together a server.
-- http://info.cern.ch/hypertext/WWW/Provider/ShellScript.html
And if that isn't endorsement enough....
Web Server Architecture
There is nothing special about web server architecture.
It is just server architecture.
A web server is nothing more than a server that receives HTTP requests and returns HTTP responses, typically through standard socket communications.
require 'socket'

def run
  simple_server = TCPServer.new("0.0.0.0", 8080)
  loop do
    connection = simple_server.accept
    handle connection
  end
end

def get_request connection
  connection.gets
end

def process request
  "OK"
end

def handle connection
  request = get_request connection
  response = process request
  connection.puts response
  connection.close
end

run
Basic TCP Server
Short Tangent
Simplest Ruby Web Server....
ruby -run -e httpd -- -p 8080 .
OK....That's kind of cheating.
# frozen_string_literal: false
#
# = un.rb
#
# Copyright (c) 2003 WATANABE Hirofumi <eban@ruby-lang.org>
#
# This program is free software.
# You can distribute/modify this program under the same terms of Ruby.
#
# == Utilities to replace common UNIX commands in Makefiles etc
#
# == SYNOPSIS
#
# ruby -run -e cp -- [OPTION] SOURCE DEST
# ruby -run -e ln -- [OPTION] TARGET LINK_NAME
# ruby -run -e mv -- [OPTION] SOURCE DEST
# ruby -run -e rm -- [OPTION] FILE
# ruby -run -e mkdir -- [OPTION] DIRS
# ruby -run -e rmdir -- [OPTION] DIRS
# ruby -run -e install -- [OPTION] SOURCE DEST
# ruby -run -e chmod -- [OPTION] OCTAL-MODE FILE
# ruby -run -e touch -- [OPTION] FILE
# ruby -run -e wait_writable -- [OPTION] FILE
# ruby -run -e mkmf -- [OPTION] EXTNAME [OPTION]
# ruby -run -e httpd -- [OPTION] DocumentRoot
# ruby -run -e help [COMMAND]
Short Tangent
def httpd
  setup("", "BindAddress=ADDR", "Port=PORT", "MaxClients=NUM", "TempDir=DIR",
        "DoNotReverseLookup", "RequestTimeout=SECOND", "HTTPVersion=VERSION") do
    |argv, options|
    require 'webrick'
    opt = options[:RequestTimeout] and options[:RequestTimeout] = opt.to_i
    [:Port, :MaxClients].each do |name|
      opt = options[name] and (options[name] = Integer(opt)) rescue nil
    end
    options[:Port] ||= 8080 # HTTP Alternate
    options[:DocumentRoot] = argv.shift || '.'
    s = WEBrick::HTTPServer.new(options)
    shut = proc {s.shutdown}
    siglist = %w"TERM QUIT"
    siglist.concat(%w"HUP INT") if STDIN.tty?
    siglist &= Signal.list.keys
    siglist.each do |sig|
      Signal.trap(sig, shut)
    end
    s.start
  end
end
OK.... Server Architecture
• Setup
• Command line arguments
• Configuration
• Listen on socket(s)
• Enter main loop
• Accept socket connection
• Handle connection
Listen on Sockets
AKA Network Communications
A server needs a way to receive requests and to return responses.
The two most common options:
- Native Ruby Networking Support (TCPServer and friends)
- EventMachine
Listen on Sockets
Native Ruby Networking Support
Ruby has a rich set of networking libraries, making it easy to write TCP clients and servers.
require 'socket'

class SimpleServer < TCPServer
  def initialize( address, port )
    super( address, port )
  end

  def run( address = '127.0.0.1', port = 80 )
    server = TCPServer.new(address, port)
    loop do
      socket = server.accept
      handle_request( socket.gets )
    end
  end

  def handle_request( req )
    # Do Stuff with req
  end
end
A Simple Ruby Web Server
Scrawls
# gem install scrawls
( Scrawls is currently woefully incomplete; send me PRs! )
http://github.com/wyhaines/scrawls
Scrawls
Pluggable IO engines
Pluggable HTTP parsing
A shared core makes it easy to see the impact of different concurrency options -- only the concurrency implementation changes between runs.
Scrawls
scrawls [OPTIONS]
scrawls is a simple ruby web server.
-h, --help:
Show this help.
-d DIR, --docroot DIR:
Provide a specific directory for the docroot for this server.
-a APP, --app APP:
Ruby file containing a rack app to use.
-i IO_ENGINE, --ioengine IO_ENGINE:
Tell the webserver which concurrency engine to use.
Installed IO Engines:
multiprocess
multithread
simplereactor
single
-h HTTP_ENGINE, --httpengine HTTP_ENGINE:
Tell the webserver which HTTP parsing engine to use.
Installed HTTP Engines:
httprecognizer
-p PORT, --port PORT:
The port for the web server to listen on. If this flag is not used, the web
server defaults to port 80.
-b HOSTNAME, --bind HOSTNAME:
The hostname/IP to bind to. This defaults to 127.0.0.1 if it is not provided.
Scrawls
Pluggable IO engines
- Single Threaded
- Multiprocess
- Multithreaded (+ Multiprocess)
- Evented (pure Ruby implementation)
Main Loop
i.e. Concurrency Options
How a server handles concurrency is fundamental to its design.
require 'socket'

class SimpleServer < TCPServer
  def initialize( address, port )
    super( address, port )
  end

  def run( address = '127.0.0.1', port = 80 )
    server = TCPServer.new(address, port)
    loop do
      socket = server.accept
      handle_request( socket.gets )
    end
  end

  def handle_request( req )
    # Do Stuff with req
  end
end
The simplest approach is this:
- Single threaded
- Blocking -- server waits until a request is handled before accepting and handling a new one
Let's Try Some Things
Test server is an 8 core VM running Ubuntu 16.04
All examples ran with Ruby 2.3.1
Apache Bench (ab) used as a simple benchmarking tool
Single Threaded Example
scrawls --ioengine single
Single threaded, blocking IO
ab -n 100000 -c 1 http://127.0.0.1:8080/test.txt
Document Path: /test.txt
Document Length: 1078 bytes
Concurrency Level: 1
Time taken for tests: 27.673 seconds
Complete requests: 100000
Failed requests: 0
Total transferred: 126000000 bytes
HTML transferred: 107800000 bytes
Requests per second: 3613.62 [#/sec] (mean)
Time per request: 0.277 [ms] (mean)
Time per request: 0.277 [ms] (mean, across all concurrent requests)
Transfer rate: 4446.44 [Kbytes/sec] received
Single Threaded Example
scrawls --ioengine single
Single threaded, blocking IO
3614 requests/second?
That's not too bad. Right?
Single Threaded Example
scrawls --ioengine single
Single threaded, blocking IO
The devil is in the details.
- Fast response (a small text file)
- Little network latency
Single Threaded Example
The real world is slow, though.
- Query databases
- Interact with microservices
- Generate content
- Assemble it all
- Return it over slow networks
This all takes time.
Web servers spend a lot of time waiting on other things.
Single Threaded Example
scrawls --ioengine single --app slow.rb
Single threaded blocking IO, with an app that takes one second to generate a response
ab -n 20 -c 1 http://127.0.0.1:8080/slowtest.txt
Document Path: /slowtest.txt
Document Length: 16 bytes
Concurrency Level: 1
Time taken for tests: 20.016 seconds
Complete requests: 20
Failed requests: 0
Non-2xx responses: 20
Total transferred: 880 bytes
HTML transferred: 320 bytes
Requests per second: 1.00 [#/sec] (mean)
Time per request: 1000.817 [ms] (mean)
Time per request: 1000.817 [ms] (mean, across all concurrent requests)
Transfer rate: 0.04 [Kbytes/sec] received
Main Loop
Blocking Single Threaded Server
How To Address This?
Concurrency!
Multiprocessing
Multithreading
Event Based
How To Address This?
Concurrency!
Concurrency: decomposability of a problem into order independent or partially ordered units.
i.e. chunks of work can happen independently of each other.
Where possible, we also like them to happen at the same time, or to at least look like they happen at the same time.
Main Loop
Multiprocessing
Just run a bunch of blocking servers, and have something else distribute and balance the load to them.
Load Balancer
Blocking Server
Blocking Server
Blocking Server
CONCURRENCY! WINNING!
Main Loop
Multiprocessing
Just run a bunch of blocking servers, and have something else distribute and balance the load to them.
Pros
- Simple to implement.
- Performance can still be quite good.
Cons
- Managing processes can be complex.
- Limited sharing of resources can be expensive.
Main Loop
--- kiss_slow.rb 2016-05-01 16:08:31.422044736 -0400
+++ kiss_multiprocessing.rb 2016-05-01 16:45:13.238012734 -0400
@@ -11,6 +11,8 @@
 def run( host = '0.0.0.0', port = '8080' )
   server = TCPServer.new( host, port )
 
+  fork_it
+
   while connection = server.accept
     request = get_request connection
     response = handle request
@@ -20,6 +22,18 @@
   end
 end
 
+def fork_it( process_count = 9 )
+  pid = nil
+  process_count.times do
+    if pid = fork
+      Process.detach( pid )
+    else
+      break
+    end
+  end
+
+end
+
 def get_request connection
   r = ''
   while line = connection.gets
Listen on a port, then fork.
Child processes share the open listening port, and the OS load balances connections across them.
YMMV depending on OS.
Multiprocessing Simple Blocking Server
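Boiled down to its essentials, the listen-then-fork approach looks something like the sketch below. This is a hypothetical minimal example, not the Scrawls multiprocess engine; the worker count, port, and canned response are placeholders.

```ruby
require 'socket'

server = TCPServer.new('0.0.0.0', 8080)   # the parent opens the listening socket once

pids = 4.times.map do
  fork do
    loop do
      connection = server.accept           # every child blocks in accept on the shared socket
      connection.gets                      # read (and ignore) the request line
      connection.write "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nOK"
      connection.close
    end
  end
end

pids.each { |pid| Process.detach(pid) }
sleep                                      # the parent idles; the kernel balances accepts across children
```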
Main Loop
Multiprocessing Simple Blocking Server
scrawls --ioengine multiprocess --processes 8
Single threaded blocking IO, across 8 processes
ab -n 100000 -c 100 http://127.0.0.1:8080/test.txt
Document Path: /test.txt
Document Length: 1078 bytes
Concurrency Level: 100
Time taken for tests: 5.463 seconds
Complete requests: 100000
Failed requests: 0
Total transferred: 126000000 bytes
HTML transferred: 107800000 bytes
Requests per second: 18305.76 [#/sec] (mean)
Time per request: 5.463 [ms] (mean)
Time per request: 0.055 [ms] (mean, across all concurrent requests)
Transfer rate: 22524.67 [Kbytes/sec] received
Main Loop
Multiprocessing Simple Blocking Server
scrawls --ioengine multiprocess --processes 8 --app slow.rb
Single threaded blocking IO, across 8 processes, slow app
ab -n 80 -c 8 http://127.0.0.1:8080/slowtest.txt
Document Path: /slowtest.txt
Document Length: 16 bytes
Concurrency Level: 8
Time taken for tests: 10.017 seconds
Complete requests: 80
Failed requests: 0
Non-2xx responses: 80
Total transferred: 3520 bytes
HTML transferred: 1280 bytes
Requests per second: 7.99 [#/sec] (mean)
Time per request: 1001.689 [ms] (mean)
Time per request: 125.211 [ms] (mean, across all concurrent requests)
Transfer rate: 0.34 [Kbytes/sec] received
Main Loop
Multiprocessing Simple Blocking Server
scrawls --ioengine multiprocess --processes 32 --app slow.rb
Single threaded blocking IO, across 32 processes, slow app
ab -n 320 -c 32 http://127.0.0.1:8080/slowtest.txt
Document Path: /slowtest.txt
Document Length: 16 bytes
Concurrency Level: 32
Time taken for tests: 10.042 seconds
Complete requests: 320
Failed requests: 0
Non-2xx responses: 320
Total transferred: 14080 bytes
HTML transferred: 5120 bytes
Requests per second: 31.86 [#/sec] (mean)
Time per request: 1004.248 [ms] (mean)
Time per request: 31.383 [ms] (mean, across all concurrent requests)
Transfer rate: 1.37 [Kbytes/sec] received
Main Loop
Multiprocessing Simple Blocking Server
Just like shopping: more checkout lines means everyone gets through more quickly.
Main Loop
Multiprocessing Simple Blocking Server
But....it uses a lot of resources:
root 14101 0.1 0.1 2162040 20124 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14103 0.0 0.1 66256 16460 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14107 0.0 0.1 133848 16492 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14111 0.0 0.1 201440 16516 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14115 0.0 0.1 269032 16548 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14119 0.0 0.1 336748 16560 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14123 0.0 0.1 404348 16584 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14127 0.0 0.1 471944 16608 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14131 0.0 0.1 539544 16672 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14135 0.0 0.1 607136 16664 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14139 0.0 0.1 609192 16724 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14143 0.0 0.1 742320 16724 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14147 0.0 0.1 809912 16756 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14151 0.0 0.1 877504 16784 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14155 0.0 0.1 945228 16812 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14159 0.0 0.1 1012824 16844 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14163 0.0 0.1 1080420 16788 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14167 0.0 0.1 1148012 16876 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14171 0.0 0.1 1215604 16932 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14175 0.0 0.1 1283196 17084 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14179 0.0 0.1 1350788 16988 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14183 0.0 0.1 1418380 16984 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14187 0.0 0.1 1485972 17044 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14191 0.0 0.1 1553564 17040 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14195 0.0 0.1 1621156 17068 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14199 0.0 0.1 1688748 17096 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14203 0.0 0.1 1756340 17124 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14207 0.0 0.1 1824080 17184 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14211 0.0 0.1 1891672 17180 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14215 0.0 0.1 1959264 17208 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14219 0.0 0.1 2026856 17236 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
root 14223 0.0 0.1 2094448 17296 pts/1 Sl+ 17:45 0:00 ruby bin/scrawls --httpengine=httprecognizer --ioengine=multiprocess --processes=32 --app=slow.rb
Main Loop
Multiprocessing Simple Blocking Server
With slow requests, multiprocessing with blocking servers can still feel like waiting in one long, slow line.
Main Loop
Multiprocessing Simple Blocking Server
For most requests, each process still spends most of its time waiting on something else to happen.
This can use a lot of RAM while still not being able to handle many slow requests at once because each slow request still blocks an entire process.
Fortunately.....
Main Loop
Ruby pre 2.0 was very copy-on-write unfriendly.
Multiprocessing consumed large amounts of RAM.
Modern Rubies are more resource friendly when forking.
\_ ruby bin/scrawls --http 2162040 20124
\_ ruby bin/scrawls -- 66256 16460
\_ ruby bin/scrawls -- 133848 16492
\_ ruby bin/scrawls -- 201440 16516
\_ ruby bin/scrawls -- 269032 16548
\_ ruby bin/scrawls -- 336748 16560
\_ ruby bin/scrawls -- 404348 16584
\_ ruby bin/scrawls -- 471944 16608
\_ ruby bin/scrawls -- 539544 16672
\_ ruby bin/scrawls -- 607136 16664
\_ ruby bin/scrawls -- 609192 16724
\_ ruby bin/scrawls -- 742320 16724
\_ ruby bin/scrawls -- 809912 16756
\_ ruby bin/scrawls -- 877504 16784
\_ ruby bin/scrawls -- 945228 16812
\_ ruby bin/scrawls -- 1012824 16844
\_ ruby bin/scrawls -- 1080420 16788
\_ ruby bin/scrawls -- 1148012 16876
\_ ruby bin/scrawls -- 1215604 16932
\_ ruby bin/scrawls -- 1283196 17084
\_ ruby bin/scrawls -- 1350788 16988
\_ ruby bin/scrawls -- 1418380 16984
\_ ruby bin/scrawls -- 1485972 17044
\_ ruby bin/scrawls -- 1553564 17040
\_ ruby bin/scrawls -- 1621156 17068
\_ ruby bin/scrawls -- 1688748 17096
\_ ruby bin/scrawls -- 1756340 17124
\_ ruby bin/scrawls -- 1824080 17184
\_ ruby bin/scrawls -- 1891672 17180
\_ ruby bin/scrawls -- 1959264 17208
\_ ruby bin/scrawls -- 2026856 17236
\_ ruby bin/scrawls -- 2094448 17296
Multiprocessing
Simple Blocking Server
Copy On Write
The Abbreviated Version
When a process is forked, the OS shares memory pages between the parent and the child.
If the child doesn't change anything, those shared pages never have to be duplicated.
Pre-2.x MRI Rubies touched every object during the mark phase of garbage collection, forcing the OS to make private copies of those pages. As a result, forking was very expensive. Modern MRI Rubies (2.0+, with bitmap marking GC) behave much better.
Main Loop
Multithreading
A thread is the smallest sequence of instructions that can be managed independently by the scheduler. Multiple threads will share one process's memory.
Pros
- Easier to manage/load-balance in a single piece of software.
- Threads are lightweight, so resource usage is generally better.
- Can be very performant.
Cons
- Threading implementations vary a lot across Ruby implementations.
- Locking issues on shared resources can be complicated. i.e. Threads are difficult and it's easy to make mistakes!
Main Loop
Multithreaded Server
Programming with threads can easily be a talk all by itself. A few quick guides and tutorials:
Main Loop
Multithreaded Server
--- server_slow.rb 2016-05-01 16:08:31.422044736 -0400
+++ server_multithreaded.rb 2016-05-01 21:56:47.997815573 -0400
@@ -11,12 +11,14 @@
 def run( host = '0.0.0.0', port = '8080' )
   server = TCPServer.new( host, port )
 
-  while connection = server.accept
-    request = get_request connection
-    response = handle request
+  while con = server.accept
+    Thread.new( con ) do |connection|
+      request = get_request connection
+      response = handle request
 
-    connection.write response
-    connection.close
+      connection.write response
+      connection.close
+    end
   end
 end
A simple, naive implementation: a new thread for every request, assuming everything else just works.
Main Loop
Multithreaded Server
scrawls --ioengine multithread
Multithreaded IO, single process
ab -n 100000 -c 100 http://127.0.0.1:8080/test.txt
Document Path: /test.txt
Document Length: 1078 bytes
Concurrency Level: 100
Time taken for tests: 33.832 seconds
Complete requests: 100000
Failed requests: 0
Total transferred: 126000000 bytes
HTML transferred: 107800000 bytes
Requests per second: 2955.78 [#/sec] (mean)
Time per request: 33.832 [ms] (mean)
Time per request: 0.338 [ms] (mean, across all concurrent requests)
Transfer rate: 3637.00 [Kbytes/sec] received
Main Loop
Multithreaded Server
scrawls --ioengine multithread
Multithreaded IO, single process
ab -n 100000 -c 100 http://127.0.0.1:8080/test.txt
Requests per second: 2955.78
For fast requests, the overhead of a thread per request hurts speed on Ruby 2.3.1.
A more sophisticated implementation would use a fixed size pool of threads. The scrawls-ioengine-multithread gem doesn't support thread pools yet, though.
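For illustration, a fixed-size pool could look something like the hypothetical sketch below; this is not the scrawls-ioengine-multithread code, and the pool size, port, and canned response are made up.

```ruby
require 'socket'

POOL_SIZE = 16
jobs   = Queue.new                        # thread-safe queue of accepted connections
server = TCPServer.new('0.0.0.0', 8080)

POOL_SIZE.times do
  Thread.new do
    loop do
      connection = jobs.pop               # block until the acceptor hands over a connection
      connection.gets                     # read (and ignore) the request line
      connection.write "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nOK"
      connection.close
    end
  end
end

loop { jobs << server.accept }            # the main thread only accepts; workers do the rest
```

Because the threads are created once, fast requests no longer pay the per-request thread spawn cost.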
Main Loop
Multithreaded Server
scrawls --ioengine multithread --app slow.rb
Multithreaded IO, single process, slow app
ab -n 1000 -c 100 http://127.0.0.1:8080/slowtest.txt
Document Path: /slowtest.txt
Document Length: 16 bytes
Concurrency Level: 100
Time taken for tests: 10.073 seconds
Complete requests: 1000
Failed requests: 0
Non-2xx responses: 1000
Total transferred: 44000 bytes
HTML transferred: 16000 bytes
Requests per second: 99.28 [#/sec] (mean)
Time per request: 1007.302 [ms] (mean)
Time per request: 10.073 [ms] (mean, across all concurrent requests)
Transfer rate: 4.27 [Kbytes/sec] received
Main Loop
Multithreaded Server
Concurrent Requests | Requests per Second |
---|---|
10 | 9.98 |
20 | 19.95 |
50 | 49.79 |
100 | 99.28 |
200 | 196.65 |
1000 | 778.03 |
Slow requests scale pretty well with threads. There are diminishing returns as the thread count gets high without a thread pool, but it's not bad for such a trivial implementation.
Main Loop
Multithreaded Server
scrawls --ioengine multithread --processes 8
Multithreaded IO, multiple processes
ab -n 100000 -c 100 http://127.0.0.1:8080/test.txt
Document Path: /test.txt
Document Length: 1078 bytes
Concurrency Level: 100
Time taken for tests: 10.495 seconds
Complete requests: 100000
Failed requests: 0
Total transferred: 126000000 bytes
HTML transferred: 107800000 bytes
Requests per second: 9528.76 [#/sec] (mean)
Time per request: 10.495 [ms] (mean)
Time per request: 0.105 [ms] (mean, across all concurrent requests)
Transfer rate: 11724.84 [Kbytes/sec] received
Main Loop
Multithreaded Server
scrawls --ioengine multithread --processes 8
Multithreaded IO, multiple processes
ab -n 100000 -c 100 http://127.0.0.1:8080/test.txt
Spreading requests across multiple processes, each of which is multithreading, mitigates some of the losses from the Global Interpreter Lock within a single process on Ruby 2.3.1.
Main Loop
Multithreaded Server
scrawls --ioengine multithread --processes 8 --app slow.rb
Multithreaded IO, multiple processes, slow app
ab -n 10000 -c 1000 http://127.0.0.1:8080/slowtest.txt
Document Path: /slowtest.txt
Document Length: 16 bytes
Concurrency Level: 1000
Time taken for tests: 11.199 seconds
Complete requests: 10000
Failed requests: 0
Non-2xx responses: 10000
Total transferred: 440000 bytes
HTML transferred: 160000 bytes
Requests per second: 892.95 [#/sec] (mean)
Time per request: 1119.888 [ms] (mean)
Time per request: 1.120 [ms] (mean, across all concurrent requests)
Transfer rate: 38.37 [Kbytes/sec] received
Main Loop
Event Driven Server
"Event Driven" is a vague label, encompassing numerous patterns and feature sets. One of the most common of these patterns is the Reactor pattern.
The Reactor pattern describes a system that handles asynchronous events, but that does so with synchronous event callbacks.
Main Loop
Event Driven Server
Client/Server interactions are often slow, but most of that time is spent waiting on latencies. CPUs are fast. The rest of the world is pretty slow.
Main Loop
Event Driven Server
An event reactor just spins in a loop, waiting for something to happen - such as a network connection, or data to read or to write.
When it does, an event is triggered to deal with it.
Events block the reactor.
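In pure Ruby, the core of such a reactor can be sketched with IO.select. This is a toy illustration of the pattern under those assumptions, not SimpleReactor or EventMachine; real reactors add epoll/kqueue, buffering, and timers.

```ruby
require 'socket'

server  = TCPServer.new('0.0.0.0', 8080)
clients = []

loop do
  # Wait until the listening socket or any client has something for us.
  readable, = IO.select([server] + clients)
  readable.each do |io|
    if io == server
      clients << server.accept                 # "connection" event
    else
      request = io.gets                        # "readable" event; the callback runs synchronously
      io.write "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nOK" if request
      io.close
      clients.delete(io)
    end
  end
end
```

Everything inside the loop runs synchronously, which is exactly why a slow callback stalls every other connection.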
Main Loop
Event Driven Server
Pros
- Can be very fast and resource friendly.
- With an appropriate underlying event notification facility, can scale to thousands of simultaneous connections.
Cons
- Slow callbacks block the reactor.
- Callback structures often used with evented programming can be confusing and difficult to test and debug.
Main Loop
Event Driven Server
Like Threading, this could easily be a talk all by itself. A few resources for further reading:
Main Loop
Event Driven Server
Many ways to do it, including EventMachine, Celluloid::IO, or even a simple pure Ruby event framework (SimpleReactor).
Currently, there is a simple event-based IO Engine for Scrawls, which is based on SimpleReactor, a pure Ruby Reactor implementation.
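For comparison, the same "canned response" server written against EventMachine looks roughly like this sketch; the handler module name and response are placeholders, not Scrawls code.

```ruby
require 'eventmachine'

module CannedResponse
  # Called by the reactor whenever bytes arrive on this connection.
  # A real server would feed 'data' to an HTTP parser and respond only
  # once a complete request has been received.
  def receive_data(data)
    send_data "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nOK"
    close_connection_after_writing
  end
end

EventMachine.run do
  EventMachine.start_server('0.0.0.0', 8080, CannedResponse)
end
```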
Main Loop
Event Driven Server
scrawls --ioengine simplereactor
Evented IO, single process
ab -n 100000 -c 100 http://127.0.0.1:8080/test.txt
Document Path: /test.txt
Document Length: 1078 bytes
Concurrency Level: 100
Time taken for tests: 23.264 seconds
Complete requests: 100000
Failed requests: 0
Total transferred: 126000000 bytes
HTML transferred: 107800000 bytes
Requests per second: 4298.41 [#/sec] (mean)
Time per request: 23.264 [ms] (mean)
Time per request: 0.233 [ms] (mean, across all concurrent requests)
Transfer rate: 5289.06 [Kbytes/sec] received
Main Loop
Event Driven Server
scrawls --ioengine simplereactor --app slow.rb
Evented IO, single process, slow app
ab -n 20 -c 10 http://127.0.0.1:8080/slowtest.txt
Document Path: /slowtest.txt
Document Length: 16 bytes
Concurrency Level: 10
Time taken for tests: 20.017 seconds
Complete requests: 20
Failed requests: 0
Non-2xx responses: 20
Total transferred: 880 bytes
HTML transferred: 320 bytes
Requests per second: 1.00 [#/sec] (mean)
Time per request: 10008.391 [ms] (mean)
Time per request: 1000.839 [ms] (mean, across all concurrent requests)
Transfer rate: 0.04 [Kbytes/sec] received
Main Loop
Event Driven Server
It looks a lot like a blocking single threaded server.
However, where events shine is in dealing with high communications latencies.
They don't block on IO, so an efficient reactor can service large numbers of high latency connections efficiently.
Main Loop
Event Driven Server
Evented IO handling keeps things happy at high concurrencies, even across the country. For example, this is from an Engine Yard AWS Oregon instance talking to a non-AWS VM on the east coast.
Document Path: /server.rb
Document Length: 1844 bytes
Concurrency Level: 1000
Time taken for tests: 3.394 seconds
Complete requests: 10000
Failed requests: 0
Write errors: 0
Total transferred: 19907832 bytes
HTML transferred: 18465816 bytes
Requests per second: 2946.48 [#/sec] (mean)
Time per request: 339.387 [ms] (mean)
Time per request: 0.339 [ms] (mean, across all concurrent requests)
Transfer rate: 5728.33 [Kbytes/sec] received
Main Loop
Event Driven Server
Concurrent Requests | Requests per Second |
---|---|
10 | 3888 |
50 | 3814 |
250 | 3651 |
1000 | 3609 |
2000 | 3261 |
10000 | 2740 |
And if your event notification framework is up to it, concurrency can scale nicely.
Fast Responses
Main Loop
Event Driven Server - Hybridization!
Main Loop
Event Driven Server - Hybridization!
Evented IO can also mix well with threading, letting the slow stuff be slow, while not blocking the reactor from spinning and servicing things that are ready to be serviced.
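As a rough sketch of that hybrid (hypothetical, not the Scrawls engine): the select loop keeps accepting and reading, while slow work is handed to a small pool of threads that write the responses when they finish.

```ruby
require 'socket'

server  = TCPServer.new('0.0.0.0', 8080)
slow    = Queue.new
clients = []

# A small pool of worker threads for the slow stuff.
4.times do
  Thread.new do
    loop do
      connection, _request = slow.pop
      sleep 1                                  # stand-in for a slow app call
      connection.write "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nOK"
      connection.close
    end
  end
end

# The reactor never does the slow work itself, so it keeps spinning.
loop do
  readable, = IO.select([server] + clients)
  readable.each do |io|
    if io == server
      clients << server.accept
    else
      clients.delete(io)
      slow << [io, io.gets]                    # hand the connection and request off to the pool
    end
  end
end
```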
Main Loop
Event Driven Server
Concurrent Requests | Requests per Second |
---|---|
10 | 8 |
50 | 30 |
250 | 181 |
500 | 340 |
1000 | 511 |
2000 | 854 |
10000 | 713 |
Slow (1 second delayed) Responses
Main Loop
Event Driven Server
Reactor/Event IO combined with threading is a great combination, if you are willing to deal with the complexity of implementation.
Even a pure Ruby reactor on Ruby 2.3 is pretty fast.
Parsing HTTP - A Quick Note
The very simple examples so far have cheated with this.
HTTPRecognizer is being used to handle HTTP. It uses regular expressions. Going too far down that path will drive you to madness and sorrow.
Parsing HTTP
You can't parse [X]HTML with regex. Because HTML can't be parsed by regex. Regex is not a tool that can be used to correctly parse HTML. As I have answered in HTML-and-regex questions here so many times before, the use of regex will not allow you to consume HTML. Regular expressions are a tool that is insufficiently sophisticated to understand the constructs employed by HTML. HTML is not a regular language and hence cannot be parsed by regular expressions. Regex queries are not equipped to break down HTML into its meaningful parts. so many times but it is not getting to me. Even enhanced irregular regular expressions as used by Perl are not up to the task of parsing HTML. You will never make me crack. HTML is a language of sufficient complexity that it cannot be parsed by regular expressions. Even Jon Skeet cannot parse HTML using regular expressions. Every time you attempt to parse HTML with regular expressions, the unholy child weeps the blood of virgins, and Russian hackers pwn your webapp. Parsing HTML with regex summons tainted souls into the realm of the living. HTML and regex go together like love, marriage, and ritual infanticide. The <center> cannot hold it is too late. The force of regex and HTML together in the same conceptual space will destroy your mind like so much watery putty. If you parse HTML with regex you are giving in to Them and their blasphemous ways which doom us all to inhuman toil for the One whose Name cannot be expressed in the Basic Multilingual Plane, he comes. HTML-plus-regexp will liquify the nerves of the sentient whilst you observe, your psyche withering in the onslaught of horror. Rege̿̔̉x-based HTML parsers are the cancer that is killing StackOverflow it is too late it is too late we cannot be saved the trangession of a chi͡ld ensures regex will consume all living tissue (except for HTML which it cannot, as previously prophesied) dear lord help us how can anyone survive this scourge using regex to parse HTML has doomed humanity to an eternity of dread torture and security holes using regex as a tool to process HTML establishes a breach between this world and the dread realm of c͒ͪo͛ͫrrupt entities (like SGML entities, but more corrupt) a mere glimpse of the world of regex parsers for HTML will instantly transport a programmer's consciousness into a world of ceaseless screaming, he comes, the pestilent slithy regex-infection will devour your HTML parser, application and existence for all time like Visual Basic only worse he comes he comes do not fight he com̡e̶s, ̕h̵is un̨ho͞ly radiańcé destro҉ying all enli̍̈́̂̈́ghtenment, HTML tags lea͠ki̧n͘g fr̶ǫm ̡yo͟ur eye͢s̸ ̛l̕ik͏e liquid pain, the song of re̸gular expression parsing will extinguish the voices of mortal man from the sphere I can see it can you see ̲͚̖͔̙î̩́t̲͎̩̱͔́̋̀ it is beautiful the final snuf
fing of the lies of Man ALL IS LOŚ͖̩͇̗̪̏̈́T ALL IS LOST the pon̷y he comes he c̶̮omes he comes the ichor permeates all MY FACE MY FACE ᵒh god no NO NOO̼OO NΘ stop the an*̶͑̾̾̅ͫ͏̙̤g͇̫͛͆̾ͫ̑͆l͖͉̗̩̳̟̍ͫͥͨe̠̅s ͎a̧͈͖r̽̾̈́͒͑e
not rè̑ͧ̌aͨl̘̝̙̃ͤ͂̾̆ ZA̡͊͠͝LGΌ ISͮ̂҉̯͈͕̹̘̱ TO͇̹̺ͅƝ̴ȳ̳ TH̘Ë͖́̉ ͠P̯͍̭O̚N̐Y̡ H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ
With Regular Expressions
Parsing HTTP
With Regular Expressions
For limited, specific pieces of information, you can get away with it. People may hate you, but you can do it. Future sanity is not guaranteed, however.
HTTP is so complicated that this approach is madness for most purposes. Use a real parser.
Scrawls doesn't have an HTTP engine built on a real parser yet. Sorry.
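As a small example of leaning on a real parser instead of regexes, Ruby's bundled WEBrick can parse a raw request for you. This is just an illustration of the idea; the raw request string is made up, and Scrawls itself does not ship such an engine yet.

```ruby
require 'webrick'
require 'stringio'

raw = "GET /test.txt HTTP/1.1\r\nHost: example.com\r\nAccept: */*\r\n\r\n"

request = WEBrick::HTTPRequest.new(WEBrick::Config::HTTP)
request.parse(StringIO.new(raw))   # an actual socket works here, too

request.request_method   # => "GET"
request.path             # => "/test.txt"
request['Host']          # => "example.com"
```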
TL;DR
Blocking Single Threaded
++ Simple to implement.
-- Slow CPU or IO blocks the server.
-- Potentially underutilizes resources.

Event Driven
++ The reactor doesn't block on IO.
++ An event framework like EventMachine can permit very high concurrent connection counts.
-- CPU bound tasks block the reactor, leading to results similar to a simple blocking single threaded server.

Multiprocess
++ Easy to fork to expand capacity.
++ With sufficient resources, can be fast enough, even for slow, blocking tasks.
-- Increased RAM footprint for multiple processes.
-- Can be a hassle to manage pools of processes.

Multithreaded
++ Easy to spawn new threads.
++ Can be fast, particularly for slow actions.
-- Potential for performance losses for fast requests.
-- With complex actions, threading can be difficult to get right when shared resources are involved.
Ruby Web Servers
A Quick Survey
- WEBrick
- Mongrel
- Thin
- Puma
- Passenger
- Unicorn
- ServerEngine
- Rainbows!
- Yahns
- Goliath
- Swiftiply
WEBrick
• Pure Ruby
• Thread based design
• Written in 2000 by Masayoshi Takahashi and Yuuzou Gotou.
• Ubiquitous, as it is bundled with Ruby itself.
• Flexible, fairly featureful, and easy to use.
• Fairly well documented.
• Fairly slow, but maybe fast enough...
ab -n 10000 -c 4 http://127.0.0.1:8080/test.txt
Requests per second: 450.30 [#/sec] (mean)
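For reference, standing up WEBrick directly (rather than through `ruby -run -e httpd`) takes only a few lines; the port and document root below are arbitrary choices for the sketch.

```ruby
require 'webrick'

server = WEBrick::HTTPServer.new(Port: 8080, DocumentRoot: '.')
trap('INT') { server.shutdown }   # Ctrl-C shuts the server down cleanly
server.start
```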
Mongrel
• Zed Shaw, 2006
• Ruby plus a C extension for parsing HTTP, built with Ragel
• Thread based
• Moderately fast for its age
• EOL at 1.1.5 in a version that doesn't work with modern Rubies
• There is a 1.2.x version (gem install --pre mongrel) that does work with modern Rubies
• Completely unmaintained, but it is interesting code to look at and learn from. It inspired several other Ruby web servers.
Swiftiply
• My baby. May 2007
• Event Driven via EventMachine
• Originally used regex for HTTP parsing....
• Structurally has support for real parsing, but needs work
• Intended to be a load balancing reverse proxy with a twist, with light web serving capabilities (static files, mostly)
• Very fast
• AFAIK, at least 100 production sites still use it
• I take pull requests. :)
Thin
• Marc-André Cournoyer, 2008
• Mongrel HTTP Parser
• Event Driven via EventMachine
• Rack interface
• Pretty fast. Still pretty commonly used
Goliath
• PostRank Labs, 2011
• Event Driven via EventMachine
• Completely asynchronous design, leveraging Ruby fibers to unwind callback complexity
• Performance and resource focused
• Niche usage; interesting use of Fibers
Puma
• Evan Phoenix, 2011
• Built on the bones of Mongrel
• Built from the ground up with concurrency in mind
-- This means threaded with a thread pool
• Rack
• Runs on all major Ruby implementations (MRI, JRuby, Rubinius)
• If you use Thin, take a look at Puma
Passenger
• Phusion Passenger
• Heavy Rails world usage
• Directly integrates into Apache or Nginx
• New version includes a fast purpose built web server, Raptor
• Rack
• Commercial version has more multithreading/concurrency options than the open source version does.
-- Open source is single threaded multiprocess only
-- Commercial version can be configured for multithreaded multiprocess concurrency.
Unicorn
• Built on Mongrel's bones by Eric Wong circa 2009
• fork() (i.e. built in multiprocessing) oriented
• Not the fastest, but it deals well with slow/variable requests
• Very mature, and very heavy utilization in the Rails world
• With modern copy-on-write friendly Rubies, can have reasonably nice resource utilization
Rainbows!
• Unicorn specifically tuned for those very big, very slow requests
Yahns
• Another in the Unicorn family
• Tuned for apps that receive very little traffic
• When idle, it is truly idle
• Very sensitive to app failures
ServerEngine
• Everything said about Unicorn is basically applicable here
• Works on Windows and JRuby, whereas Unicorn does not
• Multiprocess design
• Ritta Narita (@narittan) gave a nice talk about it yesterday
http://rubykaigi.org/2016/presentations/narittan.html
Thank You!
Let's build some stuff. Ask me questions. Tell me what you want to know. If you ask me a question, you get an Engine Yard T-Shirt!
Thanks to Engine Yard
Your Trusted DevOps.
Over 1 Billion AWS Hours.
With Engine Yard Cloud, do what you do best—developing Ruby on Rails, Node.js, or PHP apps—while we do what we do best—ensuring your environment runs smoothly. You can be as hands on or hands off with AWS as you want.
Start developing your apps on our AWS account or yours for free today.
We’ve deployed it all....