Richard Whaling
Spantree Technology Group
(or how to get things done without the JVM)
Twitter: @RichardWhaling
https://spantree.net/blog/
object Hello {
def main(args: Array[String]):Unit = {
println("Hello, CASE!")
}
}
This just works!
type Vec = CStruct3[Double, Double, Double]
val vec:Ptr[Vec] = stackalloc[Vec]
!vec._1 = 10.0 // initialize fields
!vec._2 = 20.0
!vec._3 = 30.0
length(vec) // pass by reference
@extern object stdlib {
def malloc(size: CSize): Ptr[Byte] = extern
def free(ptr: Ptr[Byte]): Unit = extern // C's free() returns void
}
val ptr = stdlib.malloc(32)
stdlib.free(ptr)
What can we do?
What can't we do?
A typical server:
The catch: it has to do all of these at the same time...
with (traditionally) blocking system calls.
socket() -- initializes a new socket and selects protocol
bind() -- assigns an address and port to a socket
listen() -- begins accepting incoming connections on a bound socket
accept() -- takes an incoming connection off the OS backlog
connect() -- initiates an outgoing connection on an unbound socket
read()/recv()/recvmsg() -- reads bytes from a connected socket
write()/send()/sendmsg() -- writes bytes to a connected socket
close() -- closes a connected socket
ioctl()/setsockopt()/fcntl() -- evil
Server: socket() -> bind() -> listen() -> accept() -> read() -> write() -> close()
Client: socket() -> connect() -> write() -> read() -> close()
def serve(port:UShort): Unit = {
// Allocate and initialize address struct
val addr_size = sizeof[sockaddr_in]
val server_address = malloc(addr_size).cast[Ptr[sockaddr_in]]
!server_address._1 = AF_INET.toUShort // IP Socket
!server_address._2 = htons(port) // port
!server_address._3._1 = INADDR_ANY // bind to 0.0.0.0
// Bind and listen on socket
val sock_fd = socket(AF_INET, SOCK_STREAM, 0) // SOCK_STREAM indicates TCP and not UDP
val bind_result = bind(sock_fd, server_address.cast[Ptr[sockaddr]], addr_size.toUInt)
println(s"bind returned $bind_result")
val listen_result = listen(sock_fd, 128)
println(s"listen returned $listen_result")
// Allocate and initialize client address struct
val client_address = malloc(addr_size).cast[Ptr[sockaddr_in]]
val client_addr_size = stackalloc[UInt]
!client_addr_size = addr_size.toUInt
// Main accept() loop
while (true) {
val conn_fd = accept(sock_fd, client_address.cast[Ptr[sockaddr]], client_addr_size)
println(s"accept returned fd $conn_fd")
handleConnection(conn_fd)
}
close(sock_fd)
}
def handleConnection(conn_fd:Int, max_size:Int = 1024): Unit = {
  val line_buffer = malloc(max_size + 1) // one extra byte for the string-end marker
  while (true) {
    val read_result = read(conn_fd, line_buffer, max_size)
    println(s"read $read_result bytes")
    if (read_result <= 0) { // 0 is EOF, a negative value is an error
      free(line_buffer)
      close(conn_fd)
      return
    }
    line_buffer(read_result) = 0 // Append a string-end marker
    val write_result = write(conn_fd, line_buffer, read_result)
    println(s"wrote $write_result bytes")
  }
}
fork() clones a process in-place
one process calls, two return
parent-child relationship
parent is responsible for supervising the child
if a child exits, it stays as a "zombie" until "reaped"
Server: socket() -> bind() -> listen() -> accept() -> fork() -> read() -> write() -> close()
Client: socket() -> connect() -> write() -> read() -> close()
def handleConnection(conn_fd:Int, max_size:Int = 1024): Unit = {
  val pid = fork()
  if (pid != 0) {
    // In parent process (pid is the child's pid, or -1 if fork() failed)
    println(s"forked pid $pid to handle connection")
    close(conn_fd)
    return
  } else {
    // In child process
    println(s"fork returned $pid, in child process")
    val line_buffer = malloc(max_size + 1) // one extra byte for the string-end marker
    while (true) {
      val read_result = read(conn_fd, line_buffer, max_size)
      println(s"read $read_result bytes")
      if (read_result <= 0) { // 0 is EOF, a negative value is an error
        // Cleanup
        close(conn_fd)
        sys.exit()
      }
      line_buffer(read_result) = 0
      val write_result = write(conn_fd, line_buffer, read_result)
      println(s"wrote $write_result bytes")
    }
  }
}
Peter H. Salus,
from A Quarter Century of UNIX
Conjecture: HTTP is a solved problem.
What is the simplest way to use a stable HTTP server for a Scala Native app?
How does it perform?
Actually a family of 6 very similar functions
Executes a brand-new program - cannot return
Can set arguments and environment variables
New program inherits open file descriptors
Server: socket() -> bind() -> listen() -> accept() -> fork() -> exec() -> ?
Client: socket() -> connect() -> write() -> read() -> close()
def handleConnectionExec(conn_fd:Int, path:CString, args:Ptr[CString]): Unit = {
  val pid = fork()
  if (pid != 0) {
    println(s"forked pid $pid to handle connection")
    close(conn_fd)
    return
  } else {
    println(s"fork returned $pid, in child process")
    execv(path, args)
    // execv only returns if it failed; do not fall back into the accept loop
    sys.exit(1)
  }
}
object Main {
def main(args: Array[String]): Unit = {
println("Content-type: text/html\r\n\r\n")
println("Hello, Strangeloop!")
}
}
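A CGI program only has to write header lines, a blank line, and a body to standard output; the request arrives through environment variables. The following is a minimal sketch of that contract in plain Scala, not code from Dinosaur. The names `CgiSketch`, `render`, and `cgiVar` are hypothetical, and the CRLF line endings follow the example above.

```scala
object CgiSketch {
  // Render a CGI response: header lines, one blank line, then the body.
  // Assumption: CRLF line endings, matching the hello-world example.
  def render(headers: Seq[(String, String)], body: String): String = {
    val headerLines = headers.map { case (k, v) => s"$k: $v\r\n" }.mkString
    headerLines + "\r\n" + body
  }

  // Look up a CGI variable in an environment map, defaulting to "".
  def cgiVar(env: Map[String, String], name: String): String =
    env.getOrElse(name, "")

  def main(args: Array[String]): Unit = {
    // The web server sets QUERY_STRING before exec()-ing the program.
    val query = cgiVar(sys.env, "QUERY_STRING")
    print(render(Seq("Content-type" -> "text/html"), s"Query was: $query"))
  }
}
```

Taking the environment as a `Map` keeps the lookup testable without touching the real process environment.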
# notice the FROM - AS structure
FROM scala-native-base-build AS build
# Set up the directory structure for our project
RUN mkdir -p /root/project-build/project
WORKDIR /root/project-build
# Resolve all our dependencies and plugins to speed up future compilations
ADD ./project/plugins.sbt project/
ADD ./project/build.properties project/
ADD build.sbt .
RUN sbt update
# Add and compile our actual application source code
ADD . /root/project-build/
RUN sbt clean nativeLink
# Copy the binary executable to a consistent location
RUN cp ./target/scala-2.11/*-out ./dinosaur-build-out
# Start over from a clean Alpine image, in the same Dockerfile
FROM alpine:3.3
# Copy in C libraries from previous build
COPY --from=build \
/usr/lib/libunwind.so.8 \
/usr/lib/libunwind-x86_64.so.8 \
/usr/lib/libgc.so.1 \
/usr/lib/libstdc++.so.6 \
/usr/lib/libgcc_s.so.1 \
/usr/lib/
COPY --from=build \
/usr/local/lib/libre2.so.0 \
/usr/local/lib/libre2.so.0
# Copy in the executable
COPY --from=build \
/root/project-build/dinosaur-build-out /var/www/localhost/cgi-bin/app
COPY httpd.conf /etc/apache2/httpd.conf
COPY mpm.conf /etc/apache2/mpm.conf
RUN apk --update add apache2 apache2-utils
RUN mkdir -p /run/apache2
ADD apache.entrypoint.sh /root/
ENTRYPOINT "/root/apache.entrypoint.sh"
Does it work?
object main {
def main(args: Array[String]): Unit = {
Router.init()
.get("/")("<H1>Welcome to Dinosaur!</H1>")
.get("/hello") { request =>
"Hello World!"
}
.get("/who")( request =>
request.pathInfo() match {
case Seq("who") => "Who's there?"
case Seq("who",x) => "Hello, " + x
case Seq("who",x,y) => "Hello both of you"
case _ => "Hello y'all!"
}
)
.get("/bye")( request =>
request.params("who")
.map { x => "Bye, " + x }
.mkString(". ")
)
.dispatch()
}
}
trait Router {
def handle(method: Method, path:String)(f: Request => Response):Router
def get(path:String)(f: Request => Response):Router = handle(GET, path)(f)
def post(path:String)(f: Request => Response):Router = handle(POST, path)(f)
def put(path:String)(f: Request => Response):Router = handle(PUT, path)(f)
def delete(path:String)(f: Request => Response):Router = handle(DELETE, path)(f)
def dispatch(): Unit
}
case class Request(
method: Function0[Method],
pathInfo: Function0[Seq[String]],
params: Function1[String, Seq[String]]
)
case class Response(
body: ResponseBody,
statusCode: Int = 200,
headers: Map[String, String] = Map("Content-type" -> "text/html; charset=utf-8")
)
object CgiUtils {
def env(key: CString): String = {
val lookup = stdlib.getenv(key)
if (lookup == null) {
""
} else {
fromCString(lookup)
}
}
def parsePathInfo(pathInfo: String): Seq[String] = {
pathInfo.split("/").filter( _ != "" )
}
def parseQueryString(queryString: String): Function1[String, Seq[String]] = {
val pairs = queryString.split("&").map( pair =>
pair.split("=", 2) match {
case Array(key, value) => (key,value)
case Array(key) => (key, "") // a key with no "=value" part
}
).groupBy(_._1).toSeq
val groupedValues = for ( (k,v) <- pairs;
values = v.toSeq.map(_._2) )
yield (k -> values)
return groupedValues.toMap.getOrElse(_,Seq.empty)
}
}
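To make the intended behavior of the two parsers concrete, here is a small stand-alone sketch in plain Scala. It is an illustration under assumptions, not the Dinosaur implementation: `QuerySketch`, `parse`, and `pathSegments` are hypothetical names, it does no URL decoding, and it skips empty segments and tolerates keys without a value.

```scala
object QuerySketch {
  // Split "a=1&b=2&a=3" into Map(a -> Seq(1, 3), b -> Seq(2)).
  // A key without "=" maps to an empty value; empty segments are skipped.
  def parse(queryString: String): Map[String, Seq[String]] = {
    val pairs: Seq[(String, String)] = queryString.split("&").toSeq
      .filter(_.nonEmpty)
      .map { pair =>
        pair.split("=", 2) match {
          case Array(key, value) => (key, value)
          case Array(key)        => (key, "")
        }
      }
    // Group by key, keeping values in their original order.
    pairs.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2) }
  }

  // Split "/who/alice/" into Seq("who", "alice").
  def pathSegments(pathInfo: String): Seq[String] =
    pathInfo.split("/").toSeq.filter(_.nonEmpty)
}
```

A lookup with a default, such as `QuerySketch.parse(q).getOrElse("who", Seq.empty)`, gives the same "missing key means empty list" behavior that the router's `params` function relies on.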
case class CGIRouter(handlers:Seq[Handler]) extends Router {
def dispatch(): Unit = {
val request = Router.parseRequest()
val matches = for ( h @ Handler(method, pattern, handler) <- this.handlers
if request.method() == method
if request.pathInfo().startsWith(pattern)) yield h
val bestHandler = matches.maxBy( _.pattern.size )
val response = bestHandler.handler(request)
for ( (k,v) <- response.inferHeaders ) {
System.out.println(k + ": " + v)
}
System.out.println()
System.out.println(response.bodyToString)
}
}
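The handler selection in dispatch() is a longest-prefix match over path segments. A minimal sketch of that rule on its own, using a hypothetical `Route` type with plain strings in place of the `Handler` and `Method` types above; unlike dispatch(), it returns `None` rather than throwing when nothing matches.

```scala
object RouteSketch {
  // Hypothetical stand-in for Handler: a method, a path prefix, and a label.
  final case class Route(method: String, pattern: Seq[String], name: String)

  // Pick the matching route with the longest pattern, or None if nothing matches.
  def best(routes: Seq[Route], method: String, path: Seq[String]): Option[Route] = {
    val matches = routes.filter(r => r.method == method && path.startsWith(r.pattern))
    if (matches.isEmpty) None else Some(matches.maxBy(_.pattern.size))
  }
}
```

Returning an `Option` leaves room for the caller to produce a 404 response, which is one possible way to handle the empty case that `maxBy` alone would turn into an exception.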
40 ms mean response with 10 users
99th percentile response goes over 1s at 150 users
mean response plateaus around 500 ms at 300 users
peaks around 400 requests/sec
Compared to a python-based CGI app, which exhibits:
136 ms mean response with 10 users
99th percentile response goes over 1s at 75 users
mean response plateaus around 5s at 250 users
peaks around 200 requests/sec
But compared to a trivial node.js/Express app:
median response 7 ms with 10 users
99th percentile stays under 1s up to 2000 users
error rate approaches 15% around 500 users
peaks around 2000 requests/sec
How can we do better?
How can we do better without spending years of our lives?
Two prominent examples:
HTTP Clients -> (HTTP) -> Web Server -> (FastCGI) -> FastCGI Application
What makes FastCGI different from regular CGI?
One catch -- we need a socket.
But do we need concurrency?
Parsing algorithm:
Read 8 byte header from socket
Extract type, Request ID, length, padding from header
Read (length + padding) bytes from socket
if (type == FCGI_STDIN & length == 0):
request is complete, invoke handler and write response
else:
append to pending buffers for Request ID
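The steps above can be sketched over an in-memory byte array instead of a socket. This is an illustration in plain Scala, not the Dinosaur code: `FcgiFraming`, `Record`, `split`, and `completedRequests` are hypothetical names, and the sketch assumes the buffer holds only whole records.

```scala
object FcgiFraming {
  val FCGI_STDIN = 5

  final case class Record(recType: Int, requestId: Int, content: Array[Byte])

  // Split a buffer of back-to-back FastCGI records into Record values.
  // Header layout (8 bytes): version, type, request ID (2 bytes, big-endian),
  // content length (2 bytes, big-endian), padding length, reserved.
  // Assumption: the buffer contains only complete records.
  def split(buf: Array[Byte]): List[Record] = {
    var pos = 0
    val out = List.newBuilder[Record]
    while (pos + 8 <= buf.length) {
      val recType = buf(pos + 1) & 0xFF
      val reqId   = ((buf(pos + 2) & 0xFF) << 8) | (buf(pos + 3) & 0xFF)
      val length  = ((buf(pos + 4) & 0xFF) << 8) | (buf(pos + 5) & 0xFF)
      val padding = buf(pos + 6) & 0xFF
      out += Record(recType, reqId, buf.slice(pos + 8, pos + 8 + length))
      pos += 8 + length + padding // skip header, content, and padding
    }
    out.result()
  }

  // A request is complete once an empty FCGI_STDIN record arrives for its ID.
  def completedRequests(records: Seq[Record]): Set[Int] =
    records.filter(r => r.recType == FCGI_STDIN && r.content.isEmpty)
      .map(_.requestId).toSet
}
```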
def readHeader(input: Ptr[Byte], offset:Long): RecordHeader = {
val version = input(0 + offset) & 0xFF
val rec_type = (input(1 + offset) & 0xFF) match {
case 0 => FCGI_UNKNOWN_TYPE
case 1 => FCGI_BEGIN_REQUEST
case 2 => FCGI_ABORT_REQUEST
case 3 => FCGI_END_REQUEST
case 4 => FCGI_PARAMS
case 5 => FCGI_STDIN
case 6 => FCGI_STDOUT
case 7 => FCGI_STDERR
case 8 => FCGI_DATA
case 9 => FCGI_GET_VALUES
case 10 => FCGI_GET_VALUES_RESULT
case _ => FCGI_UNKNOWN_TYPE
}
val req_id_b1 = (input(2 + offset) & 0xFF)
val req_id_b0 = (input(3 + offset) & 0xFF)
val req_id = (req_id_b1 << 8) + (req_id_b0 & 0xFF)
val length = ((input(4 + offset) & 0xFF) << 8) + (input(5 + offset) & 0xFF)
val padding = input(6 + offset) & 0xFF
RecordHeader(version,rec_type,req_id,length,padding)
}
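Writing a response needs the inverse of readHeader: packing the same five fields back into 8 bytes. A minimal sketch on `Array[Byte]` in plain Scala, with a matching decoder so the round trip can be checked. `FcgiHeader`, `Header`, `encode`, and `decode` are hypothetical names, not part of the code above.

```scala
object FcgiHeader {
  final case class Header(version: Int, recType: Int, requestId: Int,
                          length: Int, padding: Int)

  // Pack a header into 8 bytes; 16-bit fields are big-endian, last byte is reserved.
  def encode(h: Header): Array[Byte] = Array(
    h.version, h.recType,
    (h.requestId >> 8) & 0xFF, h.requestId & 0xFF,
    (h.length >> 8) & 0xFF, h.length & 0xFF,
    h.padding, 0
  ).map(_.toByte)

  // Unpack 8 bytes starting at offset; "& 0xFF" undoes the sign extension of Byte.
  def decode(b: Array[Byte], offset: Int = 0): Header = Header(
    b(offset) & 0xFF,
    b(offset + 1) & 0xFF,
    ((b(offset + 2) & 0xFF) << 8) | (b(offset + 3) & 0xFF),
    ((b(offset + 4) & 0xFF) << 8) | (b(offset + 5) & 0xFF),
    b(offset + 6) & 0xFF
  )
}
```

The `& 0xFF` mask matters because bytes are signed on this platform: a content length of 40000 has a high byte of 0x9C, which would read back as a negative number without it.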
def readParam(byteArray: Ptr[Byte], arr_offset:Long, length:Long)
: (Ptr[Byte], Ptr[Byte], Long) = {
val name_len_offset = arr_offset + 0
val (name_len:Long, val_len_offset:Long) =
if ((byteArray(name_len_offset) & 0x80) == 0) {
// High bit clear: the length fits in one byte
val len = byteArray(name_len_offset).toLong
(len, arr_offset + 1)
} else {
// High bit set: the length is 4 bytes, big-endian, with the high bit masked off
val len = (((byteArray(name_len_offset) & 0x7F) << 24) +
((byteArray(name_len_offset + 1) & 0xFF) << 16) +
((byteArray(name_len_offset + 2) & 0xFF) << 8) +
(byteArray(name_len_offset + 3) & 0xFF)).toLong
(len, arr_offset + 4)
}
val (val_len:Long, content_offset:Long) =
if ((byteArray(val_len_offset) & 0x80) == 0) {
// High bit clear: the length fits in one byte
val len = byteArray(val_len_offset).toLong
(len, val_len_offset + 1)
} else {
// High bit set: the length is 4 bytes, big-endian, with the high bit masked off
val len = (((byteArray(val_len_offset) & 0x7F) << 24) +
((byteArray(val_len_offset + 1) & 0xFF) << 16) +
((byteArray(val_len_offset + 2) & 0xFF) << 8) +
(byteArray(val_len_offset + 3) & 0xFF)).toLong
(len, val_len_offset + 4)
}
val name = byteArray + content_offset
val value = byteArray + content_offset + name_len
val next_param_offset = content_offset + name_len + val_len
(name, value, next_param_offset)
}
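The same name-value decoding can be sketched on `Array[Byte]`, looping until the FCGI_PARAMS content is used up and collecting the pairs into a map. This is an illustration in plain Scala rather than the pointer-based code above. `FcgiParams`, `readLength`, and `decode` are hypothetical names, and the sketch assumes names and values are valid UTF-8 and the buffer is well-formed.

```scala
object FcgiParams {
  // Read one length field: 1 byte if the high bit is clear, otherwise 4 bytes
  // big-endian with the high bit masked off. Returns (length, next offset).
  def readLength(b: Array[Byte], at: Int): (Int, Int) =
    if ((b(at) & 0x80) == 0) (b(at) & 0xFF, at + 1)
    else {
      val len = ((b(at) & 0x7F) << 24) | ((b(at + 1) & 0xFF) << 16) |
                ((b(at + 2) & 0xFF) << 8) | (b(at + 3) & 0xFF)
      (len, at + 4)
    }

  // Decode every name-value pair in the content of an FCGI_PARAMS record.
  def decode(b: Array[Byte]): Map[String, String] = {
    var pos = 0
    var out = Map.empty[String, String]
    while (pos < b.length) {
      val (nameLen, afterName) = readLength(b, pos)
      val (valueLen, start) = readLength(b, afterName)
      val name = new String(b, start, nameLen, "UTF-8")
      val value = new String(b, start + nameLen, valueLen, "UTF-8")
      out += (name -> value)
      pos = start + nameLen + valueLen // offset of the next pair
    }
    out
  }
}
```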
#!/bin/bash
rm -f /tmp/app.socket
rm -f /tmp/app.fifo
mkfifo /tmp/app.fifo
nginx -g "daemon off;" &
export ROUTER_MODE=FCGI
nc -l -U /tmp/app.socket < /tmp/app.fifo | /var/www/localhost/cgi-bin/dinosaur-build-out > /tmp/app.fifo
Data flow: Nginx <-> /tmp/app.socket <-> nc; nc -> app (pipe); app -> /tmp/app.fifo -> nc
(better option: write a proxy in ~80 lines of Go)
listener = setUpListeningSocket()
pollSet = set(listener)
while true:
readySockets = poll(pollSet)
for socket in readySockets:
if socket == listener:
newConnection = accept(listener)
pollSet.add(newConnection)
else:
if socket.readyToRead:
read(socket)
else if socket.readyToWrite:
write(socket)
libuv, the Node.js event loop:
@link("uv")
@extern
object LibUV {
type PipeHandle = Ptr[Byte]
type Loop = Ptr[Byte]
type Buffer = CStruct2[Ptr[Byte],CSize]
type WriteReq = Ptr[Ptr[Byte]]
type ShutdownReq = Ptr[Ptr[Byte]]
type Connection = Ptr[Byte]
type ConnectionCB = CFunctionPtr2[PipeHandle,Int,Unit]
type AllocCB = CFunctionPtr3[PipeHandle,CSize,Ptr[Buffer],Unit]
type ReadCB = CFunctionPtr3[PipeHandle,CSSize,Ptr[Buffer],Unit]
type WriteCB = CFunctionPtr2[WriteReq,Int,Unit]
type ShutdownCB = CFunctionPtr2[ShutdownReq,Int,Unit]
type CloseCB = CFunctionPtr1[PipeHandle,Unit]
def uv_default_loop(): Loop = extern
def uv_loop_size(): CSize = extern
def uv_handle_size(h_type:Int): CSize = extern
def uv_req_size(r_type:Int): CSize = extern
def uv_pipe_init(loop:Loop, handle:PipeHandle, ipcFlag:Int ): Int = extern
def uv_pipe_bind(handle:PipeHandle, socketName:CString): Int = extern
def uv_listen(handle:PipeHandle, backlog:Int, callback:ConnectionCB): Int = extern
def uv_accept(server:PipeHandle, client:PipeHandle): Int = extern
def uv_read_start(client:PipeHandle, allocCB:AllocCB, readCB:ReadCB): Int = extern
def uv_write(writeReq:WriteReq, client:PipeHandle, bufs: Ptr[Buffer], numBufs: Int, writeCB:WriteCB): Int = extern
def uv_read_stop(client:PipeHandle): Int = extern
def uv_shutdown(shutdownReq:ShutdownReq, client:PipeHandle, shutdownCB:ShutdownCB): Int = extern
def uv_close(handle:PipeHandle, closeCB: CloseCB): Unit = extern
def uv_run(loop:Loop, runMode:Int): Int = extern
}
def dispatch(): Unit = {
val loop:Loop = uv_default_loop()
val pipe_size = uv_handle_size(7)
val pipe:PipeHandle = stackalloc[Byte](pipe_size)
uv_pipe_init(loop, pipe, 0)
var r = uv_pipe_bind(pipe, c"/tmp/app.socket")
println(s"uv_pipe_bind returned $r")
r = uv_listen(pipe, 4096, onConnectCB)
println(s"uv_listen returned $r")
r = uv_run(loop, 0)
println(s"uv_run returned $r")
}
def onConnect(server:PipeHandle, status:Int): Unit = {
println("connection received!")
val client:PipeHandle = stdlib.malloc(pipe_size)
uv_pipe_init(loop, client, 0)
var r = uv_accept(server, client)
println(s"uv_accept returned $r")
uv_read_start(client, onAllocCB, onReadCB)
}
val onConnectCB = CFunctionPtr.fromFunction2(onConnect)
def onRead(pipe:PipeHandle, size:CSSize, buffer:Ptr[Buffer]): Unit = {
if (size >= 0) {
var position = 0
// We are going to store the positions of the CGI parameter and STDIN frames
var params:(Int,RecordHeader) = (0,null)
var stdin:(Int,RecordHeader) = (0,null)
// Scan the input buffer for the positions of useful metadata
while (position < size) {
val header = readHeader(!buffer._1,position)
reqId = header.reqId
if (header.rec_type == FCGI_PARAMS && header.length > 0)
params = (position,header)
else if (header.rec_type == FCGI_STDIN && header.length > 0)
stdin = (position, header)
position += (8 + header.length + header.padding)
}
// Generate a response and enqueue it to the pipe (re-use the input buffer for output)
val write_req:WriteReq = stdlib.malloc(write_req_size).cast[WriteReq]
!write_req = !buffer._1
!buffer._2 = makeResponse(reqId, params, stdin, !write_req)
uv_write(write_req, pipe, buffer, 1, onWriteCB)
} else {
// a negative size signals EOF or an error, so we stop reading and close the connection
uv_read_stop(pipe)
val shutdownReq = stdlib.malloc(shutdown_req_size).cast[ShutdownReq]
!shutdownReq = pipe
uv_shutdown(shutdownReq, pipe, myShutdownCB)
stdlib.free(!buffer._1)
}
}
val onReadCB = CFunctionPtr.fromFunction3(onRead)
When deployed on a UNIX socket behind nginx:
What do our languages really need to provide?
Does serving up HTTP belong in our app or in infrastructure?
What can we expect from our OS?
What can we expect from our cluster?
Things are about to change.