Python Twisted


Tomasz Ducin
16th October 2013, Warsaw

Key facts

  • library
  • networking
  • asynchronous (event-driven)
  • open source (MIT license)

Figures

Benefits

  • facilitates network application development (avoids boilerplate such as logging and authorization)
  • support for test-driven development
  • support for production-grade deployment
  • separation of concerns (transport, protocol, resources, applications)
  • implementations of well-known protocols
  • and more...

Well-known protocols

installation

sudo apt-get install python-twisted
dependency: zope.interface

optionally: pyOpenSSL for SSL, PyCrypto for SSH

echo server

is the networking 'hello world' app

code snippets available at:
https://github.com/tkoomzaaskz/seminar-twisted

Twisted echo server

Echo server (http://twistedmatrix.com)
echo-server/echo-server-twisted.py
from twisted.internet import protocol, reactor

class Echo(protocol.Protocol):
    def dataReceived(self, data):
        self.transport.write(data)

class EchoFactory(protocol.Factory):
    def buildProtocol(self, addr):
        return Echo()

reactor.listenTCP(1234, EchoFactory())
reactor.run() 

asyncore

http://docs.python.org/2/library/asyncore.html
echo-server/echo-server-asyncore.py
import asyncore
import socket

class EchoHandler(asyncore.dispatcher_with_send):
    def handle_read(self):
        data = self.recv(8192)
        if data:
            self.send(data)

class EchoServer(asyncore.dispatcher):
    def __init__(self, host, port):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.set_reuse_addr()
        self.bind((host, port))
        self.listen(5)

    def handle_accept(self):
        pair = self.accept()
        if pair is not None:
            sock, addr = pair
            print 'Incoming connection from %s' % repr(addr)
            handler = EchoHandler(sock)

server = EchoServer('localhost', 8080)
asyncore.loop() 
note signal handler

Why asynchronous programming?

compare 3 models:

asynchronous workflow

requires non-blocking I/O operations

twisted key elements


  • reactor
  • transport
  • protocols
  • protocol factories

Reactor

from twisted.internet import reactor 
  • event loop, the core of twisted
  • knows about network, filesystem and timer events
  • abstracts away platform-specific behavior
  • design pattern
reactor/run.py
from twisted.internet import reactor
print "Starting the event loop"
reactor.run()
print "Finished the event loop"
what will happen? what if we send a sigint?

example: two servers, one thread

echo-server/double-echo.py
from twisted.internet import protocol, reactor

class Hello(protocol.Protocol):
    def dataReceived(self, data):
        self.transport.write('Hello, ' + data)

class HelloFactory(protocol.Factory):
    def buildProtocol(self, addr):
        return Hello()

class Bonjour(protocol.Protocol):
    def dataReceived(self, data):
        self.transport.write('Bonjour, ' + data)

class BonjourFactory(protocol.Factory):
    def buildProtocol(self, addr):
        return Bonjour()

reactor.listenTCP(1234, HelloFactory())
reactor.listenTCP(1235, BonjourFactory())
reactor.run()

Reactor very basic API

  • core: running
  • core: run()
  • core: stop()
  • time: callLater(delay, callable, *args, **kwargs)
  • tcp: listenTCP(port, serverFactory)
  • tcp: connectTCP(host, port, clientFactory)
  • tls: listenSSL/connectSSL
  • process: spawnProcess(processProtocol, executable, ...)
  • threads: callInThread(callable, *args, **kwargs)
  • more at twisted docs

Manual reactor management #1

reactor/set_timeout.py
from twisted.internet import reactor
import time

def printTime():
    print "Current time is", time.strftime("%H:%M:%S")

def stopReactor():
    print "Stopping reactor"
    reactor.stop()

reactor.callLater(1, printTime)
reactor.callLater(2, printTime)
reactor.callLater(3, printTime)
reactor.callLater(4, printTime)
reactor.callLater(5, stopReactor)

print "Starting the event loop"
reactor.run()
print "Finished the event loop"

manual reactor management #2

reactor/set_interval.py
from twisted.internet import reactor

counter = 5

def hello(name):
    global counter
    if counter:
        print 'hello', name
        counter -= 1
        reactor.callLater(1, hello, 'world')
    else:
        reactor.stop()

reactor.callLater(1, hello, 'world')
reactor.run()

Browsers (comparison)

1. browsers provide asynchronous event API, such as:
... and so twisted reactor has analogous API (e.g. Window.setTimeout/Interval vs reactor.callLater)

2. a browser tab has got its own event loop (1 thread) 
... and so does a twisted process

3. JS programmers cannot access the event loop directly
... but twisted programmers can!

Reactor types

$ twistd --help-reactors
    kqueue      kqueue(2)-based reactor.
    win32       Win32 WaitForMultipleObjects-based reactor.
    epoll       epoll(4)-based reactor. // default
    iocp        Win32 IO Completion Ports-based reactor.
    gtk         Gtk1 integration reactor.
    cf          CoreFoundation integration reactor.
    gtk2        Gtk2 integration reactor.
    default     The best reactor for the current platform.
    debug-gui   Semi-functional debugging/introspection reactor.
    poll        poll(2)-based reactor.
    glib2       GLib2 event-loop integration reactor.
    select      select(2)-based reactor.
    wx          wxPython integration reactor.
    qt          QT integration reactor

Transports & Protocols

separation of concerns

transport - connects two endpoints over the network
protocol - processes network events asynchronously

Transport api

  • write: write data to the physical connection in a nonblocking manner.
  • writeSequence: write a list of strings to the physical connection. Useful when working with line-oriented protocols.
  • loseConnection: write all pending data and then close the connection.
  • getPeer: get the remote address of the connection.
  • getHost: like getPeer, but returns the address of the local side of the connection.

connection methods and attributes

protocol api

  • makeConnection: create a connection between two endpoints across a transport.
  • connectionMade: called when a connection to another endpoint is made.
  • dataReceived: called when data is received across a transport.
  • connectionLost: called when the connection is shut down.

connection events

compare protocols with Apache Thrift

Protocol: instances & factories

  • twisted protocol != protocols like HTTP, FTP, etc.
  • each protocol instance handles events around one network connection
  • protocol factory creates protocol instances for each connection
  • protocol instance holds connection-level logic and persistent per-connection state (e.g. a protocol state machine)
  • protocol factory holds server-level logic

protocol state machine

server/state-machine.py
from twisted.internet.protocol import Factory
from twisted.protocols.basic import LineReceiver
from twisted.internet import reactor

class ChatProtocol(LineReceiver):
    def __init__(self, factory):
        self.factory = factory
        self.name = None
        self.state = "REGISTER"
    def connectionMade(self):
        self.sendLine("What's your name?")
    def connectionLost(self, reason):
        if self.name in self.factory.users:
            del self.factory.users[self.name]
            self.broadcastMessage("%s has left the channel." % (self.name,))
    def lineReceived(self, line):
        if self.state == "REGISTER":
            self.handle_REGISTER(line)
        else:
            self.handle_CHAT(line)
    def handle_REGISTER(self, name):
        if name in self.factory.users:
            self.sendLine("Name taken, please choose another.")
            return
        self.sendLine("Welcome, %s!" % (name,))
        self.broadcastMessage("%s has joined the channel." % (name,))
        self.name = name
        self.factory.users[name] = self
        self.state = "CHAT"
    def handle_CHAT(self, message):
        message = "<%s> %s" % (self.name, message)
        self.broadcastMessage(message)
    def broadcastMessage(self, message):
        for name, protocol in self.factory.users.iteritems():
            if protocol != self:
                protocol.sendLine(message)

class ChatFactory(Factory):
    def __init__(self):
        self.users = {}
    def buildProtocol(self, addr):
        return ChatProtocol(self)

reactor.listenTCP(8000, ChatFactory())
reactor.run()

protocol factory

custom protocol class vs custom protocol factory

from twisted.internet import protocol, reactor

class Hello(protocol.Protocol):
    def dataReceived(self, data):
        self.transport.write('Hello, ' + data)
# either this
helloFactory = protocol.Factory()
helloFactory.protocol = Hello
# or this
class HelloFactory(protocol.Factory):
    def buildProtocol(self, addr):
        return Hello()
reactor.listenTCP(1234, HelloFactory())  # or helloFactory
reactor.run()

Protocol factory api

twisted.internet.protocol.Factory
  • buildProtocol()
  • hooks: startFactory(), stopFactory()

twisted.internet.protocol.ClientFactory:
  • clientConnectionLost()
  • clientConnectionFailed()

Protocol factory reference

publish/subscribe example (twistedmatrix.com)
<gist>/publish-subscribe.py
from twisted.internet import reactor, protocol
from twisted.protocols import basic

class PubProtocol(basic.LineReceiver):
    def __init__(self, factory):
        self.factory = factory
    def connectionMade(self):
        self.factory.clients.add(self)
    def connectionLost(self, reason):
        self.factory.clients.remove(self)
    def lineReceived(self, line):
        for c in self.factory.clients:
            c.sendLine("<{}> {}".format(self.transport.getHost(), line))

class PubFactory(protocol.Factory):
    def __init__(self):
        self.clients = set()
    def buildProtocol(self, addr):
        return PubProtocol(self)

reactor.listenTCP(1025, PubFactory())
reactor.run()

chat room

https://gist.github.com/tkoomzaaskz/6967092
<gist>/chat-server.py

implement bomb explosion:
  • detonate the bomb while everyone is in the chat room
  • set bomb explosion delay and safely leave the chat room

recv

  • twisted.internet.protocol.Protocol:
def dataReceived(self, data)
  • twisted.protocols.basic.LineReceiver:
def lineReceived(self, line)

send

  • twisted.internet.protocol.Protocol:
self.transport.write(data)
  • twisted.protocols.basic.LineReceiver:
self.sendLine(line)

handling binary data

python struct module - interpret strings as packed binary data

This module performs conversions between Python values and C structs represented as Python strings. This can be used in handling binary data stored in files or from network connections, among other sources.

example:
  • first 4 bytes define incoming message length (in bytes)
  • use protocol state machine to receive whole message

echo client

client/echo.py
from twisted.internet import reactor, protocol

class EchoClient(protocol.Protocol):
    def connectionMade(self):
        self.transport.write("Hello, world!")
    def dataReceived(self, data):
        print "Server said:", data
        self.transport.loseConnection()

class EchoFactory(protocol.ClientFactory):
    def buildProtocol(self, addr):
        return EchoClient()
    def clientConnectionFailed(self, connector, reason):
        print "Connection failed."
        reactor.stop()
    def clientConnectionLost(self, connector, reason):
        print "Connection lost."
        reactor.stop()

reactor.connectTCP("localhost", 8080, EchoFactory())
reactor.run()

building clients and servers

  1. Define a protocol class, subclassing twisted.internet.protocol.Protocol for arbitrary data or twisted.protocols.basic.LineReceiver for line-oriented protocols.
  2. Define a protocol factory class, subclassing twisted.internet.protocol.Factory for servers and twisted.internet.protocol.ClientFactory for clients. That factory creates instances of the protocol and stores state shared across protocol instances.
  3. Clients use reactor.connectTCP to initiate a connection to a server. Invoking connectTCP registers callbacks with the reactor to notify your protocol when new data has arrived across a socket for processing. Servers use reactor.listenTCP to listen for and respond to client connections.
  4. Communication doesn’t start until reactor.run is called, which starts the reactor event loop.

Deferreds

  • a Deferred is a promise that a result will be delivered later, at which point the code attached to it runs
  • later means when a resource becomes available
  • non-blocking functions return immediately and trigger deferred execution
  • design pattern, simply explained

  • Deferreds do help you write asynchronous code.
  • Deferreds do not automatically make code asynchronous or nonblocking. To turn a synchronous function into an asynchronous one, it needs to be refactored to return a Deferred with which callbacks are registered.

Callbacks and Errbacks


basic examples

deferred/callback.py
from twisted.internet.defer import Deferred

def myCallback(result):
    print result

d = Deferred()
d.addCallback(myCallback)
d.callback("Triggering callback.")
deferred/errback.py
from twisted.internet.defer import Deferred

def myErrback(failure):
    print failure

d = Deferred()
d.addErrback(myErrback)
d.errback("Triggering errback.")
deferred needs to be fired (but usually the reactor does it)

debugging deferreds

  • debugging asynchronous apps can be painful
  • thankfully, Twisted Deferreds provide a stack trace for exceptions raised inside asynchronous callbacks (example)
>>> from twisted.internet.defer import Deferred as D
>>> def start_app(_):
...     #import os
...     return os.startfile('sasa')
... 
... def command_die(err):
...     err.printTraceback()
... 
...     
... d = D()
... d.addCallback(start_app)
... d.addErrback(command_die)
... d.callback(0)
Traceback (most recent call last):
  File "C:\Users\Pilyavskiy\AppData\Local\DreamPie\share\dreampie\subp-py2\dreampielib\subprocess\__init__.py", line 324, in execute
    exec codeob in self.locs
  File "<pyshell#3>", line 12, in <module>
    d.callback(0)
  File "C:\pill\Python27\lib\site-packages\twisted\internet\defer.py", line 361, in callback
    self._startRunCallbacks(result)
  File "C:\pill\Python27\lib\site-packages\twisted\internet\defer.py", line 455, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "C:\pill\Python27\lib\site-packages\twisted\internet\defer.py", line 542, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "<pyshell#3>", line 3, in start_app
    return os.startfile('sasa')
exceptions.NameError: global name 'os' is not defined

deferreds api

  • addCallbacks: registers a callback with the callback chain and an errback with the errback chain, at the same level
  • addCallback: registers a callback with the callback chain and a pass-through with the errback chain, which simply returns the result passed to it
  • addErrback: registers an errback with the errback chain and a pass-through with the callback chain
  • addBoth: adds the same callback to both the callback and errback chains for the Deferred. The analogous synchronous logic is the finally part of a try/except/finally block.

deferred vs threads

for blocking I/O operations:
  • twisted.internet.threads.deferToThread(f, *args, **kwargs) - run a function in a thread and return the result as a Deferred.

Web servers & clients

import twisted.web
features
  • HTTP protocol support
  • HTTP 1.1, HTTPS client/server implementations
  • proxy support
  • WSGI implementation
  • basic HTML templating

Twisted web echo

web/webecho.py
from twisted.protocols import basic
from twisted.internet import protocol, reactor

class HttpEchoProtocol(basic.LineReceiver):
    
    def __init__(self):
        self.lines = []
        self.gotRequest = False

    def lineReceived(self, line):
        self.lines.append(line)
        if not line and not self.gotRequest:
            self.sendResponse()
            self.gotRequest = True

    def sendResponse(self):
        responseBody = "You said:\r\n\r\n" + "\r\n".join(self.lines)
        self.sendLine("HTTP/1.0 200 OK")
        self.sendLine("Content-Type: text/plain")
        self.sendLine("Content-Length: %i" % len(responseBody))
        self.sendLine("")
        self.transport.write(responseBody)
        self.transport.loseConnection()

f = protocol.ServerFactory()
f.protocol = HttpEchoProtocol
reactor.listenTCP(8000, f)
reactor.run()
 

Twisted web server

web/server.py
from twisted.web import server, resource
from twisted.internet import reactor

class HelloResource(resource.Resource):
    isLeaf = True
    numberRequests = 0
    
    def render_GET(self, request):
        self.numberRequests += 1
        request.setHeader("content-type", "text/plain")
        return "I am request #" + str(self.numberRequests) + "\n"

reactor.listenTCP(8080, server.Site(HelloResource()))
reactor.run() 

web static content

web/static-content.py
from twisted.internet import reactor
from twisted.web.server import Site
from twisted.web.static import File

resource = File('/home')
factory = Site(resource)
reactor.listenTCP(8000, factory)
reactor.run()

web dynamic content

web/dynamic-content.py
from twisted.internet import reactor
from twisted.web.resource import Resource
from twisted.web.server import Site
import time

class ClockPage(Resource):
    isLeaf = True
    def render_GET(self, request):
        return "The local time is %s" % (time.ctime(),)

resource = ClockPage()
factory = Site(resource)
reactor.listenTCP(8000, factory)
reactor.run()

synchronous responses

time.sleep is blocking / open several browser tabs
web/blocking.py
from twisted.internet import reactor
from twisted.web.resource import Resource
from twisted.web.server import Site
import time

class BusyPage(Resource):
    isLeaf = True
    def render_GET(self, request):
        time.sleep(5)
        return "Finally done, at %s" % (time.asctime(),)

factory = Site(BusyPage())
reactor.listenTCP(8000, factory)
reactor.run()

asynchronous responses

use deferreds!
web/nonblocking.py
from twisted.internet import reactor
from twisted.internet.task import deferLater
from twisted.web.resource import Resource
from twisted.web.server import Site, NOT_DONE_YET
import time

class BusyPage(Resource):
    isLeaf = True
    def _delayedRender(self, request):
        request.write("Finally done, at %s" % (time.asctime(),))
        request.finish()
    def render_GET(self, request):
        d = deferLater(reactor, 5, lambda: request)
        d.addCallback(self._delayedRender)
        return NOT_DONE_YET

factory = Site(BusyPage())
reactor.listenTCP(8000, factory)
reactor.run()

Production grade tools

boilerplate-less
cross-platform
scalable
customizable (daemonization, custom reactor, profiling)

Twisted application infrastructure

  • services
  • applications
  • TAC files
  • twistd
TAC = twisted application configuration

twistd

$ twistd --help
Usage: twistd [options]
Options:
      --savestats      save the Stats object rather than the text output of the
                       profiler.
  -o, --no_save        do not save state on shutdown
  -e, --encrypted      The specified tap/aos file is encrypted.
  -n, --nodaemon       don't daemonize, don't use default umask of 0077
      --originalname   Don't try to change the process name
      --syslog         Log to syslog, not to file
      --euid           Set only effective user-id rather than real user-id.
                       (This option has no effect unless the server is running
                       as root, in which case it means not to shed all
                       privileges after binding ports, retaining the option to
                       regain privileges in cases such as spawning processes.
                       Use with caution.)
  -l, --logfile=       log to a specified file, - for stdout
  -p, --profile=       Run in profile mode, dumping results to specified file
      --profiler=      Name of the profiler to use (profile, cprofile, hotshot).
                       [default: hotshot]
  -f, --file=          read the given .tap file [default: twistd.tap]
  -y, --python=        read an application from within a Python file (implies
                       -o)
  -s, --source=        Read an application from a .tas file (AOT format).
  -d, --rundir=        Change to a supplied directory before running [default:
                       .]
      --prefix=        use the given prefix when syslogging [default: twisted]
      --pidfile=       Name of the pidfile [default: twistd.pid]
      --chroot=        Chroot to a supplied directory before running
  -u, --uid=           The uid to run as.
  -g, --gid=           The gid to run as.
      --umask=         The (octal) file creation mask to apply.
      --help-reactors  Display a list of possibly available reactor names.
      --version        Print version information and exit.
      --spew           Print an insanely verbose log of everything that happens.
                       Useful when debugging freezes or locks in complex code.
  -b, --debug          Run the application in the Python Debugger (implies
                       nodaemon), sending SIGUSR2 will drop into debugger
  -r, --reactor=       Which reactor to use (see --help-reactors for a list of
                       possibilities)
      --help           Display this help and exit.

echo application demo

prepare protocols and factories:
tac/echo.py
from twisted.internet import protocol, reactor

class Echo(protocol.Protocol):
    def dataReceived(self, data):
        self.transport.write(data)

class EchoFactory(protocol.Factory):
    def buildProtocol(self, addr):
        return Echo() 
this will be used by TAC

echo application demo

configure application inside TAC:
tac/echo_server.tac
from twisted.application import internet, service
from echo import EchoFactory

application = service.Application("echo")
echoService = internet.TCPServer(8000, EchoFactory())
echoService.setServiceParent(application)
manage
$ twistd -y echo_server.tac # start
$ cat twistd.log # log file
$ cat twistd.pid # process id
# kill manually 

Twisted plugins

  • alternative to TAC-based systems
  • TAP = twisted application plugin
  • extend command-line interface
  • register custom services as twistd subcommands
  • plugin discoverability is codified

echo plugin

from zope.interface import implements
from twisted.application.service import IServiceMaker
from twisted.application import internet
from twisted.plugin import IPlugin
from twisted.python import usage
from echo import EchoFactory

class Options(usage.Options):
    optParameters = [["port", "p", 8000, "The port number to listen on."]]

class EchoServiceMaker(object):
    implements(IServiceMaker, IPlugin)
    tapname = "echo"
    description = "A TCP-based echo server."
    options = Options
    def makeService(self, options):
        return internet.TCPServer(int(options["port"]), EchoFactory())

serviceMaker = EchoServiceMaker() 

  • twisted.python.usage.Options, --port
  • implement: IPlugin and IServiceMaker
  • makeService: command-line config

echo plugin demo

Commands:
    news             A news server.
    ftp              An FTP server.
    telnet           A simple, telnet-based remote debugging service.
    socks            A SOCKSv4 proxy service.
    manhole-old      An interactive remote debugger service.
    portforward      A simple port-forwarder.
    web              A general-purpose web server which can serve from a
                     filesystem or application resource.
    inetd            An inetd(8) replacement.
    echo             A TCP-based echo server.
    xmpp-router      An XMPP Router server
    words            A modern words server
    dns              A domain name server.
    mail             An email service
    manhole          An interactive remote debugger service accessible via
                     telnet and ssh and providing syntax coloring and basic line
                     editing functionality.
    conch            A Conch SSH service.
    procmon          A process watchdog / supervisor

Built-in twistd tools

  • both static and dynamic content:
twistd web --port 8080 --path .
  • DNS server cutting out social media:

twistd dns -v -p 5553 --hosts-file=hosts
# /etc/hosts-like format
127.0.0.1 facebook.com
127.0.0.1 twitter.com
127.0.0.1 reddit.com 
  • ESMTP (extended SMTP) accepting emails for localhost and storing in emails directory:
twistd mail -E -H localhost -d localhost=emails
  • SSH server:
twistd conch -p tcp:2222

Pycached

memcached-like, Twisted-based key-value storage
Command    Arguments     Description
version    -             returns pycached server version
count      -             returns number of cache entries
clear      -             removes all cache entries
items      -             returns all cache entries
status     -             returns pycached server status (uptime)
get        key           returns cache entry for a given key
set        key, value    sets/overwrites cache entry for a given key with a given value
delete     key           deletes cache entry for a given key

PyCached

2 services:
  • (main) service
  • http

Sources

resources

publications

http://shop.oreilly.com/product/9780596100322.do

Thanks