Julio Vicente Trigo Guijarro
@juliotrigo
PyConEs 2015
Valencia, 22nd November
➜ Computer Science Engineer
➜ University of Alicante
➜ London
➜ Software Engineering
➜ Scrum / TDD
➜ Python
What we had / What we wanted / What we did NOT want
Different approaches
Our solution
Implementation
Thank you / QA
Our infrastructure
Python 2.7
Ubuntu
RDS
EC2
ELB
S3
SQS
AWS:
indexer
www
www2
www3
www4
www5
www-elb
solr-elb
solr
solr2
products
api
images
api
Marketing
Servers
Teams
Processes to run
Common resources
indexer
www
solr
Development
Product Information
Update product availability
Generate marketing feeds
Solr indexer
Import products into the DB
Update product pricing
Products DB (MySQL)
Availability DB (Redis)
Product images (S3 bucket)
Marketing feeds (S3 bucket)
➜ Growing fast
➜ Development team:
➜ Implementing services (internal use: operational)
➜ Running them manually (when requested)
➜ Expected failures (data issue...): re-run
➜ Long processes
✔ Allow other teams to run the processes
✔ Run them remotely
✔ Authentication
✔ Authorization
✔ Secuential access to common resources
➜ Semaphores
✔ Simplify process invocation
➜ Easy user interface (command line)
➜ No technical skills required
✔ Run processes asynchronously
✔ Quick solution (internal tool)
✔ Robust
✘ Give other teams SSH access to our servers
✘ Maintain server SSH keys / passwords in our services
✘ Concurrent access to common resources
✘ Our own (service invocation) implementation
➜ Development time, blocked resources, bugs...
✘ A single entity (server, service...) with access to all the servers
Give other teams access to our servers
✘ All the teams would need SSH access
✘ Cannot predict how they would use it
➜ Multiple (concurrent?) executions
➜ Unpredictable results
➜ Coordination between teams
✔ Easy user interface
✔ Teams don't need server access
✘ Single point of failure
✘ Access to all the servers
✘ Guarantee sequentiality ("common resources")
Central application that calls the remote services
Queues
✔ Central service that calls the services remotely
✔ Central service that queues messages
✔ Sequentiality handled by the queues
✔ Asynchronism
✔ No need to know services location
✔ Robust
Queues!
SQS
Boto
Control Panel (Django)
Distributed Services
Upstart
Queues (SQS)
Queues (SQS)
Boto
Control Panel (Django)
Distributed Services
Upstart
➜ AWS Management Console
➜ Push messages to the queues from any server
➜ Pull messages from the queues from any server
➜ Each critical section has its own queue
common_resources = ['products', 'availability', 'S3_feeds', 'S3_images']
➜ Represent common resources, not services
Queues (SQS)
Boto
Distributed Services
Upstart
Control Panel (Django)
➜ +46 Million downloads (top 10)
➜ Boto 3:
- "Stable and recommended for general use"
- Work is under way to support Python 3.3+
➜ Amazon Web Services SDK for Python
# sqs_handler.py
import boto.sqs
from boto.sqs.message import Message
class SqsHandler(object):
service_name = None
def __init__(
self, aws_region, aws_access_key_id, aws_secret_access_key,
aws_queue_name, logger, sleep_sec
):
self.conn = boto.sqs.connect_to_region(
aws_region,
aws_access_key_id=aws_access_key_id,
aws_secret_access_key=aws_secret_access_key
)
self.queue = self.conn.get_queue(aws_queue_name)
self.logger = logger
self.sleep_sec = sleep_sec
# ...
# ...
def mq_loop(self):
while True:
try:
message = self.queue.read()
if message is not None:
body = message.get_body()
if body["service"] == self.service_name:
self.queue.delete_message(message)
self.process_message(body)
except Exception:
self.logger.error(traceback.format_exc())
finally:
time.sleep(self.sleep_sec)
# ...
# ...
def process_message(self, body):
raise NotImplementedError()
def write_message(self, body):
m = Message()
m.set_body(body)
self.queue.write(m)
Queues (SQS)
Boto
Control Panel (Django)
Distributed Services
Upstart
➜ Django website: invoke services
➜ Django admin: authentication (HTTS) & authorization
# ...settings.base.py
# ...
GROUP_CONTROL_PANEL = 'control_panel'
GROUP_DASHBOARD = 'dashboard'
GROUP_SERVICES = 'services'
GROUP_AVAILABILITY = 'availability'
GROUP_AVAILABILITY_UPDATE = 'availability_update'
# ...
AWS_REGION = os.environ['AWS_REGION']
AWS_ACCESS_KEY_ID = os.environ['AWS_ACCESS_KEY_ID']
AWS_SECRET_ACCESS_KEY = os.environ['AWS_SECRET_ACCESS_KEY']
# ...
➜ Authorization according to permissions (Django Goups)
➜ Services invoked from different pages
➜ Message displayed after invocation
# ...settings.production.py
from .base import *
# ...
AWS_AVAILABILITY_QUEUE_NAME = 'availability'
# ...
# ...permissions.py
def group_check(user, groups):
if not user.is_authenticated():
return False
if user.is_superuser:
return True
for group in groups:
if not user.groups.filter(name=group).exists():
raise PermissionDenied
return True
def control_panel_group_check(user):
groups = [settings.GROUP_CONTROL_PANEL]
return group_check(user, groups)
# ...
# ...
def avl_update_group_check(user):
groups = [
settings.GROUP_CONTROL_PANEL,
settings.GROUP_DASHBOARD,
settings.GROUP_SERVICES,
settings.GROUP_AVAILABILITY,
settings.GROUP_AVAILABILITY_UPDATE,
]
return group_check(user, groups)
# ...
# ...views.availability.py
@require_POST
@login_required
@user_passes_test(avl_update_group_check)
def avl_update(request):
# ....
test_only = request.POST.get('test_only', False)
send_email = request.POST.get('send_email', False)
queue_name = settings.AWS_AVAILABILITY_QUEUE_NAME
body = {
"service": "availability",
"parameters": {
"test_only": test_only,
"send_email": send_email,
}
}
mq = SqsHandler(
settings.AWS_REGION,
settings.AWS_ACCESS_KEY_ID,
settings.AWS_SECRET_ACCESS_KEY,
queue_name,
logger,
settings.MQ_LOOP_SLEEP_SEC
)
mq.write_message(json.dumps(body))
# ....
➜ For each service:
- Queue name (may be shared across services)
- Payload
➜ Send messages to the correct queue
Queues (SQS)
Boto
Control Panel (Django)
Distributed Services
Upstart
➜ Read messages from the correct queue (sequentially)
➜ Check messages (service name)
➜ Delete messages
➜ Deployed on any server
➜ Process messages (service logic)
# mq_loop.py
# ...
class SqsAvlHandler(SqsHandler):
service_name = "availability"
def process_message(self, body):
message = json.loads(body)
service = message["service"]
if service == service_name:
test_only = message["parameters"]["test_only"]
send_email = message["parameters"]["send_email"]
avl(test_only, send_email)
else:
raise Exception(
"Service {0} not supported.".format(service))
# ...
# mq_loop.py
# ...
if __name__ == "__main__":
# ...
mq = SqsAvlHandler(
settings.AWS_REGION,
settings.AWS_ACCESS_KEY_ID,
settings.AWS_SECRET_ACCESS_KEY,
settings.AWS_AVAILABILITY_QUEUE_NAME,
logger,
settings.AVAILABILITY_LOOP_SLEEP_SEC
)
mq.mq_loop()
# mq_loop.sh
#!/bin/bash
source /var/.virtualenvs/avl/bin/activate
cd /var/avl
python mq_loop.py
Queues (SQS)
Boto
Control Panel (Django)
Distributed Services
Upstart
➜ Keep services up and running
➜ Debian and Ubuntu moved to systemd
➜ "Upstart is an event-based replacement for the /sbin/init daemon which handles starting of tasks and services during boot, stopping them during shutdown and supervising them while the system is running."
# /etc/init/avl.conf
description "Availability update"
# Start up when the system hits any normal runlevel, and
# shuts down when the system goes to shutdown or reboot.
#
# 0 : System halt.
# 1 : Single-User mode.
# 2 : Graphical multi-user plus networking (DEFAULT)
# 3 : Same as "2", but not used.
# 4 : Same as "2", but not used.
# 5 : Same as "2", but not used.
# 6 : System reboot.
start on runlevel [2345]
stop on runlevel [06]
# respawn the job up to 10 times within a 5 second period.
# If the job exceeds these values, it will be stopped and
# marked as failed.
respawn
respawn limit 10 5
setuid avl
setgid avl
exec /var/avl/mq_loop.sh
/etc/init/avl.conf
/etc/init.d/avl > /lib/init/upstart-job
➜ Internal use
➜ A few messages per day
➜ Expected exceptions processing messages
"The term "Microservice Architecture" has sprung up over the last few years to describe a particular way of designing software applications as suites of independently deployable services. While there is no precise definition of this architectural style, there are certain common characteristics around organization around business capability, automated deployment, intelligence in the endpoints, and decentralized control of languages and data." - Martin Fowler
http://martinfowler.com/articles/microservices.html
➜ Framework for building microservices in Python
➜ Built-in support for:
➜ RPC over AMQP
➜ Asynchronous events (pub-sub) over AMQP
➜ Simple HTTP GET and POST
➜ Websocket RPC and subscriptions (experimental)
➜ Encourages the dependency injection pattern
# helloworld.py
from nameko.rpc import rpc
class GreetingService(object):
name = "greeting_service"
@rpc
def hello(self, name):
return "Hello, {}!".format(name)
$ nameko run helloworld
starting services: greeting_service
...
$ nameko shell
>>> n.rpc.greeting_service.hello(name="Julio")
u'Hello, Julio!'
# http.py
import json
from nameko.web.handlers import http
class HttpService(object):
name = "http_service"
@http('GET', '/get/<int:value>')
def get_method(self, request, value):
return json.dumps({'value': value})
@http('POST', '/post')
def do_post(self, request):
return "received: {}".format(request.get_data(as_text=True))
$ nameko run http
starting services: http_service
...
$ curl -i localhost:8000/get/42
HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8
Content-Length: 13
Date: Fri, 13 Feb 2015 14:51:18 GMT
{'value': 42}
$ curl -i -d "post body" localhost:8000/post
HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8
Content-Length: 19
Date: Fri, 13 Feb 2015 14:55:01 GMT
received: post body
@juliotrigo
https://slides.com/juliotrigo/pycones2015-distributed-services