Ad serving with Python

About me

Started programming games in 5th grade

Fight legacy php/mysql applications by day

Python freelancer by afternoon

Adserver requirements

Fast
Efficient
Low latency
Did I say fast?

Why python

A joy to work with
Wrappers for many C/C++ extensions
Fast enough*

* in this case

Ad serving components

Local ads db updater
Keep an in-memory copy of the ads from master rdbms
HTTP request-response blocking webapp
Only serves ads, never contacts anything outside the server
Statistics updater
Update the statistics every x seconds on master db

Enter Local ads db updater

Must keep a copy of all ads locally
Too much latency going to postgresql
Need indexes, or a way to emulate them (transactions)
We don't want to iterate over thousands of ads on each request
Sharable on multiple processes
Don't want to duplicate ads for each worker
Updates only from Local Db Updater
Must be fast
It should not be slow

After a lot of searching...

symas/lmdb

Sorted-map interface (supports range lookups)
Use this for range queries to implement indexes
Fully transactional, full ACID semantics with MVCC
Indexes and data can be consistent
Multiple subdatabases with spanning transactions
Using a db for each index
Supports multi-thread and multi-process concurrency
Only 1 write transaction at a time
Reads are extremely cheap
py-lmdb wrapper module for python
keys and values are str(py2) / bytes(py3)
Need a way to serialize the objects.... capnproto
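
A minimal sketch of how a sorted map with range lookups can emulate an index. It does not use LMDB: the `SortedMap` class, the `index_key` layout (`dimension:value:` followed by a packed ad id) and the sample ads are assumptions made for illustration. With py-lmdb the same prefix scan would presumably be done with a cursor positioned via `set_range`.

```python
import bisect
import struct


class SortedMap:
    """Tiny in-memory stand-in for an ordered bytes-key store (not LMDB itself)."""

    def __init__(self):
        self._keys = []    # sorted list of bytes keys
        self._values = {}  # key -> bytes value

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def prefix_scan(self, prefix):
        """Yield (key, value) pairs whose key starts with prefix, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            yield key, self._values[key]


def index_key(dimension, value, ad_id):
    # Big-endian unsigned int, so byte order matches numeric order.
    return b"%s:%s:" % (dimension, value) + struct.pack(">I", ad_id)


def ads_for(index, dimension, value):
    """Return the ad ids stored under dimension/value, in ascending order."""
    prefix = b"%s:%s:" % (dimension, value)
    return [struct.unpack(">I", key[len(prefix):])[0]
            for key, _ in index.prefix_scan(prefix)]


index = SortedMap()
for ad_id, country in [(300, b"NL"), (7, b"NL"), (42, b"DE")]:
    # The value is empty: the key alone carries the index entry.
    index.put(index_key(b"country", country, ad_id), b"")

nl_ads = ads_for(index, b"country", b"NL")
```

Keeping one such map per index mirrors the "a db for each index" idea above.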

Indexes for fast filtering

Need ability to query local db with:
SELECT * FROM ads WHERE (('NL' IN country_included AND 'NL' NOT IN country_excluded)OR no_country_restriction=True )
The same query for many dimensions like device, city, custom-key-values etc.
Need ability to join/merge indexes

Indexes (1st idea)

Keep a btree index for each dimension
Get back a list of ids
Convert the list into a set
Intersect the sets to find only ads that satisfy all queries

Too many things to do on each request, must precompute indexes more
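
A rough sketch of this first idea, using plain Python lists and sets. The `candidate_ads` helper and the sample id lists are hypothetical; each list stands for what one per-dimension btree lookup would return.

```python
def candidate_ads(index_lookups):
    """Intersect the id lists returned by each per-dimension index lookup."""
    id_sets = [set(ids) for ids in index_lookups]  # one conversion per dimension
    if not id_sets:
        return set()
    # Start from the smallest set so the intersection does the least work.
    id_sets.sort(key=len)
    return set.intersection(*id_sets)


by_country = [1, 2, 3, 5, 8]
by_device = [2, 3, 4, 8, 9]
by_city = [3, 8, 13]

matching = candidate_ads([by_country, by_device, by_city])
```

The list-to-set conversion happens for every dimension on every request, which is the per-request cost the slide complains about.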

Indexes (2nd idea)

Looking at open-source dbs, found 2 cases of index intersection

Postgresql

Scan both btree indexes and build a bitmap
Intersect the bitmaps

Lucene / Elastic-search

Scan both inverted indexes and build a bitset
Intersect the bitsets

Found a pattern, so searched for a module that keeps a set of integers as a bitset and provides fast functions to intersect, subtract etc... intbitset!

Intbitset (c-based extension)

Provides an intbitset data object holding unordered sets of unsigned integers with ultra fast set operations, implemented via bit vectors and Python C extension to optimize speed and memory usage.
50X - 5000X faster than python sets with positive integers
Acts like a set with extra functions
Supports iterator protocol

Intbitset (sample)

Need ads that require NL or don't care about country targeting, and that don't deny NL.
Keeping an intbitset for:
0. all ads in this zone_id/position
1. ads that require NL
2. ads that deny NL
3. ads that don't have country-targeting

Combine the bitsets:
(ids from zone AND (ids from 1 UNION ids from 3)) REMOVE ids from 2
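
A sketch of the NL requirement stated at the top of this slide (ads in the zone that require NL or have no country targeting, minus ads that deny NL). It uses plain Python integers as bit vectors as a stand-in for intbitset. The helpers `to_bitset` and `to_ids` and all sample ids are assumptions; intbitset offers set-style operators for the same combination.

```python
def to_bitset(ids):
    """Pack non-negative integers into one Python int, one bit per id."""
    bits = 0
    for i in ids:
        bits |= 1 << i
    return bits


def to_ids(bits):
    """Unpack a bit vector back into a sorted list of ids."""
    ids = []
    position = 0
    while bits:
        if bits & 1:
            ids.append(position)
        bits >>= 1
        position += 1
    return ids


zone_ads = to_bitset([1, 2, 3, 4, 5, 6])    # 0. all ads in this zone/position
require_nl = to_bitset([1, 2, 9])           # 1. ads that require NL
deny_nl = to_bitset([2, 4])                 # 2. ads that deny NL
no_country_targeting = to_bitset([3, 5])    # 3. ads without country targeting
# Ad 2 is placed in both "require" and "deny" only to make the removal visible.

eligible = zone_ads & (require_nl | no_country_targeting) & ~deny_nl
eligible_ids = to_ids(eligible)
```

Each operator works on the whole bit vector at once, which is why precomputed bitsets are cheaper per request than building and intersecting Python sets.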

Problem

When the 'id' of ads is incremental, after a couple of months the intbitsets get less efficient because they become more sparse: valid ids will start from e.g. 50 000, some ads complete fast (hence are removed from the intbitset) and some ads stay active for a long time.

Solution

Keep an internal ad-id -> smallest-positive-integer mapping in lmdb, with free lists, so you don't have sparse intbitsets (lower memory + faster computation).
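
A minimal sketch of such an ad-id to small-integer mapping with a free list. The `DenseIdAllocator` class and the sample ad ids are hypothetical, and the mapping lives in a plain dict here rather than in lmdb.

```python
import heapq


class DenseIdAllocator:
    """Map sparse ad ids to the smallest free positive integer."""

    def __init__(self):
        self._dense_by_ad = {}  # ad_id -> dense id
        self._free = []         # min-heap of released dense ids
        self._next = 1          # next never-used dense id

    def acquire(self, ad_id):
        """Return the dense id for ad_id, allocating one if needed."""
        if ad_id in self._dense_by_ad:
            return self._dense_by_ad[ad_id]
        if self._free:
            dense = heapq.heappop(self._free)  # reuse the smallest freed id
        else:
            dense = self._next
            self._next += 1
        self._dense_by_ad[ad_id] = dense
        return dense

    def release(self, ad_id):
        """Free the dense id of a completed ad so it can be reused."""
        dense = self._dense_by_ad.pop(ad_id)
        heapq.heappush(self._free, dense)


allocator = DenseIdAllocator()
first = allocator.acquire(50001)
second = allocator.acquire(50002)
third = allocator.acquire(50417)
allocator.release(50002)            # ad 50002 completed
reused = allocator.acquire(61000)   # takes over the freed slot
```

The bitsets then only ever hold ids up to the number of concurrently active ads, whatever the ids in the master db look like.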

End of local db update

Scan the whole master db at a regular interval, and update the local ads db + all the indexes

Use lmdb for storing all the data
Transactions keep everything consistent

Use capnproto to serialize/deserialize ad objects

Keep most of the indexes in intbitsets

Some of the indexes must be btrees, e.g.:
frequency_capping --> ad_id

Adserving component

Only use cheap read-transactions from local ads db

Use maxmind c extension for geoip

Use 51degrees c extension for device detection

Before serving the ad, do a very small write transaction to log the impression (in a separate statistics db)

Statistics component

Wake up every x seconds to synchronize the logged impressions from local db with master db

Using lmdb with no durability: if the server crashes, the data is lost

Adding more durability slows things down because disk-access is slow

Extra: Have another process that does a group-commit to hdd (e.g. every 0.5 seconds)
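
A sketch of the batching idea behind the periodic synchronization: impressions are counted locally and pushed in one batch. The `ImpressionBuffer` class is hypothetical, and `Counter` objects stand in for both the local statistics db and the master db. The timer that would call `commit()` every x seconds is left out.

```python
from collections import Counter


class ImpressionBuffer:
    """Accumulate impressions locally and hand them over in one batch."""

    def __init__(self, flush):
        self._pending = Counter()
        self._flush = flush  # callable taking a dict {ad_id: count}

    def log(self, ad_id):
        """Cheap per-request step: bump a local counter."""
        self._pending[ad_id] += 1

    def commit(self):
        """Send everything logged so far; return the number of impressions sent."""
        if not self._pending:
            return 0
        # Swap in a fresh counter first so new impressions are not lost.
        batch, self._pending = self._pending, Counter()
        self._flush(dict(batch))
        return sum(batch.values())


master_totals = Counter()  # stand-in for the statistics table on the master db
impressions = ImpressionBuffer(flush=master_totals.update)

for ad_id in [7, 7, 42, 7]:
    impressions.log(ad_id)

flushed = impressions.commit()
```

Anything logged after the last `commit()` is lost on a crash, which is the durability trade-off described above.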

Using UWSGI to host the app

Cache – key-value memory-mapped cache.
Used to cache zones and other non transactional data.
Increments used to keep statistics on each server.
Mules – Managed python processes that do not serve http-requests but can be contacted from workers
Programmed Mules - Managed python processes that just do an infinite loop
Used to run the Local Db Updater + Statistics Process
Http – Async http server. Fewer features and ~10% slower than nginx, but enough.
@cron() decorator, cron-like, but for python functions.
Geoip module: The webserver does the geoip.

C extensions

Most of the modules:

lmdb
capnproto
geoip
uwsgi
device-tracker
intbitset

are actually wrappers of c/c++ extensions. If the app has immense success and a c/c++ port becomes inevitable, it will be easier to make the initial translation *

* maybe.

The end

Thank you

Questions ?

https://github.com/ddorian

dorian.hoxha@gmail.com

Ad serving with Python / Uwsgi / lmdb / Capnproto

By Dorian Hoxha
