David Seddon
david@seddonym.me
http://seddonym.me
Concurrent connections
table_one |
---|
|
table_two |
---|
|
table_three |
---|
|
Databases allow many processes at once to modify their data
Connection
Connection
Connection
Connection
Poll
How does Kraken handle database concurrency?
The code (version 1)
def withdraw(amount, internal_account, external_account):
"""Withdraw money from the internal account
to an external bank account.
"""
if internal_account.has_at_least(amount):
internal_account.reduce_balance(amount)
send_money(amount, external_account)
else:
raise Exception('Insufficient funds.')
Account balances
Customers
Accounts system: version 1
The code (version 1)
def withdraw(amount, internal_account, external_account):
"""Withdraw money from the internal account
to an external bank account.
"""
if internal_account.has_at_least(amount):
internal_account.reduce_balance(amount)
send_money(amount, external_account)
else:
raise Exception('Insufficient funds.')
How concurrency breaks stuff
Worker 1
Worker 2
Source balance |
---|
£100 |
£0 |
-£50 |
Check balance >= £100
Check balance >= £50
Reduce balance by £100
Reduce balance by £50
if internal_account.has_at_least(amount):
internal_account.reduce_balance(amount)
send_money(amount, external_account)
Database isolation levels
SERIALIZABLE
REPEATABLE READ (MySQL default)
READ COMMITTED (PostgreSQL default)
READ UNCOMMITED
strict, accurate
permissive, fast
Transactions are a way of wrapping queries up into discrete blocks
Query
Query
Query
Transaction
Concurrency: reading
Transaction A
Transaction B
id | value |
---|---|
1 | 0 (1) ? |
SET value = 1
SELECT value
What happens?
Isolation mode: READ COMMITTED
Read committed - reading
Records from other sessions will become visible as they are committed
Transaction
id | value |
---|---|
1 | 0 0 1 |
SET value = 1
id | value |
---|---|
1 | 1 |
Concurrency: reading
Transaction A
Transaction B
id | value |
---|---|
1 | 0 (1) ? |
SET value = 1
SELECT value
What happens?
Isolation mode: READ COMMITTED
Concurrency: reading
Transaction A
Transaction B
id | value |
---|---|
1 | 0 0 1 |
SET value = 1
SELECT value
Transaction B reads value 0 immediately.
Isolation mode: READ COMMITTED
Concurrency: writing
Transaction A (writer)
Transaction B (writer)
id | value |
---|---|
1 | 0 0 ? ? ? |
SET value = 1
SET value = 2
Isolation mode: READ COMMITTED
What happens?
Read committed - writing
Records that sessions are writing to are marked as read only immediately.
Other writers will wait until the lock is released.
Transaction
id | value | read only |
---|---|---|
1 | 0 0 1 |
NO YES NO |
SET value = 1
id | value |
---|---|
1 | 1 |
Concurrency: writing
Transaction A (writer)
Transaction B (writer)
id | value |
---|---|
1 | 0 0 ? ? ? |
SET value = 1
SET value = 2
Isolation mode: READ COMMITTED
What happens?
Concurrency: writing
Transaction A (writer)
Transaction B (writer)
id | value | read only |
---|---|---|
1 | 0 0 1 1 2 |
NO YES NO YES NO |
SET value = 1
SET value = 2
Isolation mode: READ COMMITTED
B waits until A commits, then sets value to 2.
Select for update
Transaction A (writer)
Transaction B (writer)
id | value | read only |
---|---|---|
1 | 0 0 0 1 1 2 |
NO YES YES NO YES NO |
SET value = 1
SET value = 2
Isolation mode: READ COMMITTED
SELECT FOR UPDATE is a way of making
a read query behave like a write query.
SELECT FOR UPDATE
SELECT FOR UPDATE
How can Darren protect against concurrent withdrawals?
Solution: Pessimistic locking
Source balance |
---|
£100 |
£0 |
SELECT FOR UPDATE
Check balance >= £50
Reduce balance by £100
Worker 1
Worker 2
Check balance >= £100
Internal Account |
---|
22 |
SELECT FOR UPDATE
Insufficient funds!
(waits until worker 1 commits)
from django.db import transaction
from .models import InternalAccount
def withdraw(amount, internal_account, external_account):
# Wrap in a database transaction
with transaction.atomic():
# Wait for a lock on the source account
InternalAccount.objects.select_for_update().get(
id=internal_account.id
)
if internal_account.has_at_least(amount):
internal_account.reduce_balance(amount)
send_money(amount, external_account)
else:
raise Exception('Insufficient funds.')
Pessimistic locking in Django
1
2
Select for updates must be wrapped in an atomic transaction.
$ ./manage.py shell_plus
>>> transaction.set_autocommit(False)
>>> account = Account.objects\
.select_for_update().first()
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> transaction.commit()
Terminal A
$ ./manage.py shell_plus
>>> transaction.set_autocommit(False)
>>> account = Account.objects\
.select_for_update().first()
(HANGS...)
>>> (Prompt returns once other
session commits)
Terminal B
Pitfall 1
def transfer(source, destination, amount):
"""Make a transfer between two InternalAccounts
of the supplied amount.
"""
with transaction.atomic():
# Wait for a lock on the source and destination accounts
InternalAccount.objects.select_for_update().filter(
id__in=[source.id, destination.id]
).order_by("id")
# etc...
Spot the bug
Pitfall 1 - lazy select_for_update
Account.objects.select_for_update().filter(
id__in=[source_id, destination_id]
)
Querysets are lazy!
In this case, the select_for_update will never be run.
bool(Account.objects.select_for_update().filter(
id__in=[source_id, destination_id])
)
Solution: wrap select_for_updates that use filter in a bool if you don't evaluate them straight away.
ERROR: deadlock detected
Detail:
Process 13560 waits for ShareLock on transaction 3147316424;
blocked by process 13566.
Process 13566 waits for ShareLock on transaction 3147316408;
blocked by process 13560.
What the...?
ids = [1, 2]
bool(
MyModel.objects\
.select_for_update()\
.filter(id__in=ids)
)
Process 1
Process 2
id | read only |
---|---|
1 | YES |
2 | YES |
id | read only |
---|---|
1 | YES |
2 | YES |
Waiting for each other
ids = [2, 1]
bool(
MyModel.objects\
.select_for_update()\
.filter(id__in=ids)
)
ids = [1, 2]
bool(
MyModel.objects\
.select_for_update()\
.filter(id__in=ids).
.order_by('id')
)
Process 1
Process 2
ids = [2, 1]
bool(
MyModel.objects\
.select_for_update()\
.filter(id__in=ids).
.order_by('id')
)
Solution: when using select_for_updates on multiple records, make sure you acquire the locks in a consistent order.
Our tests are wrapped in transactions by default.
Why might this be a problem?
Does not wrap your test in a transaction. Slower, but better for code where you need to test behaviour relating to transactions.
@requires_db(transaction=True)
Unwrapped SELECT FOR UPDATEs will pass tests
but error in production!
David Seddon http://seddonym.me
These slides:
https://slides.com/davidseddon/managing-database-concurrency-with-django
ATOMIC_REQUESTS
# settings.py
ATOMIC_REQUESTS = True
The request/response cycle will be wrapped in a database transaction.
If an exception is raised, the
transaction is rolled back.
Savepoints allow you to roll back
within transactions
Query
Transaction
Query
> Savepoint
Query
Query
> Rollback
Transaction
Query
> Savepoint
Query
Query
> Savepoint
Query
> Savepoint
Query
Query
with transaction.atomic():
foo()
with transaction.atomic():
bar()
with transaction.atomic():
baz()
with transaction.atomic():
foobar()
with transaction.atomic():
foo() - Will be committed
try:
with transaction.atomic():
bar() - Will be rolled back
raise Exception
except:
pass
baz() - Will be committed
Exceptions raised within an atomic block will roll back that atomic block.