How To Do Two Things At Once
Managing Database Concurrency With Django
David Seddon
david@seddonym.me
http://seddonym.me
Concurrent connections
[Diagram: four connections accessing table_one, table_two and table_three]
Databases allow many processes to modify their data at once.
Poll
How does Kraken handle database concurrency?
Database concurrency is something you need to think about.
Darren Sedwerb
Developer at Bankity Bank
The code (version 1)
def withdraw(amount, internal_account, external_account):
    """Withdraw money from the internal account
    to an external bank account.
    """
    if internal_account.has_at_least(amount):
        internal_account.reduce_balance(amount)
        send_money(amount, external_account)
    else:
        raise Exception('Insufficient funds.')
Account balances
Customers
Accounts system: version 1
How concurrency breaks stuff
Step | Worker 1 | Worker 2 | Source balance |
---|---|---|---|
1 | Check balance >= £100 | | £100 |
2 | | Check balance >= £50 | £100 |
3 | Reduce balance by £100 | | £0 |
4 | | Reduce balance by £50 | -£50 |
Both checks pass before either update runs:
if internal_account.has_at_least(amount):
    internal_account.reduce_balance(amount)
    send_money(amount, external_account)
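The unlucky ordering can be replayed without a database. The sketch below is a minimal in-memory illustration: the `Account` class and the `interleaved_withdrawals` helper are hypothetical stand-ins, not part of the talk's code, and the two workers are simulated by running both checks before either update.

```python
class Account:
    """In-memory stand-in for a row in the accounts table (hypothetical)."""

    def __init__(self, balance):
        self.balance = balance

    def has_at_least(self, amount):
        return self.balance >= amount

    def reduce_balance(self, amount):
        self.balance -= amount


def interleaved_withdrawals(account, first_amount, second_amount):
    """Replay the unlucky ordering: both checks run before either update."""
    first_ok = account.has_at_least(first_amount)    # Worker 1 checks
    second_ok = account.has_at_least(second_amount)  # Worker 2 checks
    if first_ok:
        account.reduce_balance(first_amount)         # Worker 1 updates
    if second_ok:
        account.reduce_balance(second_amount)        # Worker 2 updates
    return account.balance


final_balance = interleaved_withdrawals(Account(100), 100, 50)
print(final_balance)  # -50
```

Each check is correct in isolation; the overdraft comes only from the gap between checking and updating.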
So, how do we
avoid this?
Database isolation levels
SERIALIZABLE (strict, accurate)
REPEATABLE READ (MySQL default)
READ COMMITTED (PostgreSQL default)
READ UNCOMMITTED (permissive, fast)
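One way to find out which level a connection is actually using is to ask the database. The sketch below is an assumption-laden helper, not something from the talk: `current_isolation_level` and `ISOLATION_QUERIES` are hypothetical names, the function takes any DB-API style cursor, and the MySQL query assumes a version that exposes `@@transaction_isolation` (older servers and MariaDB may use a different variable name).

```python
# Vendor-specific queries that report the session's isolation level.
ISOLATION_QUERIES = {
    "postgresql": "SHOW transaction_isolation",
    "mysql": "SELECT @@transaction_isolation",
}


def current_isolation_level(cursor, vendor):
    """Return the isolation level as an upper-case name, e.g. 'READ COMMITTED'."""
    cursor.execute(ISOLATION_QUERIES[vendor])
    (level,) = cursor.fetchone()
    # PostgreSQL reports 'read committed', MySQL 'READ-COMMITTED'.
    return level.upper().replace("-", " ")
```

In a Django project this could be called as `current_isolation_level(cursor, connection.vendor)` inside `with connection.cursor() as cursor:`, where `connection` comes from `django.db`.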
Database transactions
Transactions are a way of wrapping queries up into discrete blocks
[Diagram: three queries grouped inside a single transaction]
Concurrency: reading
Isolation mode: READ COMMITTED
Transaction A: SET value = 1 (not yet committed)
Transaction B: SELECT value
id | value |
---|---|
1 | 0 (1 pending in A) |
What happens?
Read committed - reading
Records from other sessions become visible as soon as they are committed, and not before.
[Animation: a transaction runs SET value = 1; other sessions read 0 until the commit, then 1]
Concurrency: reading
Isolation mode: READ COMMITTED
Transaction A: SET value = 1
Transaction B: SELECT value
Event | Committed value |
---|---|
Start | 0 |
A runs SET value = 1 | 0 |
A commits | 1 |
Transaction B reads value 0 immediately.
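The visibility rule can be modelled in a few lines. This is a toy model for illustration only, not how any database implements READ COMMITTED: the `Row` and `Transaction` classes are hypothetical, and a write simply stays private to its transaction until `commit()` is called.

```python
class Row:
    """A single committed value, as other sessions see it."""

    def __init__(self, value):
        self.committed = value


class Transaction:
    """Toy transaction: writes stay private until commit()."""

    def __init__(self, row):
        self.row = row
        self.pending = None
        self.has_pending = False

    def set_value(self, value):
        self.pending = value
        self.has_pending = True

    def select_value(self):
        # A transaction sees its own uncommitted write, otherwise the
        # latest committed value.
        return self.pending if self.has_pending else self.row.committed

    def commit(self):
        if self.has_pending:
            self.row.committed = self.pending
            self.has_pending = False


row = Row(0)
a, b = Transaction(row), Transaction(row)
a.set_value(1)
before_commit = b.select_value()  # B does not wait and still sees 0
a.commit()
after_commit = b.select_value()   # the committed write is now visible
print(before_commit, after_commit)  # 0 1
```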
Concurrency: writing
Isolation mode: READ COMMITTED
Transaction A (writer): SET value = 1
Transaction B (writer): SET value = 2
id | value |
---|---|
1 | 0, then ? |
What happens?
Read committed - writing
Records that a session is writing to are marked as read only for other sessions immediately.
Other writers wait until the lock is released, which happens when the writing transaction commits or rolls back.
Event | value | read only |
---|---|---|
Start | 0 | NO |
Transaction runs SET value = 1 | 0 | YES |
Transaction commits | 1 | NO |
Concurrency: writing
Isolation mode: READ COMMITTED
Transaction A (writer): SET value = 1
Transaction B (writer): SET value = 2
Event | value | read only |
---|---|---|
Start | 0 | NO |
A runs SET value = 1 | 0 | YES |
A commits | 1 | NO |
B runs SET value = 2 (after waiting) | 1 | YES |
B commits | 2 | NO |
B waits until A commits, then sets value to 2.
Select for update
Isolation mode: READ COMMITTED
SELECT FOR UPDATE is a way of making a read query behave like a write query.
Event | value | read only |
---|---|---|
Start | 0 | NO |
A runs SELECT FOR UPDATE | 0 | YES |
A runs SET value = 1 | 0 | YES |
A commits | 1 | NO |
B's SELECT FOR UPDATE returns (after waiting) | 1 | YES |
B runs SET value = 2 and commits | 2 | NO |
How can Darren protect against concurrent withdrawals?
Solution: Pessimistic locking
Step | Worker 1 | Worker 2 | Source balance |
---|---|---|---|
1 | SELECT FOR UPDATE | | £100 |
2 | | SELECT FOR UPDATE (waits until worker 1 commits) | £100 |
3 | Check balance >= £100 | | £100 |
4 | Reduce balance by £100 | | £0 |
5 | | Check balance >= £50 | £0 |
6 | | Insufficient funds! | £0 |
Internal Account |
---|
22 |
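The same sequence can be rehearsed with threads, using a `threading.Lock` to play the role of the row lock. This is an analogy sketched under stated assumptions rather than database code: the `Account` class, the `withdraw` signature and the `lock_held` event are all hypothetical, and holding the lock stands in for the time between SELECT FOR UPDATE and commit.

```python
import threading


class Account:
    """In-memory stand-in for an account row (hypothetical)."""

    def __init__(self, balance):
        self.balance = balance
        self.row_lock = threading.Lock()  # plays the role of the row lock


def withdraw(account, amount, outcomes, worker, lock_held=None):
    # Acquiring the lock mirrors SELECT FOR UPDATE; leaving the block
    # mirrors the commit that releases it.
    with account.row_lock:
        if lock_held is not None:
            lock_held.set()  # tell the main thread the lock is taken
        if account.balance >= amount:
            account.balance -= amount
            outcomes[worker] = "withdrawn"
        else:
            outcomes[worker] = "insufficient funds"


account = Account(100)
outcomes = {}
lock_held = threading.Event()

worker_1 = threading.Thread(
    target=withdraw, args=(account, 100, outcomes, "worker 1", lock_held)
)
worker_2 = threading.Thread(
    target=withdraw, args=(account, 50, outcomes, "worker 2")
)

worker_1.start()
lock_held.wait()  # make sure worker 1 takes the lock first
worker_2.start()  # cannot check the balance until worker 1 has released it
worker_1.join()
worker_2.join()

print(account.balance, outcomes)
```

Because the check and the update happen while the lock is held, worker 2 always sees the balance left behind by worker 1, and the balance never goes negative.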
Pessimistic locking in Django
from django.db import transaction

from .models import InternalAccount


def withdraw(amount, internal_account, external_account):
    # 1. Wrap in a database transaction
    with transaction.atomic():
        # 2. Wait for a lock on the source account, and re-fetch it so the
        # balance check below sees the latest committed data
        internal_account = InternalAccount.objects.select_for_update().get(
            id=internal_account.id
        )
        if internal_account.has_at_least(amount):
            internal_account.reduce_balance(amount)
            send_money(amount, external_account)
        else:
            raise Exception('Insufficient funds.')
Select for updates must be wrapped in an atomic transaction.
See it in action
Terminal A
$ ./manage.py shell_plus
>>> transaction.set_autocommit(False)
>>> account = Account.objects\
...     .select_for_update().first()
(Terminal A now holds the lock while Terminal B runs the same commands)
>>> transaction.commit()
Terminal B
$ ./manage.py shell_plus
>>> transaction.set_autocommit(False)
>>> account = Account.objects\
...     .select_for_update().first()
(HANGS...)
>>> (Prompt returns once other session commits)
Bonus:
Common pitfalls
Pitfall 1
def transfer(source, destination, amount):
    """Make a transfer between two InternalAccounts
    of the supplied amount.
    """
    with transaction.atomic():
        # Wait for a lock on the source and destination accounts
        InternalAccount.objects.select_for_update().filter(
            id__in=[source.id, destination.id]
        ).order_by("id")
        # etc...
Spot the bug
Pitfall 1 - lazy select_for_update
Account.objects.select_for_update().filter(
    id__in=[source_id, destination_id]
)
Querysets are lazy!
In this case, the SELECT FOR UPDATE query is never run, so no lock is taken.
bool(
    Account.objects.select_for_update().filter(
        id__in=[source_id, destination_id]
    )
)
Solution: if you don't evaluate a select_for_update() queryset built with filter() straight away, wrap it in bool() to force the query to run.
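The laziness is easy to see with a toy class. The `LazyQuery` below is a hypothetical stand-in that only imitates the relevant behaviour (it is not Django's QuerySet implementation): building the query records nothing, and the query is only "run" when the object is evaluated.

```python
class LazyQuery:
    """Toy stand-in for a lazy queryset (not Django's implementation)."""

    def __init__(self, executed):
        self.executed = executed  # shared list recording queries that ran

    def filter(self, **conditions):
        # Building the query only returns another lazy object.
        return LazyQuery(self.executed)

    def _run(self):
        self.executed.append("SELECT ... FOR UPDATE")
        return []  # pretend no rows matched

    def __bool__(self):
        return bool(self._run())

    def __iter__(self):
        return iter(self._run())


executed = []

LazyQuery(executed).filter(id__in=[1, 2])        # never evaluated
ran_without_evaluation = list(executed)

bool(LazyQuery(executed).filter(id__in=[1, 2]))  # forces evaluation
ran_after_bool = list(executed)

print(ran_without_evaluation, ran_after_bool)
```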
Pitfall 2 - Deadlocks
ERROR: deadlock detected
Detail:
Process 13560 waits for ShareLock on transaction 3147316424;
blocked by process 13566.
Process 13566 waits for ShareLock on transaction 3147316408;
blocked by process 13560.
What the...?
How a deadlock happens
Process 1
ids = [1, 2]
bool(
    MyModel.objects\
        .select_for_update()\
        .filter(id__in=ids)
)
Process 2
ids = [2, 1]
bool(
    MyModel.objects\
        .select_for_update()\
        .filter(id__in=ids)
)
Each process locks one row, then waits for the row the other holds:
id | read only |
---|---|
1 | YES |
2 | YES |
Waiting for each other
Preventing deadlocks
Process 1
ids = [1, 2]
bool(
    MyModel.objects\
        .select_for_update()\
        .filter(id__in=ids)\
        .order_by('id')
)
Process 2
ids = [2, 1]
bool(
    MyModel.objects\
        .select_for_update()\
        .filter(id__in=ids)\
        .order_by('id')
)
Solution: when using select_for_update on multiple records, make sure you acquire the locks in a consistent order.
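The consistent-order rule is not specific to databases. As a rough analogy, the sketch below uses two `threading.Lock` objects in place of row locks; `row_locks` and `lock_rows` are hypothetical names. Each worker is handed its ids in a different order but sorts them before locking, so neither can end up holding one lock while waiting for the other.

```python
import threading

# One lock per row id, standing in for the database's row locks.
row_locks = {1: threading.Lock(), 2: threading.Lock()}


def lock_rows(ids, finished, worker):
    ordered = sorted(ids)  # every worker acquires locks in the same order
    for row_id in ordered:
        row_locks[row_id].acquire()
    try:
        finished.append(worker)  # the work done while holding both locks
    finally:
        for row_id in reversed(ordered):
            row_locks[row_id].release()


finished = []
workers = [
    threading.Thread(target=lock_rows, args=([1, 2], finished, "process 1")),
    threading.Thread(target=lock_rows, args=([2, 1], finished, "process 2")),
]
for thread in workers:
    thread.start()
for thread in workers:
    thread.join(timeout=5)

print(sorted(finished))
```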
Pitfall 3 - Testing
Our tests are wrapped in transactions by default.
Why might this be a problem?
Unwrapped SELECT FOR UPDATEs will pass tests but error in production!
@requires_db(transaction=True)
Does not wrap your test in a transaction. Slower, but better for code where you need to test behaviour relating to transactions.
Summary
- Database concurrency is something you need to think about.
- Make sure you know what isolation level your database is using, and how concurrent reading and writing is handled.
- Select for update makes a read query behave like a write.
- Pessimistic locking is a simple way to make your code wait until it's safe.
David Seddon http://seddonym.me
These slides:
https://slides.com/davidseddon/managing-database-concurrency-with-django
Appendix
ATOMIC_REQUESTS
# settings.py, after DATABASES is defined (the setting is per database)
DATABASES["default"]["ATOMIC_REQUESTS"] = True
Each view is wrapped in a database transaction.
If the view raises an exception, the transaction is rolled back.
Savepoints
Savepoints allow you to roll back within transactions.
[Diagram: a transaction containing queries, a savepoint, and a rollback to that savepoint]
Nested atomic blocks
[Diagram: a transaction containing queries and three savepoints]
with transaction.atomic():
    foo()
    with transaction.atomic():
        bar()
        with transaction.atomic():
            baz()
            with transaction.atomic():
                foobar()
Nested atomic blocks
with transaction.atomic():
    foo()  # Will be committed
    try:
        with transaction.atomic():
            bar()  # Will be rolled back
            raise Exception
    except Exception:
        pass
    baz()  # Will be committed
Exceptions raised within an atomic block will roll back that atomic block.
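The same commit and rollback pattern can be reproduced with raw savepoints. The sketch below uses Python's built-in `sqlite3` module rather than Django, purely so that it runs anywhere; the `log` table and the `inner_block` savepoint name are made up for the example, and the comments map each statement to the foo, bar and baz steps of the nested atomic blocks.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so BEGIN,
# SAVEPOINT and COMMIT can be issued explicitly.
connection = sqlite3.connect(":memory:", isolation_level=None)
connection.execute("CREATE TABLE log (name TEXT)")

connection.execute("BEGIN")                               # outer atomic block
connection.execute("INSERT INTO log VALUES ('foo')")      # foo()
connection.execute("SAVEPOINT inner_block")               # inner atomic block
connection.execute("INSERT INTO log VALUES ('bar')")      # bar()
connection.execute("ROLLBACK TO SAVEPOINT inner_block")   # exception raised
connection.execute("RELEASE SAVEPOINT inner_block")
connection.execute("INSERT INTO log VALUES ('baz')")      # baz()
connection.execute("COMMIT")

names = [
    name
    for (name,) in connection.execute("SELECT name FROM log ORDER BY rowid")
]
print(names)  # ['foo', 'baz']
```

Only the work done after the savepoint is undone; everything else in the outer transaction is committed as usual.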