David Seddon
david@seddonym.me
http://seddonym.me
We're hiring!
https://www.growthstreet.co.uk/careers
Concurrent connections
table_one |
---|
|
table_two |
---|
|
table_three |
---|
|
Databases allow many processes at once to modify their data
Transaction
Transaction
Transaction
Transaction
How does your software handle
database concurrency?
a. The database does it automatically
b. I write code to handle it
c. Not applicable (my software has no concurrency)
d. Er, I have no idea
The code (version 1)
def withdraw(amount, internal_account, external_account):
"Withdraw money from the internal account to an external bank account."
if internal_account.has_at_least(amount):
internal_account.reduce_balance(amount)
send_money(amount, external_account)
else:
raise Exception('Insufficient funds.')
Account balances
Users
Accounts system: version 1
How concurrency breaks stuff
Worker 1
Worker 2
Source balance |
---|
£100 |
£0 |
-£50 |
Check balance >= £100
Check balance >= £50
Reduce balance by £100
Reduce balance by £50
if internal_account.has_at_least(amount):
internal_account.reduce_balance(amount)
send_money(amount, external_account)
Database isolation levels
SERIALIZABLE
REPEATABLE READ (MySQL default)
READ COMMITTED (PostgreSQL default)
READ UNCOMMITED
strict, accurate
permissive, fast
Transactions are a way of wrapping queries up into discrete blocks
Query
Query
Query
Transaction
Concurrency: reading
Transaction A
Transaction B
id | value |
---|---|
1 | 0 (1) ? |
SET value = 1
SELECT value
What happens?
Isolation mode: READ COMMITTED
Read committed - reading
Records from other sessions will become visible as they are committed
Transaction
id | value |
---|---|
1 | 0 0 1 |
SET value = 1
id | value |
---|---|
1 | 1 |
Concurrency: reading
Transaction A
Transaction B
id | value |
---|---|
1 | 0 (1) ? |
SET value = 1
SELECT value
What happens?
Isolation mode: READ COMMITTED
Concurrency: reading
Transaction A
Transaction B
id | value |
---|---|
1 | 0 0 1 |
SET value = 1
SELECT value
Transaction B reads value 0 straight away.
Isolation mode: READ COMMITTED
Concurrency: writing
Transaction A (writer)
Transaction B (writer)
id | value |
---|---|
1 | 0 0 ? ? ? |
SET value = 1
SET value = 2
Isolation mode: READ COMMITTED
What happens?
Read committed - writing
Records that sessions are writing to are marked as read only immediately.
Other writers will wait until the lock is released.
Transaction
id | value | read only |
---|---|---|
1 | 0 0 1 |
NO YES NO |
SET value = 1
id | value |
---|---|
1 | 1 |
Concurrency: writing
Transaction A (writer)
Transaction B (writer)
id | value |
---|---|
1 | 0 0 ? ? ? |
SET value = 1
SET value = 2
Isolation mode: READ COMMITTED
What happens?
Concurrency: writing
Transaction A (writer)
Transaction B (writer)
id | value | read only |
---|---|---|
1 | 0 0 1 1 2 |
NO YES NO YES NO |
SET value = 1
SET value = 2
Isolation mode: READ COMMITTED
B waits until A commits, then sets value to 2.
Select for update
Transaction A (writer)
Transaction B (writer)
id | value | read only |
---|---|---|
1 | 0 0 0 1 1 2 |
NO YES YES NO YES NO |
SET value = 1
SET value = 2
Isolation mode: READ COMMITTED
SELECT FOR UPDATE is a way of making
a read query behave like a write query.
SELECT FOR UPDATE
SELECT FOR UPDATE
How can Darren protect against concurrent withdrawals?
Solution: Pessimistic locking
Source balance |
---|
£100 |
£0 |
SELECT FOR UPDATE
Check balance >= £50
Reduce balance by £100
Worker 1
Worker 2
Check balance >= £100
Internal Account |
---|
22 |
SELECT FOR UPDATE
Insufficient funds!
(waits until worker 1 commits)
from django.db import transaction
from .models import InternalAccount
def withdraw(amount, internal_account, external_account):
"Withdraw money from the internal account to an external bank account."
# Wrap in a database transaction
with transaction.atomic():
# Wait for a lock on the source account
InternalAccount.objects.select_for_update().get(id=internal_account.id)
if internal_account.has_at_least(amount):
internal_account.reduce_balance(amount)
send_money(amount, external_account)
else:
raise Exception('Insufficient funds.')
Pessimistic locking in Django
1
2
Select for updates must be wrapped in an atomic transaction.
David Seddon http://seddonym.me
These slides: https://slides.com/davidseddon/db-concurrency
Growth Street - we're hiring:
https://www.growthstreet.co.uk/careers
Pitfall 1
def transfer(source, destination, amount):
"Make a transfer between two InternalAccounts of the supplied amount."
with transaction.atomic():
# Wait for a lock on the source and destination accounts
InternalAccount.objects.select_for_update().filter(id__in=[source.id, destination.id])
# etc...
Spot the bug
Pitfall 1 - lazy select_for_update
Account.objects.select_for_update().filter(id__in=[source_id, destination_id])
Querysets are lazy!
In this case, the select_for_update will never be run.
bool(Account.objects.select_for_update().filter(id__in=[source_id, destination_id]))
Solution: wrap select_for_updates that use filter in a bool if you don't evaluate them straight away.
ERROR: deadlock detected
Detail:
Process 13560 waits for ShareLock on transaction 3147316424; blocked by process 13566.
Process 13566 waits for ShareLock on transaction 3147316408; blocked by process 13560.
What the...?
ids = [1, 2]
bool(
MyModel.objects\
.select_for_update()\
.filter(id__in=ids)
)
Process 1
Process 2
id | read only |
---|---|
1 | YES |
2 | YES |
id | read only |
---|---|
1 | YES |
2 | YES |
Waiting for each other
ids = [2, 1]
bool(
MyModel.objects\
.select_for_update()\
.filter(id__in=ids)
)
ids = [1, 2]
bool(
MyModel.objects\
.select_for_update()\
.filter(id__in=ids).
.order_by('id')
)
Process 1
Process 2
ids = [2, 1]
bool(
MyModel.objects\
.select_for_update()\
.filter(id__in=ids).
.order_by('id')
)
Solution: when using select_for_updates on multiple records, make sure you acquire the locks in a consistent order.
This wraps your test in a transaction. This can be a problem if you are testing something that doesn't run in a transaction in real life.
Example: a select_for_update included in a celery task that is not wrapped in a transaction, will pass tests but fail when it runs in production.
django.test.TestCase
Does not wrap your test in a transaction. Slower, but better for code where you need to test behaviour relating to transactions.
django.test.TransactionTestCase
ATOMIC_REQUESTS
# settings.py
ATOMIC_REQUESTS = True
The request/response cycle will be wrapped in a database transaction.
If an exception is raised, the
transaction is rolled back.
Savepoints allow you to roll back
within transactions
Query
Transaction
Query
> Savepoint
Query
Query
> Rollback
Transaction
Query
> Savepoint
Query
Query
> Savepoint
Query
> Savepoint
Query
Query
with transaction.atomic():
foo()
with transaction.atomic():
bar()
with transaction.atomic():
baz()
with transaction.atomic():
foobar()
with transaction.atomic():
foo() - Will be committed
try:
with transaction.atomic():
bar() - Will be rolled back
raise Exception
except:
pass
baz() - Will be committed
Exceptions raised within an atomic block will roll back that atomic block.
Both entries must be added, or neither.
If only one entry is added, bad things happen.
Our example accounting system
Amount |
---|
-£50 |
Source ledger
Amount |
---|
+£50 |
Destination ledger