How To Do Two Things At Once

Managing Database Concurrency
With Django

 

 

David Seddon

david@seddonym.me

http://seddonym.me

Concurrent connections

table_one





 
table_two





 
table_three





 

Databases allow many processes at once to modify their data

Connection

Connection

Connection

Connection

Poll

 

How does Kraken handle database concurrency?

Database concurrency is something you need to think about.

Darren Sedwerb

Developer at Bankity Bank

The code (version 1)

def withdraw(amount, internal_account, external_account):
    """Withdraw money from the internal account
    to an external bank account.
    """

    if internal_account.has_at_least(amount):
         internal_account.reduce_balance(amount)
         send_money(amount, external_account)
    else:
        raise Exception('Insufficient funds.')
        

Account balances

Customers

Accounts system: version 1

The code (version 1)

def withdraw(amount, internal_account, external_account):
    """Withdraw money from the internal account
    to an external bank account.
    """

    if internal_account.has_at_least(amount):
         internal_account.reduce_balance(amount)
         send_money(amount, external_account)
    else:
        raise Exception('Insufficient funds.')
        

How concurrency breaks stuff 

Worker 1

Worker 2

Source balance
£100





 
£0
 
-£50
 

Check balance >= £100

Check balance >= £50

Reduce balance by £100

Reduce balance by £50

if internal_account.has_at_least(amount):
     internal_account.reduce_balance(amount)
     send_money(amount, external_account)

So, how do we
avoid this?

Database isolation levels

SERIALIZABLE

 

REPEATABLE READ (MySQL default)

 

READ COMMITTED (PostgreSQL default)

 

READ UNCOMMITED

strict, accurate

permissive, fast

Database transactions

Transactions are a way of wrapping queries up into discrete blocks

Query

Query

Query

Transaction

Concurrency: reading

Transaction A

Transaction B

id value
1 0

(1)

?

SET value = 1

SELECT value

What happens?

Isolation mode: READ COMMITTED

Read committed - reading

Records from other sessions will become visible as they are committed

Transaction

id value
1 0

0

1

SET value = 1

id value
1 1

Concurrency: reading

Transaction A

Transaction B

id value
1 0

(1)

?

SET value = 1

SELECT value

What happens?

Isolation mode: READ COMMITTED

Concurrency: reading

Transaction A

Transaction B

id value
1 0

0


1

SET value = 1

SELECT value

Transaction B reads value 0 immediately.

 

Isolation mode: READ COMMITTED

Concurrency: writing

Transaction A (writer)

Transaction B (writer)

id value
1 0

0

?

?
?
 

SET value = 1

SET value = 2

Isolation mode: READ COMMITTED

What happens?

Read committed - writing

Records that sessions are writing to are marked as read only immediately.

Other writers will wait until the lock is released.

Transaction

id value read only
1 0

0

1
NO

YES

NO

SET value = 1

id value
1 1

Concurrency: writing

Transaction A (writer)

Transaction B (writer)

id value
1 0

0

?

?
?
 

SET value = 1

SET value = 2

Isolation mode: READ COMMITTED

What happens?

Concurrency: writing

Transaction A (writer)

Transaction B (writer)

id value read only
1 0

0




1

1

2
NO

YES




NO

YES

NO

SET value = 1

SET value = 2

 

 

Isolation mode: READ COMMITTED

B waits until A commits, then sets value to 2.

Select for update

Transaction A (writer)

Transaction B (writer)

id value read only
1 0
0


0

1


1
2
NO
YES


YES

NO


YES
NO

SET value = 1

SET value = 2

Isolation mode: READ COMMITTED

SELECT FOR UPDATE is a way of making
a read query behave like a write query.

SELECT FOR UPDATE

SELECT FOR UPDATE

 

 

 

How can Darren protect against concurrent withdrawals?

Solution: Pessimistic locking

Source balance
£100
 
£0



SELECT FOR UPDATE

Check balance >= £50

Reduce balance by £100

Worker 1

Worker 2

Check balance >= £100

Internal Account
22
 

 

SELECT FOR UPDATE

 

Insufficient funds!

(waits until worker 1 commits)

from django.db import transaction
from .models import InternalAccount


def withdraw(amount, internal_account, external_account):

    # Wrap in a database transaction
    with transaction.atomic():

        # Wait for a lock on the source account
        InternalAccount.objects.select_for_update().get(
            id=internal_account.id
        )
    
        if internal_account.has_at_least(amount):
             internal_account.reduce_balance(amount)
             send_money(amount, external_account)
        else:
            raise Exception('Insufficient funds.')
        

Pessimistic locking in Django

1

2

Select for updates must be wrapped in an atomic transaction.

See it in action

$ ./manage.py shell_plus

>>> transaction.set_autocommit(False)

>>> account = Account.objects\
.select_for_update().first()
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> transaction.commit()

Terminal A

$ ./manage.py shell_plus

>>> transaction.set_autocommit(False)




>>> account = Account.objects\
.select_for_update().first()


(HANGS...)


>>> (Prompt returns once other
session commits)

Terminal B

Bonus:

Common pitfalls

Pitfall 1


def transfer(source, destination, amount):
    """Make a transfer between two InternalAccounts
    of the supplied amount.
    """

    with transaction.atomic():

        # Wait for a lock on the source and destination accounts
        InternalAccount.objects.select_for_update().filter(
            id__in=[source.id, destination.id]
        ).order_by("id")

        # etc...

Spot the bug

Pitfall 1 - lazy select_for_update

Account.objects.select_for_update().filter(
    id__in=[source_id, destination_id]
)

Querysets are lazy!

In this case, the select_for_update will never be run.

bool(Account.objects.select_for_update().filter(
    id__in=[source_id, destination_id])
)

Solution: wrap select_for_updates that use filter in a bool if you don't evaluate them straight away.

Pitfall 2 - Deadlocks

ERROR: deadlock detected
  Detail:
     Process 13560 waits for ShareLock on transaction 3147316424;
         blocked by process 13566.
     Process 13566 waits for ShareLock on transaction 3147316408;
         blocked by process 13560.

What the...?

How a deadlock happens

ids = [1, 2]
bool(
  MyModel.objects\
    .select_for_update()\
    .filter(id__in=ids)
)

Process 1

Process 2

id read only
1 YES
2 YES
id read only
1 YES
2 YES

Waiting for each other

ids = [2, 1]
bool(
  MyModel.objects\
    .select_for_update()\
    .filter(id__in=ids)
)

Preventing deadlocks

ids = [1, 2]
bool(
  MyModel.objects\
    .select_for_update()\
    .filter(id__in=ids).
    .order_by('id')
)

Process 1

Process 2

ids = [2, 1]
bool(
  MyModel.objects\
    .select_for_update()\
    .filter(id__in=ids).
    .order_by('id')
)

Solution: when using select_for_updates on multiple records, make sure you acquire the locks in a consistent order.

Pitfall 3 - Testing

 

Our tests are wrapped in transactions by default.

 

Why might this be a problem?

 

Does not wrap your test in a transaction.  Slower, but better for code where you need to test behaviour relating to transactions.

@requires_db(transaction=True)

 

Unwrapped SELECT FOR UPDATEs will pass tests
but error in production!

Summary

 

  • Database concurrency is something you need to think about.
     
  • Make sure you know what isolation level your database is using, and how concurrent reading and writing is handled.
     
  • Select for update makes a read query behave like a write.
     
  • Pessimistic locking is a simple way to make your code wait until it's safe.

David Seddon   http://seddonym.me

These slides:

https://slides.com/davidseddon/managing-database-concurrency-with-django

Appendix

ATOMIC_REQUESTS
# settings.py

ATOMIC_REQUESTS = True

The request/response cycle will be wrapped in a database transaction.

 

If an exception is raised, the
transaction is rolled back.

Savepoints

Savepoints allow you to roll back
within transactions

Query

Transaction

Query

> Savepoint

Query

Query

> Rollback

Nested atomic blocks

Transaction

Query

> Savepoint

Query

Query

> Savepoint

Query

> Savepoint

Query

Query

with transaction.atomic():
    foo()
    
    with transaction.atomic():
       
        bar()
    





    with transaction.atomic():
       
        baz()


        with transaction.atomic():

            foobar()

Nested atomic blocks

with transaction.atomic():
    
    foo()  - Will be committed
    
    try:
        with transaction.atomic():
            bar()  - Will be rolled back
            raise Exception
    except:
        pass

    baz()  - Will be committed

Exceptions raised within an atomic block will roll back that atomic block.

Tech Talk: How to do two things at once

By David Seddon

Tech Talk: How to do two things at once

Two key concepts you can't afford to ignore

  • 600