Caching

What?

To cache something is to save the result of an expensive calculation so that you don’t have to perform the calculation next time.

Why?

Cache hits are served by reading data from the cache, which is faster than recomputing a result or reading from a slower data store; thus, the more requests can be served from the cache, the faster the system performs.

How?

given a URL, try finding that page in the cache
if the page is in the cache:
    return the cached page
else:
    generate the page
    save the generated page in the cache (for next time)
    return the generated page
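
The pseudocode above can be sketched in plain Python. This is a minimal illustration, not Django code: the module-level dictionary, `get_page` and `render` are hypothetical names chosen for the example.

```python
from typing import Callable, Dict

# Hypothetical in-memory page cache: maps a URL to its rendered page.
_page_cache: Dict[str, str] = {}


def get_page(url: str, generate: Callable[[str], str]) -> str:
    """Return the page for `url`, generating it only on a cache miss."""
    if url in _page_cache:
        return _page_cache[url]      # cache hit
    page = generate(url)             # cache miss: do the expensive work
    _page_cache[url] = page          # save it for next time
    return page


calls = []  # records how often the expensive function really runs


def render(url: str) -> str:
    calls.append(url)
    return f"<html>page for {url}</html>"


first = get_page("/about/", render)
second = get_page("/about/", render)  # served from the cache
```

After both calls, `render` has run only once, because the second request is answered from the dictionary.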

Caching is Everywhere

  • CPU
  • database
  • HTTP server
  • browser

HTTP caching headers

  • Cache-Control
  • Expires
  • Vary
  • ETag
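
To make the four headers concrete, here is a small sketch that builds them for a response body using only the standard library. The function name `caching_headers`, the chosen header values and the use of a truncated SHA-256 digest as the ETag are assumptions made for illustration; real servers and frameworks pick their own values.

```python
import hashlib
from email.utils import formatdate


def caching_headers(body: bytes, max_age: int, now: float) -> dict:
    """Build illustrative caching headers for a response body."""
    return {
        # How long, and by whom, the response may be cached.
        "Cache-Control": f"public, max-age={max_age}",
        # Absolute expiry time (the older alternative to max-age).
        "Expires": formatdate(now + max_age, usegmt=True),
        # Request headers that a cache must include in its key.
        "Vary": "Accept-Encoding",
        # Validator derived from the content, used for conditional requests.
        "ETag": '"' + hashlib.sha256(body).hexdigest()[:16] + '"',
    }


headers = caching_headers(b"<html>hello</html>", max_age=3600, now=0.0)
```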

Caching at the web framework level

Examples of caching in Django

>>> entry = Entry.objects.get(id=1)
>>> entry.blog   # Blog object is retrieved at this point
>>> entry.blog   # cached version, no DB access
>>> entry = Entry.objects.get(id=1)
>>> entry.authors.all()   # query performed
>>> entry.authors.all()   # query performed again

Examples of caching in Django

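from django.contrib import auth
from django.utils.functional import SimpleLazyObject


def get_user(request):
    # Cache the user on the request so it is looked up at most once per request.
    if not hasattr(request, '_cached_user'):
        request._cached_user = auth.get_user(request)
    return request._cached_user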


class AuthenticationMiddleware(object):
    def process_request(self, request):
        assert hasattr(request, 'session'), (
            "The Django authentication middleware requires session middleware "
            "to be installed. Edit your MIDDLEWARE_CLASSES setting to insert "
            "'django.contrib.sessions.middleware.SessionMiddleware' before "
            "'django.contrib.auth.middleware.AuthenticationMiddleware'."
        )
        request.user = SimpleLazyObject(lambda: get_user(request))

@cached_property

# the model
class Person(models.Model):

    def friends(self):
        # expensive computation
        ...
        return friends

# in the view:
if person.friends():
    ...

# in the template
{% for friend in person.friends %}

# 2x calculations

@cached_property

from django.utils.functional import cached_property

@cached_property
def friends(self):
    # expensive computation
    ...
    return friends

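# in the view (friends is now an attribute, so no call parentheses):
if person.friends:
    ...

# in the template
{% for friend in person.friends %}

# 1x calculation

# to force recomputation, delete the cached value:
# del person.friends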
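
The same behaviour can be tried without a Django project. The sketch below uses `functools.cached_property` from the standard library (Python 3.8+) as a stand-in for Django's decorator; the `Person` class and its `computations` counter are hypothetical and exist only to show when the expensive code runs.

```python
from functools import cached_property  # stdlib analogue of Django's decorator


class Person:
    def __init__(self, name):
        self.name = name
        self.computations = 0  # counts how often the expensive code runs

    @cached_property
    def friends(self):
        self.computations += 1
        return [f"friend of {self.name}"]


person = Person("Ann")
person.friends                      # computed on first access
person.friends                      # read from the instance, not recomputed
count_before_reset = person.computations

del person.friends                  # drop the cached value
person.friends                      # computed again
count_after_reset = person.computations
```

The value is stored on the instance, so it lives exactly as long as the object does and is never shared between instances.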

Caching static files

  1. Set expiry headers
  2. Use ManifestStaticFilesStorage or CachedStaticFilesStorage

STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.ManifestStaticFilesStorage'

  3. Run collectstatic

css/styles.css -> css/styles.55e7cbb9ba48.css
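
The renaming shown above works because the hash in the file name changes whenever the file content changes, so a far-future expiry header is safe. The sketch below illustrates that idea only; `hashed_name` is a hypothetical helper, and the choice of SHA-256 truncated to 12 characters is an assumption, not a description of what Django's storage classes do internally.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(path: str, content: bytes) -> str:
    """Insert a 12-character content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


# Two versions of the same stylesheet get two different URLs.
v1 = hashed_name("css/styles.css", b"body { color: black; }")
v2 = hashed_name("css/styles.css", b"body { color: navy; }")
```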

Django’s cache framework

Backends

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
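        'TIMEOUT': 60,  # default expiry for keys, in seconds
        'OPTIONS': {
            # Note: MAX_ENTRIES applies to the local-memory, filesystem
            # and database backends.
            'MAX_ENTRIES': 1000
        }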
    }
}

Backends

  • Memcached
  • Database caching
  • Filesystem caching
  • Local-memory caching
  • Dummy caching
  • Custom backends (e.g. django-redis-cache)

Per site cache

MIDDLEWARE = [
    'django.middleware.cache.UpdateCacheMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.cache.FetchFromCacheMiddleware',
]

FetchFromCacheMiddleware caches GET and HEAD responses with status 200, where the request and response headers allow. Responses to requests for the same URL with different query parameters are considered to be unique pages and are cached separately.
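
The per-site cache is tuned with a few additional settings. The fragment below is a sketch of a settings.py excerpt; the setting names are Django's, while the values are illustrative assumptions.

```python
# settings.py fragment (values are illustrative assumptions)
CACHE_MIDDLEWARE_ALIAS = "default"       # which entry of CACHES to use
CACHE_MIDDLEWARE_SECONDS = 600           # how long each page is cached
CACHE_MIDDLEWARE_KEY_PREFIX = "mysite"   # avoids key clashes between sites
```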

Per view cache

from django.conf.urls import url
from django.views.decorators.cache import cache_page

@cache_page(60 * 15)
def my_view(request):
    ...


urlpatterns = [
    url(r'^foo/([0-9]{1,2})/$', my_view),
]

The per-view cache, like the per-site cache, is keyed off of the URL. If multiple URLs point at the same view, each URL will be cached separately.

Vary header

from django.views.decorators.cache import cache_page
from django.views.decorators.vary import vary_on_cookie, vary_on_headers


@vary_on_headers('User-Agent')
def my_view(request):
    ...

@cache_page(60*60)
@vary_on_cookie
def expensive_func(request):
    ...


The Vary header defines which request headers a cache mechanism should take into account when building its cache key.
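
A rough sketch of that idea: the cache key combines the URL with the values of whichever request headers are named in Vary. The `cache_key` function and the sample header dictionaries are hypothetical, header names are assumed to be lower-cased already, and the real key format used by Django differs.

```python
import hashlib
from typing import Dict, Iterable


def cache_key(url: str, request_headers: Dict[str, str], vary: Iterable[str]) -> str:
    """Combine the URL with the values of the headers named in Vary."""
    parts = [url]
    for name in sorted(h.lower() for h in vary):
        value = request_headers.get(name, "")
        parts.append(f"{name}={value}")
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


firefox = {"user-agent": "Firefox", "accept-language": "en"}
chrome = {"user-agent": "Chrome", "accept-language": "en"}

# Without Vary, both browsers share one cache entry for the URL.
key_plain_a = cache_key("/foo/1/", firefox, vary=[])
key_plain_b = cache_key("/foo/1/", chrome, vary=[])

# With "Vary: User-Agent", each browser gets its own entry.
key_vary_a = cache_key("/foo/1/", firefox, vary=["User-Agent"])
key_vary_b = cache_key("/foo/1/", chrome, vary=["User-Agent"])
```

Varying on a header with many distinct values, such as User-Agent or Cookie, multiplies the number of cache entries and lowers the hit rate accordingly.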

Template fragment caching

{% load cache %}
{% cache 500 sidebar %}
    .. sidebar ..
{% endcache %}


{% load cache %}
{% cache 500 sidebar request.user.username %}
    .. sidebar for logged in user ..
{% endcache %}

Low level cache API

>>> from django.core.cache import cache  # equivalent to caches['default']

>>> cache.set('my_key', 'hello, world!', 30)
>>> cache.get('my_key')
'hello, world!'

>>> cache.set('add_key', 'Initial value')
>>> cache.add('add_key', 'New value')
>>> cache.get('add_key')
'Initial value'

>>> cache.get('my_new_key')  # returns None
>>> cache.get_or_set('my_new_key', 'my new value', 100)
'my new value'

The key should be a str (or unicode on Python 2), and the value can be any picklable Python object.

Low level cache API

>>> cache.set('a', 1)
>>> cache.set('b', 2)
>>> cache.set('c', 3)
>>> cache.get_many(['a', 'b', 'c'])
{'a': 1, 'b': 2, 'c': 3}


>>> cache.delete('a')

>>> cache.clear()

>>> cache.set('num', 1)
>>> cache.incr('num')
2
>>> cache.incr('num', 10)
12
>>> cache.decr('num')
11
>>> cache.decr('num', 5)
6
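
To experiment with these semantics outside a Django project, a few of the calls can be imitated with a small in-memory class. `TinyCache` is a hypothetical stand-in written for this example, not a Django backend. It simplifies several things: `timeout=None` means "never expire", and a stored `None` is indistinguishable from a miss.

```python
import time


class TinyCache:
    """Hypothetical in-memory cache imitating a few calls shown above."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp or None)

    def set(self, key, value, timeout=None):
        expires = None if timeout is None else time.monotonic() + timeout
        self._data[key] = (value, expires)

    def get(self, key, default=None):
        value, expires = self._data.get(key, (default, None))
        if expires is not None and time.monotonic() >= expires:
            del self._data[key]  # expired entries behave like misses
            return default
        return value

    def add(self, key, value, timeout=None):
        """Store the value only if the key is not already present."""
        if self.get(key) is not None:
            return False
        self.set(key, value, timeout)
        return True

    def incr(self, key, delta=1):
        """Add delta to an existing value, keeping its expiry time."""
        if self.get(key) is None:
            raise ValueError(f"Key {key!r} not found")
        value, expires = self._data[key]
        self._data[key] = (value + delta, expires)
        return value + delta


tiny = TinyCache()
tiny.set("add_key", "Initial value")
added = tiny.add("add_key", "New value")   # key exists, so nothing is stored
tiny.set("num", 1)
total = tiny.incr("num", 10)
```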

Controlling cache

from django.views.decorators.cache import cache_control, never_cache


@never_cache
def myview(request):
    ...


@cache_control(private=True)
def my_view(request):
    ...

Lower-level helpers live in django.utils.cache.

Third-party packages

django-cachalot

Django-cachalot is the perfect speedup tool for most Django projects. It will speed up a website with 100 000 visits per month without any problem. In fact, the more visitors you have, the faster the website becomes. That’s because every possible SQL query on the project ends up being cached.

However, it’s not suited for projects where there is a high number of modifications per minute on each table, like a social network with more than 50 messages per minute. Django-cachalot may still give a small speedup in such cases, but it may also slow things down a bit (in the worst-case scenario, a 20% slowdown). If you have a website like that, optimising your SQL database and queries is the number one thing you have to do.

django-cachalot

  • Saves in cache the results of any SQL query generated by the Django ORM that reads data. These saved results are then returned instead of executing the same SQL query, which is faster.
  • The first time a query is executed, it is about 10% slower; subsequent executions are much faster (7× faster on average).
  • Automatically invalidates saved results, so that you never get stale results.
  • Invalidates per table, not per object: if you change an object, all the queries done on other objects of the same model are also invalidated. It is unfortunately technically impossible to make a reliable per-object cache. Don’t be fooled by packages pretending to have that per-object feature; they are unreliable and dangerous for your data.
  • Handles everything in the ORM. You can use the most advanced features of the ORM without a single issue; django-cachalot is extremely robust.
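
Per-table invalidation can be pictured with a toy query cache. This is a sketch of the general idea only, not django-cachalot's implementation; `TableCache`, `run_query` and the SQL strings are hypothetical.

```python
from collections import defaultdict


class TableCache:
    """Hypothetical query cache that invalidates whole tables at once."""

    def __init__(self):
        self._results = {}                 # SQL text -> cached rows
        self._by_table = defaultdict(set)  # table name -> SQL texts reading it

    def fetch(self, sql, tables, run_query):
        if sql not in self._results:
            self._results[sql] = run_query(sql)
            for table in tables:
                self._by_table[table].add(sql)
        return self._results[sql]

    def invalidate(self, table):
        """Forget every cached query that read from `table`."""
        for sql in self._by_table.pop(table, set()):
            self._results.pop(sql, None)


executed = []  # records the queries that actually reach the "database"


def run_query(sql):
    executed.append(sql)
    return [("row for", sql)]


query_cache = TableCache()
query_cache.fetch("SELECT * FROM blog WHERE id = 1", ["blog"], run_query)
query_cache.fetch("SELECT * FROM blog WHERE id = 2", ["blog"], run_query)
query_cache.fetch("SELECT * FROM blog WHERE id = 1", ["blog"], run_query)  # hit
executed_before = len(executed)

query_cache.invalidate("blog")  # e.g. after saving blog 2
query_cache.fetch("SELECT * FROM blog WHERE id = 1", ["blog"], run_query)  # miss
executed_after = len(executed)
```

Changing one row discards the cached result for the other row too, which is the trade-off described above: invalidation stays simple and reliable, at the cost of more misses on tables that are written to frequently.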

django-cacheops

A slick app that supports automatic or manual queryset caching and automatic granular event-driven invalidation.

It uses Redis as the backend for the ORM cache, and Redis or the filesystem for the simple time-invalidated cache.

And there is more to it:

  • decorators to cache any user function or view as a queryset or by time
  • extensions for django and jinja2 templates to cache template fragments as querysets or by time
  • transparent transaction support
  • dog-pile prevention mechanism
  • a couple of hacks to make django faster

django-cacheops

CACHEOPS = {
    # Automatically cache any User.objects.get() calls for 15 minutes
    # This includes request.user or post.author access,
    # where Post.author is a foreign key to auth.User
    'auth.user': {'ops': 'get', 'timeout': 60*15},

    # Automatically cache all gets and queryset fetches
    # to other django.contrib.auth models for an hour
    'auth.*': {'ops': ('fetch', 'get'), 'timeout': 60*60},

    # Cache gets, fetches, counts and exists to Permission
    # 'all' is just an alias for ('get', 'fetch', 'count', 'exists')
    'auth.permission': {'ops': 'all', 'timeout': 60*60},

    # Enable manual caching on all other models with default timeout of an hour
    # Use Post.objects.cache().get(...)
    #  or Tags.objects.filter(...).order_by(...).cache()
    # to cache particular ORM request.
    # Invalidation is still automatic
    '*.*': {'ops': (), 'timeout': 60*60},

    # And since ops is empty by default, the last entry can equivalently be written as:
    # '*.*': {'timeout': 60*60},
}

django-cacheops

Caveats

  1. Conditions other than __exact, __in and __isnull=True don't make invalidation more granular.
  2. Conditions on TextFields, FileFields and BinaryFields don't either. One should not test them for equality anyway.
  3. Updating a "select_related" object does not invalidate the cache for the queryset. Use .prefetch_related() instead.
  4. Mass updates don't trigger invalidation by default. But see .invalidated_update().
  5. Sliced queries are invalidated as non-sliced ones.
  6. Doesn't work with .raw() and other SQL queries.
  7. Conditions on subqueries don't affect invalidation.
  8. Doesn't work right with multi-table inheritance.
  9. Aggregates are not implemented yet.

Django + caching

By zqzak
