Caching
To cache something is to save the result of an expensive calculation so that you don’t have to perform the calculation next time.
What ?
Cache hits are served by reading data from the cache, which is faster than recomputing a result or reading from a slower data store; thus, the more requests can be served from the cache, the faster the system performs.
Why ?
given a URL, try finding that page in the cache
if the page is in the cache:
    return the cached page
else:
    generate the page
    save the generated page in the cache (for next time)
    return the generated page
How ?
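The pseudocode above can be sketched in plain Python. This is a minimal illustration, assuming an in-process dictionary as the cache; `page_cache`, `generate_page` and `get_page` are hypothetical names, not part of any framework.

```python
# Hypothetical in-memory cache: maps URL -> rendered page.
page_cache = {}
render_count = 0  # counts how often the expensive step runs


def generate_page(url):
    """Stand-in for an expensive page render (hypothetical)."""
    global render_count
    render_count += 1
    return "<html>page for {}</html>".format(url)


def get_page(url):
    """Return the page for url, rendering it only on a cache miss."""
    if url in page_cache:
        return page_cache[url]      # cache hit
    page = generate_page(url)       # cache miss: do the expensive work
    page_cache[url] = page          # save for next time
    return page


first = get_page("/about/")
second = get_page("/about/")        # served from page_cache
```

After both calls, `render_count` is 1: the second request never reaches `generate_page`.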
Caching is Everywhere
- cpu
- database
- http server
- browser
HTTP caching headers
- Cache-Control
- Expires
- Vary
- ETag
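How these headers cooperate can be shown with a small framework-free sketch. It assumes the ETag is derived from a hash of the body and that the client sends it back in `If-None-Match`; `make_etag` and `respond` are hypothetical helpers, and the header values are only examples.

```python
import hashlib


def make_etag(body):
    """Derive a quoted ETag from the response body (one possible scheme)."""
    return '"{}"'.format(hashlib.sha256(body).hexdigest()[:16])


def respond(request_headers, body, max_age=300):
    """Return (status, headers, body) for a cacheable resource."""
    etag = make_etag(body)
    headers = {
        "Cache-Control": "public, max-age={}".format(max_age),
        "ETag": etag,
        "Vary": "Accept-Encoding",
    }
    # The client already holds this exact version: send no body.
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""
    return 200, headers, body


status1, headers1, _ = respond({}, b"hello")
status2, _, body2 = respond({"If-None-Match": headers1["ETag"]}, b"hello")
```

The first request gets a full 200 response; the revalidation request gets a 304 with an empty body.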
Caching on the web framework level
Examples of caching in Django
>>> entry = Entry.objects.get(id=1)
>>> entry.blog # Blog object is retrieved at this point
>>> entry.blog # cached version, no DB access
>>> entry = Entry.objects.get(id=1)
>>> entry.authors.all() # query performed
>>> entry.authors.all() # query performed again
Examples of caching in Django
from django.contrib import auth
from django.utils.functional import SimpleLazyObject

def get_user(request):
    # Look the user up once per request and memoize it on the request object.
    if not hasattr(request, '_cached_user'):
        request._cached_user = auth.get_user(request)
    return request._cached_user

class AuthenticationMiddleware(object):
    def process_request(self, request):
        assert hasattr(request, 'session'), (
            "The Django authentication middleware requires session middleware "
            "to be installed. Edit your MIDDLEWARE_CLASSES setting to insert "
            "'django.contrib.sessions.middleware.SessionMiddleware' before "
            "'django.contrib.auth.middleware.AuthenticationMiddleware'."
        )
        # Defer the lookup until request.user is first accessed.
        request.user = SimpleLazyObject(lambda: get_user(request))
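The per-request memoization in `get_user` can be tried without Django. The sketch below assumes a bare `SimpleNamespace` as the request and a hypothetical `load_user_from_session` in place of `auth.get_user`; it does not reproduce `SimpleLazyObject`, only the `_cached_user` pattern.

```python
from types import SimpleNamespace

load_count = 0  # how many times the expensive lookup ran


def load_user_from_session(request):
    """Hypothetical stand-in for auth.get_user(request)."""
    global load_count
    load_count += 1
    return "user-{}".format(request.session["user_id"])


def get_user(request):
    # Memoize on the request object itself, so the cached value
    # lives exactly as long as the request does.
    if not hasattr(request, "_cached_user"):
        request._cached_user = load_user_from_session(request)
    return request._cached_user


request = SimpleNamespace(session={"user_id": 7})
loads_before = load_count       # nothing has been loaded yet
a = get_user(request)
b = get_user(request)           # second call reuses request._cached_user
```

Storing the value on the request avoids any explicit invalidation: the cache disappears when the request object does.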
@cached_property
# the model
class Person(models.Model):
    def friends(self):
        # expensive computation
        ...
        return friends

# in the view:
if person.friends():
    ...

# in the template
{% for friend in person.friends %}

# 2x calculations
@cached_property
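from django.utils.functional import cached_property

# the model
class Person(models.Model):
    @cached_property
    def friends(self):
        # expensive computation
        ...
        return friends

# in the view (friends is now an attribute, so no parentheses):
if person.friends:
    ...

# in the template
{% for friend in person.friends %}

# 1x calculation
# to force a recalculation, delete the cached value: del person.friends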
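The same behaviour can be observed with the standard library's `functools.cached_property` (Python 3.8+), which is used here as a stand-in for Django's decorator so that the sketch runs without Django. The `calls` counter is an assumption added only to make the number of computations visible.

```python
from functools import cached_property  # stdlib, Python 3.8+


class Person:
    def __init__(self):
        self.calls = 0

    @cached_property
    def friends(self):
        self.calls += 1          # stands in for the expensive computation
        return ["alice", "bob"]


person = Person()
if person.friends:               # attribute access, no parentheses
    names = [friend for friend in person.friends]
calls_before = person.calls      # computed once so far

del person.friends               # drop the cached value
person.friends                   # recomputed on the next access
calls_after = person.calls
```

Two accesses cost one computation; deleting the attribute is the invalidation step hinted at by the `del` comment above.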
caching static files
1. Set expiry headers
2. Use ManifestStaticFilesStorage or CachedStaticFilesStorage
   STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.ManifestStaticFilesStorage'
3. Run collectstatic
   css/styles.css -> css/styles.55e7cbb9ba48.css
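The renaming step can be illustrated with a short sketch. It assumes the suffix is an MD5 digest of the file content truncated to 12 characters, which matches the shape of the name above; `hashed_name` is a hypothetical helper, not the storage class itself.

```python
import hashlib
import posixpath


def hashed_name(path, content):
    """Insert a short content hash before the file extension."""
    digest = hashlib.md5(content).hexdigest()[:12]
    root, ext = posixpath.splitext(path)
    return "{}.{}{}".format(root, digest, ext)


v1 = hashed_name("css/styles.css", b"body { color: black; }")
v2 = hashed_name("css/styles.css", b"body { color: navy; }")
```

Because the name changes whenever the content changes, the files can be served with far-future expiry headers: an edited stylesheet gets a new URL instead of a stale cache entry.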
Django’s cache framework
Backends
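CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
        'TIMEOUT': 60,  # default timeout in seconds
        'OPTIONS': {
            # MAX_ENTRIES is honored by the locmem, filesystem and database backends
            'MAX_ENTRIES': 1000
        }
    }
}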
Backends
- Memcached
- Database caching
- Filesystem caching
- Local-memory caching
- Dummy caching
- Custom backends (django-redis-cache)
Per site cache
MIDDLEWARE = [
    'django.middleware.cache.UpdateCacheMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.cache.FetchFromCacheMiddleware',
]
FetchFromCacheMiddleware caches GET and HEAD responses with status 200, where the request and response headers allow. Responses to requests for the same URL with different query parameters are considered to be unique pages and are cached separately.
Per view cache
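from django.conf.urls import url
from django.views.decorators.cache import cache_page

@cache_page(60 * 15)  # cache for 15 minutes
def my_view(request, code):  # code receives the group captured by the URL pattern
    ...

urlpatterns = [
    url(r'^foo/([0-9]{1,2})/$', my_view),
]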
The per-view cache, like the per-site cache, is keyed off of the URL. If multiple URLs point at the same view, each URL will be cached separately.
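A toy decorator makes the URL-keyed behaviour concrete. This is a sketch under simplifying assumptions: the view takes the full URL as its only argument, the store is a module-level dictionary, and `cache_page_sketch` is a hypothetical name rather than Django's `cache_page`.

```python
import time

_store = {}  # full URL -> (expiry timestamp, response)


def cache_page_sketch(timeout):
    """Toy per-view cache keyed on the full URL (hypothetical helper)."""
    def decorator(view):
        def wrapper(url):
            entry = _store.get(url)
            if entry is not None and entry[0] > time.monotonic():
                return entry[1]                   # fresh hit
            response = view(url)
            _store[url] = (time.monotonic() + timeout, response)
            return response
        return wrapper
    return decorator


calls = []  # records which URLs actually reached the view


@cache_page_sketch(60 * 15)
def my_view(url):
    calls.append(url)
    return "response for " + url


my_view("/foo/1/")
my_view("/foo/1/")          # served from the cache
my_view("/foo/2/")          # different URL, separate entry
my_view("/foo/1/?page=2")   # different query string, separate entry
```

Only three of the four requests run the view body, one per distinct URL.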
Vary header
from django.views.decorators.cache import cache_page
from django.views.decorators.vary import vary_on_cookie, vary_on_headers

@vary_on_headers('User-Agent')
def my_view(request):
    ...

@cache_page(60 * 60)  # cache for one hour, separately per cookie value
@vary_on_cookie
def expensive_func(request):
    ...
The Vary header defines which request headers a cache mechanism should take into account when building its cache key.
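One way to picture this is a key built from the URL plus the values of the headers named in Vary. The sketch below is an assumption about how such a key could be derived, not the scheme any particular cache uses; request header names are assumed to be lowercase already, and `cache_key` is a hypothetical helper.

```python
import hashlib


def cache_key(url, request_headers, vary):
    """Build a key from the URL plus the request headers named in Vary."""
    parts = [url]
    for name in sorted(header.lower() for header in vary):
        # Headers not listed in Vary never reach the key.
        parts.append("{}={}".format(name, request_headers.get(name, "")))
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


desktop = {"user-agent": "Firefox", "accept-language": "en"}
mobile = {"user-agent": "MobileSafari", "accept-language": "en"}
german = {"user-agent": "Firefox", "accept-language": "de"}

k_desktop = cache_key("/news/", desktop, ["User-Agent"])
k_mobile = cache_key("/news/", mobile, ["User-Agent"])
k_german = cache_key("/news/", german, ["User-Agent"])
```

With `Vary: User-Agent`, the two browsers get separate entries, while the change in `Accept-Language` is ignored and shares the desktop entry.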
Template fragment caching
{% load cache %}
{% cache 500 sidebar %}
.. sidebar ..
{% endcache %}
{% load cache %}
{% cache 500 sidebar request.user.username %}
.. sidebar for logged in user ..
{% endcache %}
Low level cache API
>>> from django.core.cache import cache  # equivalent to caches['default']
>>> cache.set('my_key', 'hello, world!', 30)
>>> cache.get('my_key')
'hello, world!'
>>> cache.set('add_key', 'Initial value')
>>> cache.add('add_key', 'New value')
>>> cache.get('add_key')
'Initial value'
>>> cache.get('my_new_key') # returns None
>>> cache.get_or_set('my_new_key', 'my new value', 100)
'my new value'
The key should be a str (or unicode on Python 2), and the value can be any picklable Python object.
Low level cache API
>>> cache.set('a', 1)
>>> cache.set('b', 2)
>>> cache.set('c', 3)
>>> cache.get_many(['a', 'b', 'c'])
{'a': 1, 'b': 2, 'c': 3}
>>> cache.delete('a')
>>> cache.clear()
>>> cache.set('num', 1)
>>> cache.incr('num')
2
>>> cache.incr('num', 10)
12
>>> cache.decr('num')
11
>>> cache.decr('num', 5)
6
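The semantics of `set`, `get`, `add` and `incr` shown in these sessions can be mimicked by a small in-process class. This is a toy model for illustration, not Django's implementation; `TinyCache` is a hypothetical name, and like the real API it cannot tell a stored `None` from a miss.

```python
import time


class TinyCache:
    """Toy in-process cache with per-key expiry (not Django's implementation)."""

    def __init__(self):
        self._data = {}  # key -> (expiry timestamp or None, value)

    def set(self, key, value, timeout=None):
        expiry = None if timeout is None else time.monotonic() + timeout
        self._data[key] = (expiry, value)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        expiry, value = entry
        if expiry is not None and expiry <= time.monotonic():
            del self._data[key]   # expired: drop it and report a miss
            return default
        return value

    def add(self, key, value, timeout=None):
        """Store value only if key is absent; return True if it was stored."""
        if self.get(key) is not None:
            return False
        self.set(key, value, timeout)
        return True

    def incr(self, key, delta=1):
        if self.get(key) is None:
            raise ValueError("Key {!r} not found".format(key))
        expiry, value = self._data[key]
        self._data[key] = (expiry, value + delta)  # keep the original expiry
        return value + delta


cache = TinyCache()
cache.set("add_key", "Initial value")
added = cache.add("add_key", "New value")   # key exists, so nothing is stored
cache.set("num", 1)
cache.incr("num")
total = cache.incr("num", 10)
cache.set("gone", "x", timeout=0)           # expires immediately
```

Here `added` is `False`, `total` is 12, and reading `"gone"` is a miss.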
Controlling cache
from django.views.decorators.cache import never_cache

@never_cache
def myview(request):
    ...

from django.views.decorators.cache import cache_control

@cache_control(private=True)
def my_view(request):
    ...
django.utils.cache
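What a decorator such as `cache_control(private=True)` ultimately produces is a `Cache-Control` header value. The sketch below shows one plausible mapping from keyword arguments to that value; `cache_control_value` is a hypothetical helper, and the alphabetical ordering of directives is an arbitrary choice made here for determinism.

```python
def cache_control_value(**directives):
    """Render keyword arguments as a Cache-Control header value."""
    parts = []
    for name, value in sorted(directives.items()):
        name = name.replace("_", "-")              # max_age -> max-age
        if value is True:
            parts.append(name)                     # bare directive
        elif value is not False and value is not None:
            parts.append("{}={}".format(name, value))
    return ", ".join(parts)


private_header = cache_control_value(private=True)
shared_header = cache_control_value(public=True, max_age=3600)
```

`private_header` is `private`, which tells shared caches such as proxies not to store the response; `shared_header` is `max-age=3600, public`.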
third party packages
django-cachalot
Django-cachalot is the perfect speedup tool for most Django projects. It will speed up a website with 100 000 visits per month without any problem. In fact, the more visitors you have, the faster the website becomes. That’s because every possible SQL query on the project ends up being cached.
However, it’s not suited for projects where there is a high number of modifications per minute on each table, like a social network with more than 50 messages per minute. Django-cachalot may still give a small speedup in such cases, but it may also slow things down a bit (in the worst-case scenario, a 20% slowdown). If you have a website like that, optimising your SQL database and queries is the number one thing you have to do.
django-cachalot
- Saves in cache the results of any SQL query generated by the Django ORM that reads data. These saved results are then returned instead of executing the same SQL query, which is faster.
- The first time a query is executed is about 10% slower, then the following times are way faster (7× faster being the average).
- Automatically invalidates saved results, so that you never get stale results.
- Invalidates per table, not per object: if you change an object, all the queries done on other objects of the same model are also invalidated. It is unfortunately technically impossible to make a reliable per-object cache. Don’t be fooled by packages claiming to have that per-object feature; they are unreliable and dangerous for your data.
- Handles everything in the ORM. You can use the most advanced features of the ORM without a single issue; django-cachalot is extremely robust.
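Per-table invalidation, as described in the list above, can be sketched in a few lines. This is an illustration of the idea only, not django-cachalot's code: `TableCache` and `run_query` are hypothetical, and the caller is assumed to know which tables a query reads.

```python
class TableCache:
    """Toy query cache that invalidates per table (hypothetical sketch)."""

    def __init__(self):
        self._results = {}     # sql -> cached rows
        self._by_table = {}    # table name -> set of sql strings that read it

    def read(self, sql, tables, run_query):
        if sql in self._results:
            return self._results[sql]
        rows = run_query(sql)
        self._results[sql] = rows
        for table in tables:
            self._by_table.setdefault(table, set()).add(sql)
        return rows

    def invalidate(self, table):
        """Call after any write to table: drop every query that read it."""
        for sql in self._by_table.pop(table, set()):
            self._results.pop(sql, None)


executed = []  # records which queries actually hit the "database"


def run_query(sql):
    executed.append(sql)
    return ["row"]


cache = TableCache()
q1 = "SELECT * FROM blog_entry WHERE id = 1"
q2 = "SELECT * FROM blog_entry WHERE id = 2"
cache.read(q1, ["blog_entry"], run_query)
cache.read(q2, ["blog_entry"], run_query)
cache.read(q1, ["blog_entry"], run_query)   # hit, not executed again
cache.invalidate("blog_entry")              # e.g. entry 2 was updated
cache.read(q1, ["blog_entry"], run_query)   # miss: entry 1's query was dropped too
```

The last read shows the trade-off: a write to one row evicts cached queries for every row of that table, which is why write-heavy tables benefit little.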
django-cacheops
A slick app that supports automatic or manual queryset caching and automatic granular event-driven invalidation.
It uses Redis as the backend for the ORM cache, and Redis or the filesystem for a simple time-invalidated one.
And there is more to it:
- decorators to cache any user function or view as a queryset or by time
- extensions for django and jinja2 templates to cache template fragments as querysets or by time
- transparent transaction support
- dog-pile prevention mechanism
- a couple of hacks to make django faster
django-cacheops
CACHEOPS = {
    # Automatically cache any User.objects.get() calls for 15 minutes
    # This includes request.user or post.author access,
    # where Post.author is a foreign key to auth.User
    'auth.user': {'ops': 'get', 'timeout': 60*15},

    # Automatically cache all gets and queryset fetches
    # to other django.contrib.auth models for an hour
    'auth.*': {'ops': ('fetch', 'get'), 'timeout': 60*60},

    # Cache gets, fetches, counts and exists to Permission
    # 'all' is just an alias for ('get', 'fetch', 'count', 'exists')
    'auth.permission': {'ops': 'all', 'timeout': 60*60},

    # Enable manual caching on all other models with default timeout of an hour
    # Use Post.objects.cache().get(...)
    # or Tags.objects.filter(...).order_by(...).cache()
    # to cache particular ORM request.
    # Invalidation is still automatic
    '*.*': {'ops': (), 'timeout': 60*60},

    # And since ops is empty by default you can rewrite last line as:
    '*.*': {'timeout': 60*60},
}
django-cacheops
- Conditions other than __exact, __in and __isnull=True don't make invalidation more granular.
- Conditions on TextFields, FileFields and BinaryFields don't make it either. One should not test on their equality anyway.
- Update of a "select_related" object does not invalidate the cache for the queryset. Use .prefetch_related() instead.
- Mass updates don't trigger invalidation by default. But see .invalidated_update().
- Sliced queries are invalidated as non-sliced ones.
- Doesn't work with .raw() and other sql queries.
- Conditions on subqueries don't affect invalidation.
- Doesn't work right with multi-table inheritance.
- Aggregates are not implemented yet.
caveats
Django + caching
By zqzak