ORM
Caveats, Tips, Optimialization
Getting the SQL behind a Queryset
queryset = Blog.objects.all()
print(queryset.query)
SELECT "blog_entry"."id", "blog_entry"."title",
"blog_entry"."body", "blog_entry"."slug", "blog_entry"."publish",
"blog_entry"."created", "blog_entry"."modified",
"blog_entry"."featured_image" FROM "blog_entry" ORDER BY "blog_entry"."created" DESC
Getting the SQL behind a Queryset
queryset = Blog.objects.all()
print(queryset.query)
SELECT "blog_entry"."id", "blog_entry"."title",
"blog_entry"."body", "blog_entry"."slug", "blog_entry"."publish",
"blog_entry"."created", "blog_entry"."modified",
"blog_entry"."featured_image" FROM "blog_entry" ORDER BY "blog_entry"."created" DESC
Getting the SQL behind a Queryset
from django.db import connection
connections.queries # only with DEBUG=TRUE
[{'sql': 'SELECT polls_polls.id, polls_polls.question,
polls_polls.pub_date FROM polls_polls',
'time': '0.002'}]
Managers
class AuthorManager(models.Manager):
def get_queryset(self):
return super(AuthorManager, self).get_queryset().filter(role='A')
class EditorManager(models.Manager):
def get_queryset(self):
return super(EditorManager, self).get_queryset().filter(role='E')
class Person(models.Model):
first_name = models.CharField(max_length=50)
last_name = models.CharField(max_length=50)
role = models.CharField(max_length=1, choices=(('A', _('Author')), ('E', _('Editor'))))
people = models.Manager()
authors = AuthorManager()
editors = EditorManager()
Person.authors.all() # vs Person.filter(role='A')
Person.editors.all()
Person.people.all()
ok
Managers
class AuthorManager(models.Manager):
def get_queryset(self):
return super(AuthorManager, self).get_queryset().filter(role='A')
class EditorManager(models.Manager):
def get_queryset(self):
return super(EditorManager, self).get_queryset().filter(role='E')
class Person(models.Model):
first_name = models.CharField(max_length=50)
last_name = models.CharField(max_length=50)
role = models.CharField(max_length=1, choices=(('A', _('Author')), ('E', _('Editor'))))
people = models.Manager()
authors = AuthorManager()
editors = EditorManager()
Person.authors.all() # vs Person.filter(role='A')
Person.editors.all()
Person.people.all()
Person.people.filter(last_name__startswith='A').authors() # Exception
problem (cannot chain)
Managers
class PersonQuerySet(models.QuerySet):
def authors(self):
return self.filter(role='A')
def editors(self):
return self.filter(role='E')
class PersonManager(models.Manager):
def get_queryset(self):
return PersonQuerySet(self.model, using=self._db)
def authors(self):
return self.get_queryset().authors()
def editors(self):
return self.get_queryset().editors()
class Person(models.Model):
first_name = models.CharField(max_length=50)
last_name = models.CharField(max_length=50)
role = models.CharField(max_length=1, choices=(('A', _('Author')), ('E', _('Editor'))))
people = PersonManager()
Person.people.authos()
Person.people.editors()
Person.people.filter(last_name__startswith='A').authors() # OK
better (but verbose)
Managers
class PersonQuerySet(models.QuerySet):
def authors(self):
return self.filter(role='A')
def editors(self):
return self.filter(role='E')
class Person(models.Model):
first_name = models.CharField(max_length=50)
last_name = models.CharField(max_length=50)
role = models.CharField(max_length=1, choices=(('A', _('Author')), ('E', _('Editor'))))
people = PersonQuerySet.as_manager()
Person.people.authos()
Person.people.editors()
Person.people.filter(last_name__startswith='A').authors() # OK
best
Managers
- Adding extra Manager methods is the preferred way to add “table-level” functionality to your models. (For “row-level” functionality – i.e., functions that act on a single instance of a model object – use Model methods, not custom Manager methods.)
- A custom Manager method can return anything you want. It doesn’t have to return a QuerySet (perform a complex aggregation, hide a raw sql)
Overriding Base Managers
Don’t filter away any results in this type of manager subclass.
If you override the get_queryset() method and filter out any rows, Django will return incorrect results. Don’t do that. A manager that filters results in get_queryset() is not appropriate for use as a default manager.
Overriding Base Managers
Scenario
For a Post model an inactive flag is requested.
The development is an advanced stage.
The inactive posts won't be involved in 99% of cases.
Overriding Base Managers
Scenario
For a Post model an inactive flag is requested.
The development is an advanced stage.
The inactive posts won't participate in 99% of cases.
posts_to_publish = Post.objects.filter(...)
for post in posts_to_publish:
post.publish()
posts_to_publish = Post.objects.active().filter(...)
for post in posts_to_publish:
post.publish()
Overriding Base Managers
Scenario
For a Post model an inactive flag is requested.
The development is an advanced stage.
The inactive posts won't participate in 99% of cases.
posts_to_publish = Post.objects.filter(...)
for post in posts_to_publish:
post.publish()
posts_to_publish = Post.objects.active().filter(...)
for post in posts_to_publish:
post.publish()
What if we need to do that 100 times ?
Overriding Base Managers
Solution : Replace the custom Manager...
class ActiveBlogManager(models.Manager):
def get_queryset(self):
return super(ActiveBlogManager, self).get_queryset().\
filter(status='ACTIVE')
class Blog(models.Model):
...
objects = ActiveBlogManager()
Overriding Base Managers
Problems (and solutions):
No way to access inactive Posts!
Overriding Base Managers
Problems (and solutions):
No way to access inactive Posts!
Create a second manager which returns inactive Post
Overriding Base Managers
Problems (and solutions):
No way to access inactive Posts!
Create a second manager which returns inactive Post
Django Admin uses the default queryset (won't list inactive Posts)
Overriding Base Managers
Problems (and solutions):
No way to access inactive Posts!
Create a second manager which returns inactive Post
Django Admin uses the default queryset (won't list inactive Posts)
Override the get_queryset Admin model to include all Posts
Overriding Base Managers
Problems (and solutions):
No way to access inactive Posts!
Create a second manager which returns inactive Post
Django Admin uses the default queryset (won't list inactive Posts)
Override the get_queryset Admin model to include all Posts
The reverse manager uses the default manager ( blog.post_set.all() )
Overriding Base Managers
Problems (and solutions):
No way to access inactive Posts!
Create a second manager which returns inactive Post
Django Admin uses the default queryset (won't list inactive Posts)
Override the get_queryset Admin model to include all Posts
The reverse manager uses the default manager ( blog.post_set.all() )
blog.post_set(manager='inactive').all()
Tips
Use fk keys directly
entry.blog_id
# vs
entry.blog.id
Tips
The with templatetag
{% with total=business.employees.count %}
{% if total < 10 %}
<p>Only {{total}} employees left! Hire more!</p>
{% endif %}
{% endwith %}
Tips
Don't retrieve things you don't need
Blog.objects.all()
#vs
Blog.objects.values('id', 'name')
<QuerySet [{'id': 1, 'name': 'My Blog'}]>
Person.objects.only("name")
Don't retrieve things you don't need (assume a model with hundreds of columns)
Tips
Take advantage of query expressions and database functions
len(Entry.objects.all())
#vs
Entry.objects.count()
name = name.lower()
Blog.objects.get(name=name)
#vs
Blog.objects.get(name__iexact='beatles blog')
for author in Author.objects.iterator():
if author.alias:
name = author.alias:
if author.goes_by:
name = name
name = author.name
setattr(author,'screen_name', name)
#vs
author = Author.objects.annotate(
screen_name=Coalesce('alias', 'goes_by', 'name'))
Tips
Take advantage of query expressions and database functions
# getting all unique values for a model
# don't do it in Python...
class Transaction(models.Model):
currency = models.CharField(u'Waluta', max_length=3, validators=[CurrencyValidator])
Tips
Take advantage of query expressions and database functions
# getting all unique values for a model
# don't do it in Python...
Transaction.objects.values('currency')
[{'currency': u'PLN'}, {'currency': u'PLN'}, {'currency': u'USD'}, {'currency': u'USD'},
{'currency': u'HUF'}]
Transaction.objects.filter().order_by().values('currency').distinct()
[{'currency': u'PLN'}, {'currency': u'USD'}, {'currency': u'HUF'}]
Transaction.objects.filter().order_by().values_list('currency').distinct()
[(u'HUF',), (u'USD',), (u'GBP',), (u'PLN',), (u'CHF',), (u'EUR',)]
Transaction.objects.filter().order_by().values_list('currency', flat=True).distinct()
[u'HUF', u'USD', u'GBP', u'PLN', u'CHF', u'EUR']
Tips
Take advantage of query expressions and database functions
#referencing fields in the same model
#lectures_with_not_enough_seats
Lecture.objects.filter(room__seats_lt=F('attendees')
#avoiding race conditions
post = Post.objects.get(id=5434)
post.likes += 1
post.save()
from django.db.models import F
post = Post.objects.get(id=5434)
post.likes = F('likes') + 1
post.save()
Various caveats
Blog.objects.filter(entry__headline__contains='Lennon', entry__pub_date__year=2008)
# vs
Blog.objects.filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008)
Spanning multi-valued relationships
Various caveats
Blog.objects.filter(entry__headline__contains='Lennon', entry__pub_date__year=2008)
# vs
Blog.objects.filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008)
Spanning multi-valued relationships
Suppose there is only one blog that had both entries containing “Lennon” and entries from 2008, but that none of the entries from 2008 contained “Lennon”. The first query would not return any blogs, but the second query would return that one blog.
Various caveats
queryset = Entry.objects.all()
print queryset[5]
print queryset[5]
Lazy QuerySet evaluation
Various caveats
queryset = Entry.objects.all()
print queryset[5] # Queries the database
print queryset[5] # Queries the database
Lazy QuerySet evaluation
Various caveats
queryset = Entry.objects.all()
[entry for entry in queryset]
print queryset[5]
print queryset[5]
Lazy QuerySet evaluation
Various caveats
queryset = Entry.objects.all()
[entry for entry in queryset] # Queries the database
print queryset[5] # Uses cache
print queryset[5] # Uses cache
Lazy QuerySet evaluation
Various caveats
# Average price across all books.
>>> from django.db.models import Avg
>>> Book.objects.all().aggregate(Avg('price'))
{'price__avg': 34.35}
'sql': u'SELECT AVG("app_book"."price") AS "price__avg" FROM "app_book"'
# Each publisher, each with a count of books as a "num_books" attribute.
>>> from django.db.models import Count
>>> pubs = Publisher.objects.annotate(num_books=Count('book'))
>>> pubs
<QuerySet [<Publisher: BaloneyPress>, <Publisher: SalamiPress>, ...]>
>>> pubs[0].num_books
'SELECT *, COUNT("app_publisher"."book")
AS "num_books"
FROM "app_publisher"
GROUP BY "app_publisher"."id"
Annotate(returns a QuerySet)
Aggregate(returns a dict)
Various caveats
# Average price across all books.
>>> from django.db.models import Avg
>>> Book.objects.all().aggregate(Avg('price'))
{'price__avg': 34.35}
'sql': u'SELECT AVG("app_book"."price") AS "price__avg" FROM "app_book"'
# Each publisher, each with a count of books as a "num_books" attribute.
>>> from django.db.models import Count
>>> pubs = Publisher.objects.annotate(num_books=Count('book'))
>>> pubs
<QuerySet [<Publisher: BaloneyPress>, <Publisher: SalamiPress>, ...]>
>>> pubs[0].num_books
'SELECT *, COUNT("app_publisher"."book")
AS "num_books"
FROM "app_publisher"
GROUP BY "app_publisher"."id"
Annotate(returns a QuerySet)
Aggregate(returns a dict)
Various caveats
>>> a, b = Publisher.objects.annotate(avg_rating=Avg('book__rating')).\
filter(book__rating__gt=3.0)
>>> a, b = Publisher.objects.filter(book__rating__gt=3.0).\
annotate(avg_rating=Avg('book__rating'))
The first query asks for the average rating of all a publisher’s books for publisher’s that
have at least one book with a rating exceeding 3.0.
The second query asks for the average of a publisher’s book’s ratings for only
those ratings exceeding 3.0.
The order of annotate() and filter() clauses matters
Various caveats
class Transaction(models.Model):
...
class Meta:
ordering = ["id"]
Transaction.objects.order_by().filter().values('currency').query
'SELECT "vault_transaction"."currency" FROM "vault_transaction'
Default Ordering
-
don't use where you dont' need it
- clear the default ordering
Other Topic
- prefetch_related, select_related, Prefetch
- Postgresql Full text search in Django
- PostgreSQL specific model fields (ArrayField, JSONField)
- Caching in Django
26.10.2016
orm
By zqzak
orm
- 378