Annotation & Aggregates in Django

Our aggregation project:

  • Show the number of stars our team has given out
  • Show the number of GitHub repos our team have starred
  • Show the top repositories of all time
  • Show GitHub repositories that our team have starred recently
  • Show the top repositories for the month
  • Show which users have the most stars (it's probably me!)

Getting the stars

  • Use a management command to hit the API
    • Don't forget to use:
      'Accept': 'application/vnd.github.v3.star+json'
       

  • Store the Stars in a model
  • Store the raw data, don't aggregate anything on the model

models.py

from django.db import models


class Star(models.Model):
    repo = models.CharField(max_length=200)
    starred_by = models.CharField(max_length=50)
    starred_at = models.DateTimeField(blank=True, null=True)
    organisation = models.CharField(max_length=50)

    class Meta:
        unique_together = ['repo', 'starred_by', 'organisation']

Show the number of stars our team has given out

 

Use the count() function on our queryset.


from django.shortcuts import render

from .models import Star


def starred(request):
    return render(request, 'star_counts.html', {
        'star_count': Star.objects.count(),
    }

Show the number of GitHub repos our team have starred

 

count() works just fine with distinct (if you're using Postgres)


from django.shortcuts import render

from .models import Star


def starred(request):
    unique_repos = Star.objects.distinct('repo').count()

    return render(request, 'star_counts.html', {
        'star_count': Star.objects.count(),
        'unique_repos': unique_repos,
    }

Show the number of GitHub repos our team have starred

 

(our hacky non-Postgres version, don't do this)


from django.shortcuts import render

from .models import Star


def starred(request):
    unique_repos = len(set(
       Star.objects.values_list('repo', flat=True)
    ))

    return render(request, 'star_counts.html', {
        'star_count': Star.objects.count(),
        'unique_repos': unique_repos,
    }

Show the top repositories of all time

 

Introducting Count() (note the uppercase C)


from django.db.models import Count
from django.shortcuts import render

from .models import Star


def starred(request):
    unique_repos = len(set(
       Star.objects.values_list('repo', flat=True)
    ))

    # Get the repos and the number of stars by our team
    popular_repos = Star.objects.values('repo').annotate(Count('repo'))

    return render(request, 'star_counts.html', {
        'star_count': Star.objects.count(),
        'popular_repos': popular_repos,
        'unique_repos': unique_repos,
    }

from django.db.models import Count

  • Count() appends an aggregate to each item in our QuerySet
  • It takes a single argument - the name of the field to count
  • Add it to your QuerySet with the annotate() clause
    • Annotate returns a QuerySet, which comes in very handy

Show the top repositories of all time
(part two)

 

Yep - you can order_by the count super-easily.


from django.db.models import Count
from django.shortcuts import render

from .models import Star


def starred(request):
    unique_repos = len(set(
       Star.objects.values_list('repo', flat=True)
    ))

    # Get the repos with the most stars by our team
    popular_repos = Star.objects.values('repo').annotate(Count('repo'))
    popular_repos = popular_repos.order_by('-repo__count')

    return render(request, 'star_counts.html', {
        'star_count': Star.objects.count(),
        'popular_repos': popular_repos,
        'unique_repos': unique_repos,
    }

Show GitHub repositories that our team have starred recently

 

Filtering works as well!


from datetime import datetime, timedelta

from django.db.models import Count
from django.shortcuts import render

from .models import Star


def starred(request):
    unique_repos = len(set(
       Star.objects.values_list('repo', flat=True)
    ))

    # Get the repos with the most stars by our team
    popular_repos = Star.objects.values('repo').annotate(Count('repo'))
    popular_repos = popular_repos.order_by('-repo__count')
    popular_repos = popular_repos.filter(
        starred_at__gte=datetime.today() - timedelta(days=7)
    )

    return render(request, 'star_counts.html', {
        'star_count': Star.objects.count(),
        'popular_repos': popular_repos,
        'unique_repos': unique_repos,
    }

The end product

Aggregation in Django

By Brenton Cleeland

Aggregation in Django

  • 2,265