Django ORM equivalent to group_concat

We are again trying to create a web tool to help us manage our joinery software . We use Django , which, despite its age, is still the most widely used (and most documented) web programming framework for Python. It's been a few years since I came up with something, and I stumble over a few things.

For example, my data model looks like this (with a lot of irrelevant stuff):

class Person(models.Model):
    '''Someone we know.'''
    email      = models.CharField(max_length=STR_LONG, unique=True, null=True)

class Event(models.Model):
    '''A workshop or other event.'''
    slug       = models.CharField(max_length=STR_LONG, unique=True)

class Role(models.Model):
    '''The kinds of things people can do at workshops.'''
    name       = models.CharField(max_length=STR_MED)

class Task(models.Model):
    '''Someone did something at some workshop.'''
    event      = models.ForeignKey(Event)
    person     = models.ForeignKey(Person)
    role       = models.ForeignKey(Role)

      

One of the application pages displays information about a specific event. I want to add the names of all the people who were instructors at this event to the page. If I were to use SQL directly, I would write something like:

select   Event.slug, group_contact(Person.email, ', ')
from     Person join Event join Role join Task
on       Person.id=Task.person and Event.id=Task.event and Role.id=Task.role
where    Role.name='instructor'
group by Event.id;

      

How can I do this with Django ORM? According to this Stack Overflow question , I can use the 'regroup' tag in the view or create a custom aggregator . The former is complicated by the multi-stage nature of the association, and the latter feels ... difficult. My instinct is that I should be able to attach to this event all the objects Person

corresponding to the instructors in a particular Event

one and then loop through them in my view. If you know how to do this, I would appreciate a pointer.

+3


source to share


5 answers


How about the following (Django 1.7+):

from django.db.models import Prefetch

prefetch = Prefetch(
    'task_set',
    queryset=Task.objects.filter(role__name='instructor')
    .select_related('person'), to_attr='instructor_tasks'
)
events = Event.objects.all().prefetch_related(prefetch)

for event in events:
    print event

    for task in event.instructor_tasks:
        print task.person.email

      

Using prefetch_related this way should prevent O (n) DB queries (despite a loop).



Django DB query log output for the aforementioned query / loop (with presumably a tiny data set of 2 events):

DEBUG django.db.backends: (0.001) SELECT "myapp_event"."id", "myapp_event"."slug" FROM "myapp_event"; args=()
DEBUG django.db.backends: (0.001) SELECT "myapp_task"."id", "myapp_task"."event_id", "myapp_task"."person_id", "myapp_task"."role_id", "myapp_person"."id", "myapp_person"."email" FROM "myapp_task" INNER JOIN "myapp_role" ON ( "myapp_task"."role_id" = "myapp_role"."id" ) INNER JOIN "myapp_person" ON ( "myapp_task"."person_id" = "myapp_person"."id" ) WHERE ("myapp_role"."name" = 'instructor' AND "myapp_task"."event_id" IN (1, 2)); args=('instructor', 1, 2)

      

Thus, this query code should continue to issue only two DB queries even as the number of events grows.

+2


source


I don't think you need such a complex query.

For any particular event, you can do this:



instructors = Task.objects.filter(event=my_event, role__name='instructor').select_related('person').values_list('person__email')

      

which will provide you with a list of emails for all persons related to the tasks for this event, with the role of instructor. Admittedly, this is one event request, not one massive request, but unless you plan on showing hundreds of events per page, the tradeoff in terms of clarity is probably worth it.

+1


source


In the upcoming Django 1.8, you can simply implement a GroupConcat expression and then the request looks like this:

Event.objects.values('slug').annotate(emails=GroupConcat('task__person__email'))

      

Combination of .values ​​(). annotate () sets the GROUP BY to the slug, and of course the GroupConcat implementation does the actual aggregation.

To write an implementation of the GroupConcat implementation, https://docs.djangoproject.com/en/dev/ref/models/expressions/#writing-your-own-query-expressions

+1


source


Since you are already proficient in mysql commands, try running the Raw mysql command from Django.

https://docs.djangoproject.com/en/dev/topics/db/sql/

0


source


The usual solution here is not to try to optimize and just do one search per event:

for event in Event.objects.filter(**some_lookup):    
    instructors = Person.objects.filter(
        task__role__name="instructor", 
        task__event=event
        ).values_list("name", flat=True)
    print "Event: %s - instructors : %s" % (event.slug, ", ".join(instructors))

      

If and when this becomes a performance bottleneck, it's time to find a better solution (which may or may not include falling back to raw sql).

0


source







All Articles