Django query for a lot of relationships
I have Django models set up like this:
model A has a one-to-many relationship to model B
each record in A has 3000 to 15000 records in B
What is the best way to build a query that will retrieve the newest (largest pk) record in B for each record in A? Is this something I should be using SQL for instead of the Django ORM?
Create a helper function to safely retrieve the 'top' element from any queryset. I use this all over my own Django apps.
def top_or_none(queryset):
    """Safely pulls off the top element in a queryset"""
    # Extracts a single element collection w/ top item
    result = queryset[0:1]
    # Return that element or None if there weren't any matches
    return result[0] if result else None
This uses a little slice operator trick to add a LIMIT clause to your SQL.
Now use this function anywhere to get the "top" item in the queryset. In this case, you want to get the top B element for a given A, where B is sorted in descending order by pk, as such:
latest = top_or_none(B.objects.filter(a=my_a).order_by('-pk'))
Django also recently added a "Max" function to its aggregation support that can help you get the max pk, but I don't like that solution in this case since it adds complexity.
PS I don't really like relying on the "pk" field for this type of query, as some RDBMSs do not guarantee that sequential pks match the logical creation order. If I have a table that I know I will need to query this way, I usually add my own creation datetime column that I can order by instead of pk.
Edit based on comment:
If you prefer to use queryset[0], you can change the "top_or_none" function like this:
def top_or_none(queryset):
    """Safely pulls off the top element in a queryset"""
    try:
        return queryset[0]
    except IndexError:
        return None
I didn't suggest this initially because I was under the impression that queryset[0] would pull back the entire result set and then take the 0th element. Apparently Django adds a "LIMIT 1" in this scenario too, so it's a safe alternative to my sliced version.
Edit 2
Of course, you can also take advantage of Django's related manager and build the query through your "A" object, depending on your preference:
latest = top_or_none(my_a.b_set.order_by('-pk'))
I don't think the Django ORM can do this (but I've been pleasantly surprised before...). If there are a reasonable number of A records involved (or if you are paging), I would just add a method to the A model that returns this "newest" B record. If you want to get a lot of A records, each with its own newest B, I would drop down to SQL.
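Remember that no matter which route you take, you will need a suitable composite index on table B covering the foreign key to A and the id, perhaps declared in the model's Meta subclass. Note that a default sort order such as ordering=('a_fk','-id') in Meta only controls ordering and does not create an index by itself.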
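To make the "drop down to SQL" route a bit more concrete, here is a minimal sketch of one common way to fetch the newest B per A in a single query: group B by its foreign key, take MAX(id), and join back to get the full rows. It uses the standard library's sqlite3 module rather than Django so it can run on its own. The table names (a, b), the columns (a_id, payload), the index name and the function newest_b_rows are all assumptions made up for illustration; a real Django project would have its own table names, and you would probably run comparable SQL through Manager.raw() or a database cursor instead.

```python
import sqlite3


def newest_b_rows(conn):
    """Return one (a_id, b_id, payload) row per A: the B row with the largest id."""
    return conn.execute(
        """
        SELECT b.a_id, b.id, b.payload
        FROM b
        JOIN (
            -- One row per A holding the largest B id for that A.
            SELECT a_id, MAX(id) AS max_id
            FROM b
            GROUP BY a_id
        ) AS newest ON b.id = newest.max_id
        ORDER BY b.a_id
        """
    ).fetchall()


# Hypothetical schema standing in for models A and B.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (
        id INTEGER PRIMARY KEY,
        a_id INTEGER NOT NULL REFERENCES a(id),
        payload TEXT
    );
    -- Composite index on (foreign key, id), as suggested above.
    CREATE INDEX b_a_id_id ON b (a_id, id);
    """
)
conn.executemany("INSERT INTO a (id) VALUES (?)", [(1,), (2,), (3,)])
conn.executemany(
    "INSERT INTO b (id, a_id, payload) VALUES (?, ?, ?)",
    [(1, 1, "old"), (2, 2, "only"), (3, 1, "newer"), (4, 1, "newest")],
)

for row in newest_b_rows(conn):
    print(row)
```

In this sketch, an A with no B records at all (id 3 here) simply does not appear in the result; if you need those too, a LEFT JOIN starting from table a would be the usual adjustment.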