I'm trying to optimize some Django code, and I've got two similar approaches that perform differently. Here are some example models:
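from django.db import models

class A(models.Model):
    name = models.CharField(max_length=100)

class B(models.Model):
    name = models.CharField(max_length=100)
    a = models.ForeignKey(A, on_delete=models.CASCADE)
    # 'C' is given as a string because C is defined below B
    c = models.ForeignKey('C', on_delete=models.CASCADE)

class C(models.Model):
    name = models.CharField(max_length=100)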
For each A object, I'd like to iterate over a subset of its incoming B's, filtered on their c value. Simple:
for a in A.objects.all():
    for b in a.B_set.filter(c__name='some_val'):
        print a.name, b.name, b.c.name
The problem with this is that it issues a new database query for every a iterated over.
It seems that the solution is to prefetch the c values which will feed into the filter.
qs_A = A.objects.all().prefetch_related('B_set__c')
Now consider the following two filter approaches:
# Django filter
for a in qs_A:
    for b in a.B_set.filter(c__name='some_val'):
        print a.name, b.name, b.c.name
# Python filter
for a in qs_A:
    for b in filter(lambda b: b.c.name == 'some_val', a.B_set.all()):
        print a.name, b.name, b.c.name
With the data I'm using, the Django filter makes 48 more SQL queries than the Python filter (on a 12-element qs_A result set). This makes me believe that the Django filter doesn't make use of the prefetched tables.
Could someone explain what is happening?
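One possible reading of the behaviour, sketched as a toy model in plain Python. This is not Django's actual implementation: `ToyManager`, `prefetch` and `query_log` are made-up names, and the assumption being illustrated is that `all()` can be served from a prefetched cache while `filter()` describes a different query and therefore goes back to the database.

```python
query_log = []  # each entry stands for one simulated SQL query


class ToyManager(object):
    """Hypothetical stand-in for a reverse manager such as a.B_set."""

    def __init__(self, rows):
        self._rows = rows    # what the simulated database holds
        self._cache = None   # filled in by prefetch()

    def all(self):
        if self._cache is None:
            query_log.append("SELECT all")
            return list(self._rows)
        return self._cache   # served from the cache, no query

    def filter(self, c_name):
        # Assumption: a filter is a new query, so the cache is bypassed.
        query_log.append("SELECT filtered")
        return [row for row in self._rows if row["c_name"] == c_name]


def prefetch(managers):
    """Fill every manager's cache with a single simulated query."""
    query_log.append("SELECT prefetch")
    for manager in managers:
        manager._cache = list(manager._rows)


# Twelve "a" objects, each with four related "b" rows.
managers = [
    ToyManager([
        {"name": "b%d" % i, "c_name": "some_val" if i % 2 else "other"}
        for i in range(4)
    ])
    for _ in range(12)
]
prefetch(managers)

# Python filter: iterate the cached rows and test them in Python.
before = len(query_log)
python_hits = [
    [row for row in m.all() if row["c_name"] == "some_val"] for m in managers
]
python_filter_queries = len(query_log) - before

# Django-style filter: one fresh query per manager.
before = len(query_log)
django_hits = [m.filter("some_val") for m in managers]
django_filter_queries = len(query_log) - before
```

In this toy both approaches return the same rows, but only the `filter()` loop adds queries, one per `a`. It does not try to account for the full difference of 48 that I measured.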
Perhaps it's possible to apply the filter during the prefetch?
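A sketch of what that might look like, assuming Django 1.7 or later, where `django.db.models.Prefetch` accepts a custom queryset and a `to_attr` name. The helper `a_with_matching_bs` and the attribute `matching_bs` are made up for illustration. The `relation` argument defaults to `b_set` because Django's default reverse accessor is the lowercased model name plus `_set` unless `related_name` is set.

```python
def a_with_matching_bs(a_model, b_model, c_name, relation="b_set"):
    """Return an A queryset whose B rows are filtered during the prefetch.

    Hypothetical helper: pass in the A and B model classes defined above.
    """
    # Imported lazily so this module loads even where Django is not configured.
    from django.db.models import Prefetch

    matching = (
        b_model.objects
        .filter(c__name=c_name)   # the filter is applied inside the prefetch query
        .select_related("c")      # join C so that b.c.name needs no further lookup
    )
    return a_model.objects.prefetch_related(
        Prefetch(relation, queryset=matching, to_attr="matching_bs")
    )


# Intended usage (needs a configured Django project):
#
#     for a in a_with_matching_bs(A, B, 'some_val'):
#         for b in a.matching_bs:      # plain list, already filtered
#             print(a.name, b.name, b.c.name)
```

With `to_attr`, the filtered rows are stored as a list on each `a`, so the inner loop should not need any further queries.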