I'm trying to optimize some Django code, and I've got two similar approaches that are performing differently. Here are some example models:

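from django.db import models

class A(models.Model):
    name = models.CharField(max_length=100)

class B(models.Model):
    name = models.CharField(max_length=100)
    a = models.ForeignKey(A, on_delete=models.CASCADE)
    # 'C' is passed as a string because the class is defined further down
    c = models.ForeignKey('C', on_delete=models.CASCADE)

class C(models.Model):
    name = models.CharField(max_length=100)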

For each A object, I'd like to iterate over a subset of its incoming B's, filtered on their c value. Simple:

for a in A.objects.all():
    # the default reverse accessor is lowercase: b_set
    for b in a.b_set.filter(c__name='some_val'):
        print(a.name, b.name, b.c.name)

The problem with this is that there is a new database lookup for every a value iterated over.

It seems that the solution is to prefetch the c values which will feed into the filter.

qs_A = A.objects.all().prefetch_related('b_set__c')

Now consider the following two filter approaches:

# Django filter
for a in qs_A:
    for b in a.b_set.filter(c__name='some_val'):
        print(a.name, b.name, b.c.name)

# Python filter
for a in qs_A:
    for b in filter(lambda b: b.c.name == 'some_val', a.b_set.all()):
        print(a.name, b.name, b.c.name)

With the data I'm using, the django filter makes 48 more SQL queries than the python filter (on a 12-element qs_A result set). This makes me believe that the django filter doesn't make use of the prefetched tables.

Could someone explain what is happening?

Perhaps it's possible to apply the filter during the prefetch?

1 Answer

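Prefetching and filtering are not directly connected. Filtering always happens in your database: calling filter() on the related manager builds a new queryset and runs a new query, so it ignores whatever was prefetched. The main purpose of prefetch_related is to fetch the related objects' data up front, for when you output those related objects or do something similar with them.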
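To make the difference concrete, here is a toy model in plain Python. It is not Django's implementation: `FakeRelatedManager`, `fill_cache`, `prefetch` and the sample rows are all made up for illustration. It only mimics the documented behaviour that all() on a prefetched relation is served from the cache, while filter() always issues a new query.

```python
class FakeRelatedManager:
    """Toy stand-in for a reverse manager such as a.b_set. Not Django code."""

    def __init__(self, db_rows, query_log):
        self._db_rows = db_rows        # rows that live "in the database"
        self._query_log = query_log    # shared list, one entry per simulated query
        self._cache = None             # plays the role of the prefetch cache

    def fill_cache(self):
        # Called by the bulk prefetch below; does not count as a query itself.
        self._cache = list(self._db_rows)

    def all(self):
        if self._cache is not None:
            return list(self._cache)   # served from the cache, no query
        self._query_log.append("SELECT all")
        return list(self._db_rows)

    def filter(self, **conditions):
        # Always a fresh query: the cache is never consulted.
        self._query_log.append("SELECT WHERE %r" % (conditions,))
        return [
            row for row in self._db_rows
            if all(row[key] == value for key, value in conditions.items())
        ]


def prefetch(managers, query_log):
    """Simulate one bulk query that fills every manager's cache."""
    query_log.append("bulk prefetch")
    for manager in managers:
        manager.fill_cache()


query_log = []
managers = {
    "a1": FakeRelatedManager(
        [{"name": "b1", "c_name": "some_val"}, {"name": "b2", "c_name": "other"}],
        query_log,
    ),
    "a2": FakeRelatedManager([{"name": "b3", "c_name": "some_val"}], query_log),
}
prefetch(managers.values(), query_log)

# Python filter: iterate the cached rows, no further queries.
python_result = {
    key: [row["name"] for row in manager.all() if row["c_name"] == "some_val"]
    for key, manager in managers.items()
}
queries_python = len(query_log)

# "Django filter": one extra simulated query per parent object.
filter_result = {
    key: [row["name"] for row in manager.filter(c_name="some_val")]
    for key, manager in managers.items()
}
extra_queries = len(query_log) - queries_python

print(queries_python, extra_queries)
```

Both approaches return the same rows, but the filter() path adds one query per parent object, which matches the pattern of extra queries seen in the question. Later Django versions (1.7 onward) also provide a Prefetch object that accepts a queryset argument, for example Prefetch('b_set', queryset=B.objects.filter(c__name='some_val')), which moves the filter into the prefetch query itself.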

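Fewer SQL queries are usually better, but if you want to optimize your use case, you should do some benchmarking and profiling rather than rely on general statements!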

You can probably make your example more efficient if you don't start from A, but from B instead:

# select_related joins a and c into the same query, so the loop needs no extra lookups
qs = B.objects.select_related('a', 'c').filter(c__name='some_val')
# maybe you need some filtering for a as well:
# qs = qs.filter(a__x=....)
for b in qs:
    print(b.a.name, b.name, b.c.name)

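Maybe you will need some regrouping/sorting after the filtering (in Python), but if you can already do all the filtering in one step, it will be more efficient... otherwise, maybe have a look at raw SQL queries...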
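As a rough sketch of that regrouping step, the snippet below groups a flat list of rows back under their A name using only the standard library. The `Row` tuple, the sample data and `group_by_a` are hypothetical stand-ins for what iterating the B queryset would yield; with real model instances the key would be something like b.a.name.

```python
from collections import namedtuple
from itertools import groupby
from operator import attrgetter

# Hypothetical flat rows, standing in for B objects already filtered on c.
Row = namedtuple("Row", ["a_name", "b_name", "c_name"])

rows = [
    Row("a2", "b3", "some_val"),
    Row("a1", "b1", "some_val"),
    Row("a1", "b4", "some_val"),
]


def group_by_a(rows):
    """Return {a_name: [b_name, ...]} for the given flat rows."""
    # groupby only merges adjacent items, so sort by the grouping key first.
    ordered = sorted(rows, key=attrgetter("a_name"))
    return {
        a_name: [row.b_name for row in group]
        for a_name, group in groupby(ordered, key=attrgetter("a_name"))
    }


grouped = group_by_a(rows)
for a_name, b_names in grouped.items():
    print(a_name, b_names)
```

One difference from looping over A first: an A with no matching B does not appear in the grouped result at all, so this only fits if empty groups can be skipped.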

answered 2013-05-25T22:35:57.693