I must be really misunderstanding something with the GenericRelation field from Django's content types framework.

To create a minimal self-contained example, I will use the polls example app from the tutorial. Add a generic foreign key field to the Choice model, and make a new Thing model:

class Choice(models.Model):
    ...
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    thing = GenericForeignKey('content_type', 'object_id')

class Thing(models.Model):
    choices = GenericRelation(Choice, related_query_name='things')

With a clean db and the tables synced, create a few instances:

>>> poll = Poll.objects.create(question='the question', pk=123)
>>> thing = Thing.objects.create(pk=456)
>>> choice = Choice.objects.create(choice_text='the choice', pk=789, poll=poll, thing=thing)
>>> choice.thing.pk
456
>>> thing.choices.get().pk
789

So far so good - the relation works in both directions from an instance. But from a queryset, the reverse relation is very weird:

>>> Choice.objects.values_list('things', flat=1)
[456]
>>> Thing.objects.values_list('choices', flat=1)
[456]

Why does the reverse relation give me the id of the thing again? I expected the primary key of the choice instead, equivalent to the following result:

>>> Thing.objects.values_list('choices__pk', flat=1)
[789]

Those ORM queries generate SQL like this:

>>> print Thing.objects.values_list('choices__pk', flat=1).query
SELECT "polls_choice"."id" FROM "polls_thing" LEFT OUTER JOIN "polls_choice" ON ( "polls_thing"."id" = "polls_choice"."object_id" AND ("polls_choice"."content_type_id" = 10))
>>> print Thing.objects.values_list('choices', flat=1).query
SELECT "polls_choice"."object_id" FROM "polls_thing" LEFT OUTER JOIN "polls_choice" ON ( "polls_thing"."id" = "polls_choice"."object_id" AND ("polls_choice"."content_type_id" = 10))

The Django docs are generally excellent, but I can't understand why the second query behaves this way, and I can't find any documentation of that behaviour. It seems to return data from the wrong table completely?

2 Answers

TL;DR: This was a bug in Django 1.7 that was fixed in Django 1.8.

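The change went straight to master without a deprecation period, which isn't too surprising, since maintaining backwards compatibility here would have been really difficult. More surprising is that the 1.8 release notes don't mention the issue, even though the fix changes the behaviour of currently working code.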

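The remainder of this answer describes how I found the commit using git bisect run. It's here for my own reference, so I can come back to it if I ever need to bisect a large project again.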


First we set up a django clone and a test project to reproduce the issue. I used virtualenvwrapper here, but you can do the isolation however you wish.

cd /tmp
git clone https://github.com/django/django.git
cd django
git checkout tags/1.7
mkvirtualenv djbisect
export PYTHONPATH=/tmp/django  # get django clone into sys.path
python ./django/bin/django-admin.py startproject djbisect
export PYTHONPATH=$PYTHONPATH:/tmp/django/djbisect  # test project into sys.path
export DJANGO_SETTINGS_MODULE=djbisect.mysettings

Create the following file:

# /tmp/django/djbisect/djbisect/models.py
from django.db import models
from django.contrib.contenttypes.models import ContentType
from django.contrib.contenttypes.fields import GenericForeignKey, GenericRelation

class GFKmodel(models.Model):
    content_type = models.ForeignKey(ContentType)
    object_id = models.PositiveIntegerField()
    gfk = GenericForeignKey()

class GRmodel(models.Model):
    related_gfk = GenericRelation(GFKmodel)

And this one:

# /tmp/django/djbisect/djbisect/mysettings.py
from djbisect.settings import *
INSTALLED_APPS += ('djbisect',)

Now that we have a working project, create test_script.py for use with git bisect run:

#!/usr/bin/env python
import subprocess, os, sys

db_fname = '/tmp/django/djbisect/db.sqlite3'
if os.path.exists(db_fname):
    os.unlink(db_fname)

cmd = 'python /tmp/django/djbisect/manage.py migrate --noinput'
subprocess.check_call(cmd.split())

import django
django.setup()

from django.contrib.contenttypes.models import ContentType
from djbisect.models import GFKmodel, GRmodel

ct = ContentType.objects.get_for_model(GRmodel)
y = GRmodel.objects.create(pk=456)
x = GFKmodel.objects.create(pk=789, content_type=ct, object_id=y.pk)

query1 = GRmodel.objects.values_list('related_gfk', flat=1)
query2 = GRmodel.objects.values_list('related_gfk__pk', flat=1)

print(query1)
print(query2)

print(query1.query)
print(query2.query)

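# Exit codes are inverted on purpose: git bisect run maps 0 to the "old"
# term (unfixed) and a nonzero code such as 1 to the "new" term (fixed).
if query1[0] == 789 == query2[0]:
    print('FIXED')
    sys.exit(1)
else:
    print('UNFIXED')
    sys.exit(0)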

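The script must be executable, so add the flag with chmod +x test_script.py. It should be located in the directory that Django is cloned into, i.e. /tmp/django/test_script.py for me. This is because import django should pick up the locally checked-out django project first, not any version from site-packages.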

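The user interface of git bisect was designed to find out where a bug appeared, so the usual "bad" and "good" prefixes are backwards when you're trying to find out when a bug was fixed. This might seem somewhat upside-down, but the test script should exit successfully (return code 0) if the bug is present, and it should fail (nonzero return code) if the bug is fixed. This tripped me up a few times!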
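
One caveat with this convention: if a step such as migrate fails on some intermediate commit, the uncaught exception makes Python exit with status 1, which git bisect run would read as "fixed". Below is a minimal sketch of one way to guard against that, assuming you are happy to skip such commits with exit code 125 (the code git bisect run reserves for "cannot test, skip"). The helper name bisect_exit_code and its parameters are made up for illustration and are not part of the script above.

```python
import subprocess
import sys

SKIP = 125  # git bisect run treats 125 as "cannot test this commit, skip it"


def bisect_exit_code(setup_cmd, bug_is_present):
    """Return the exit code a `git bisect run` script should use.

    setup_cmd: argv list for a preparatory step (hypothetical, e.g. migrate).
    bug_is_present: zero-argument callable returning True while the bug exists.
    """
    try:
        subprocess.check_call(setup_cmd)
    except (subprocess.CalledProcessError, OSError):
        return SKIP  # setup failed, so this commit says nothing about the bug
    # 0 maps to the old term (unfixed), 1 maps to the new term (fixed)
    return 0 if bug_is_present() else 1


# Usage sketch: a setup step that succeeds, with the bug still present
code = bisect_exit_code([sys.executable, "-c", "pass"], lambda: True)
print(code)
```

In a real script the returned value would be passed to sys.exit(), and the callable would run the two ORM queries and compare their results.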

git bisect start --term-new=fixed --term-old=unfixed
git bisect fixed tags/1.8
git bisect unfixed tags/1.7
git bisect run ./test_script.py

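This process runs an automated search which eventually finds the commit where the bug was fixed. It takes some time, because there were a lot of commits between Django 1.7 and Django 1.8. It bisected 1362 revisions, roughly 10 steps, and eventually output: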
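
As a rough sanity check on the "roughly 10 steps" figure, here is a small sketch that counts how many halvings it takes to narrow 1362 candidates down to one. It is a simplified model (a linear history, always keeping the larger half), not git's actual estimate, and bisect_steps is a name invented for this example.

```python
import math


def bisect_steps(n_revisions):
    """Count halvings needed to narrow n_revisions candidates down to one."""
    steps = 0
    while n_revisions > 1:
        # Worst case: the culprit is always in the larger half
        n_revisions = math.ceil(n_revisions / 2)
        steps += 1
    return steps


worst_case = bisect_steps(1362)
print(worst_case)
print(math.log2(1362))
```

Under these assumptions the worst case is 11 halvings, and log2(1362) lies between 10 and 11, which is consistent with the figure git reported.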

1c5cbf5e5d5b350f4df4aca6431d46c767d3785a is the first fixed commit
commit 1c5cbf5e5d5b350f4df4aca6431d46c767d3785a
Author: Anssi Kääriäinen <akaariai@gmail.com>
Date:   Wed Dec 17 09:47:58 2014 +0200

    Fixed #24002 -- GenericRelation filtering targets related model's pk

    Previously Publisher.objects.filter(book=val) would target
    book.object_id if book is a GenericRelation. This is inconsistent to
    filtering over reverse foreign key relations, where the target is the
    related model's primary key.

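This is precisely the commit where the query changed from the incorrect SQL (which selects the wrong column, object_id, so the result looks like data from the wrong table):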

SELECT "djbisect_gfkmodel"."object_id" FROM "djbisect_grmodel" LEFT OUTER JOIN "djbisect_gfkmodel" ON ( "djbisect_grmodel"."id" = "djbisect_gfkmodel"."object_id" AND ("djbisect_gfkmodel"."content_type_id" = 8) )

to the correct version:

SELECT "djbisect_gfkmodel"."id" FROM "djbisect_grmodel" LEFT OUTER JOIN "djbisect_gfkmodel" ON ( "djbisect_grmodel"."id" = "djbisect_gfkmodel"."object_id" AND ("djbisect_gfkmodel"."content_type_id" = 8) )
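
To see the difference between the two statements without a Django checkout, here is a minimal sketch that runs both against an in-memory SQLite database. The two-table schema is a simplified assumption (only the columns the queries touch), with one row per table matching the pks used in the test script and the content type id 8 from the SQL above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE djbisect_grmodel (id INTEGER PRIMARY KEY);
    CREATE TABLE djbisect_gfkmodel (
        id INTEGER PRIMARY KEY,
        content_type_id INTEGER NOT NULL,
        object_id INTEGER NOT NULL
    );
    INSERT INTO djbisect_grmodel (id) VALUES (456);
    INSERT INTO djbisect_gfkmodel (id, content_type_id, object_id)
        VALUES (789, 8, 456);
""")

# The FROM/JOIN clause is identical in both versions; only the SELECT differs
JOIN = (
    'FROM "djbisect_grmodel" LEFT OUTER JOIN "djbisect_gfkmodel" ON ('
    '"djbisect_grmodel"."id" = "djbisect_gfkmodel"."object_id" '
    'AND ("djbisect_gfkmodel"."content_type_id" = 8))'
)

old_rows = conn.execute('SELECT "djbisect_gfkmodel"."object_id" ' + JOIN).fetchall()
new_rows = conn.execute('SELECT "djbisect_gfkmodel"."id" ' + JOIN).fetchall()

print(old_rows)  # [(456,)] : object_id, which equals the GRmodel pk
print(new_rows)  # [(789,)] : the GFKmodel pk
```

Both queries read from the same joined table. The old one returns object_id, which by the join condition always equals the GRmodel row's own id, and that is why the result looked as if it came from the wrong table.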

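Of course, from the commit hash we can easily find the pull request and the ticket on github. Hopefully this helps someone else one day too - bisecting Django can be tricky to set up because of the migrations!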

answered 2016-10-09T16:25:25.027

Commentary - too late to be an answer - most of it has been deleted

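A minor consequence of the backward-incompatible fix for issue #24002 was that the GenericRelatedObjectManager (e.g. things) stopped working as a queryset for a long time; it could only be used in filters and the like.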

>>> choice.things.all()
TypeError: unhashable type: 'GenericRelatedObjectManager'
# originally before 1c5cbf5e5:  [<Thing: Thing object>]

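It was fixed half a year later by #24940, in version 1.8.3 and in the master branch. The issue was not important, because the generic name thing is easier to use without a query (choice.thing), and it is not clear whether this usage was documented or undocumented.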

Docs: Reverse generic relations

Setting related_query_name creates a relation from the related object back to this one. This allows querying and filtering from the related object.

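It would be nice if a specific relation name could be used instead of only the generic one. With the example from the docs, taged_item.bookmarks is more readable than taged_item.content_object, but it would not be worth the work to implement it.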

answered 2016-10-09T21:01:17.200