google-app-engine - Most efficient way to match a certain number of items in a db.Model ListProperty

Question

class Foo(db.Model): bars = db.ListProperty(db.Key)

class Bar(db.Model): pass

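If I have a certain Foo entity and I want to get all of the other Foo entities that also contain a certain bar Key in their bars ListProperty, I would use the following query: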

related_foos = Foo.all().filter('bars', bar_entity).fetch(fetch_count)

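What if I want to find all other entities of model type Foo that have at least N matching bar entities? The obvious way of doing this with a for loop would be extremely inefficient, and it might be best to change the model itself to make this easier, but it is not obvious how to do that.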

score 2 · Accepted Answer

You can simply apply the same filter repeatedly:

related_foos = Foo.all().filter('bars', bar_entity).filter('bars', bar_entity_2).fetch(fetch_count)

Or, in a data-driven fashion:

q = Foo.all()
for bar in bar_entities:
  q.filter('bars', bar)
related_foos = q.fetch(fetch_count)

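If you don't apply any inequality filters or sort orders to the query, the datastore can execute it using the built-in indexes and a merge join strategy, no matter how many filters you apply. If you do need an inequality or a sort order, however, you will need an index for every number of bars you might filter on, which leads to exploding indexes (so it is best avoided!).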
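To make the merge join idea concrete, here is a simplified pure-Python illustration, not the datastore's actual implementation. It assumes each equality filter corresponds to a sorted list of entity keys in an index and intersects those lists; the `index` contents and the `merge_join` name are hypothetical.

```python
def merge_join(posting_lists):
    """Intersect sorted lists of entity keys, one list per equality filter."""
    if not posting_lists:
        return []
    positions = [0] * len(posting_lists)
    matches = []
    # Stop as soon as any one list is exhausted: no further match is possible.
    while all(pos < len(lst) for pos, lst in zip(positions, posting_lists)):
        current = [lst[pos] for pos, lst in zip(positions, posting_lists)]
        highest = max(current)
        if all(key == highest for key in current):
            # Every filter agrees on this key, so the entity matches them all.
            matches.append(highest)
            positions = [pos + 1 for pos in positions]
        else:
            # Advance only the lists that are behind the highest candidate.
            positions = [pos + 1 if key < highest else pos
                         for pos, key in zip(positions, current)]
    return matches


# Hypothetical index contents: bar value -> sorted keys of Foo entities listing it.
index = {
    "bar_a": ["foo1", "foo2", "foo5"],
    "bar_b": ["foo2", "foo3", "foo5"],
    "bar_c": ["foo2", "foo5", "foo9"],
}
result = merge_join([index["bar_a"], index["bar_b"], index["bar_c"]])
print(result)  # ['foo2', 'foo5']
```

Note that repeated equality filters are combined with AND, so only entities containing every listed bar are returned.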

score 1 · Accepted Answer

Given a foo record that has 10 bar_entities, looking for all foo records that have at least 2 of those 10 entities results in 45 possible pairs of equality values: 10!/(2!*(10-2)!) = 45.

This can be resolved in 10_C_(2-1) = 10 reads.

SELECT * from table WHERE bar="1" AND bar in ["2", "3", "4", "5", "6", "7", "8", "9", "0"]
SELECT * from table WHERE bar="2" AND bar in ["3", "4", "5", "6", "7", "8", "9", "0"]
SELECT * from table WHERE bar="3" AND bar in ["4", "5", "6", "7", "8", "9", "0"]
etc.
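As a rough check of the query pattern above, the following sketch enumerates one "pivot plus remaining bars" pair per query and confirms that together they cover every 2-combination. The `pivot_queries` helper is hypothetical and only models the query plan; it does not talk to the datastore. The final bar has nothing left to pair with, so this sketch emits one fewer query than there are bars.

```python
from itertools import combinations


def pivot_queries(bars):
    """Return (pivot, remaining) pairs, one per query in the pattern above."""
    plans = []
    for i, pivot in enumerate(bars):
        remaining = bars[i + 1:]
        if remaining:  # the last bar has nothing left to pair with
            plans.append((pivot, remaining))
    return plans


bars = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "0"]
plans = pivot_queries(bars)

# Every unordered pair reachable through some query's pivot and IN list.
covered = {frozenset((pivot, other)) for pivot, rest in plans for other in rest}
expected = {frozenset(pair) for pair in combinations(bars, 2)}

print(len(plans), len(covered), covered == expected)  # 9 45 True
```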

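To reduce this to a single read, you would need to populate a separate table whenever a foo record is added, holding every 2-combination of that record's bars.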

Say you had

foo_table
foo1 [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
foo2 [1, 3, 4]
foo3 [1, 2, a]
foo4 [b, 6, c]

foo_combo_2_table
Parent  Combination
foo1    12
foo1    13
... and all 45 foo1 combinations each in its own row
foo2    13
foo2    14
foo2    34
foo3    12
foo3    1a
foo3    2a
etc.
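The rows of such a combination table could be generated along these lines. This is a minimal sketch: `combo_rows` is a hypothetical helper, and the plain string concatenation simply mirrors the table above.

```python
from itertools import combinations


def combo_rows(parent, bars, n=2):
    """Return (parent, combination) rows for every n-combination of bars."""
    # Sort and de-duplicate so the same set of bars always yields the same strings.
    ordered = sorted(set(bars))
    # NOTE: plain concatenation mirrors the table above; with multi-character
    # keys a separator would be needed to keep combinations unambiguous.
    return [(parent, "".join(combo)) for combo in combinations(ordered, n)]


rows = combo_rows("foo2", ["1", "3", "4"])
print(rows)  # [('foo2', '13'), ('foo2', '14'), ('foo2', '34')]
```

Sorting before combining matters: the writer and the reader must both produce "13" rather than "31" for the same pair, otherwise lookups will miss.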

Now you can do a single lookup:

indexes = SELECT __KEY__ from foo_combo_2_table WHERE combination IN [12, 13, 14, 15, ... all 45]
keys = [k.parent() for k in indexes] # you would need to filter for duplicates
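The lookup and de-duplication step can be simulated in memory as follows. This is only a sketch using the sample data above: a dict stands in for foo_combo_2_table, and `pair_keys` and `related` are hypothetical names, not datastore APIs.

```python
from itertools import combinations


def pair_keys(bars):
    """Canonical 2-combination strings for a list of bars."""
    return {"".join(pair) for pair in combinations(sorted(set(bars)), 2)}


foo_table = {
    "foo1": ["1", "2", "3", "4", "5", "6", "7", "8", "9", "0"],
    "foo2": ["1", "3", "4"],
    "foo3": ["1", "2", "a"],
    "foo4": ["b", "6", "c"],
}

# Stand-in for foo_combo_2_table: combination -> set of parent names.
combo_index = {}
for parent, bars in foo_table.items():
    for key in pair_keys(bars):
        combo_index.setdefault(key, set()).add(parent)


def related(parent):
    """Other foos sharing at least 2 bars with the given foo."""
    found = set()  # a set removes duplicate parents automatically
    for key in pair_keys(foo_table[parent]):
        found |= combo_index.get(key, set())
    found.discard(parent)  # the foo always matches itself
    return found


print(sorted(related("foo1")))  # ['foo2', 'foo3']
```

Here foo4 shares only the single bar 6 with foo1, so it has no pair in common and is correctly left out.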

This way you won't run into any exploding index problems.

If you also want to match on any 3 or any 4 entities, you would need to create a foo_combo_n_table for that n or perform 10_C_(n-1) reads.
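To get a feel for the trade-off, the arithmetic for a foo with 10 bars can be worked out as below. This is a small sketch that assumes Python 3.8+ for `math.comb`; the variable names are illustrative only.

```python
from math import comb

bar_count = 10

# Rows stored per foo in a foo_combo_n_table versus reads needed without one.
rows_per_foo = {n: comb(bar_count, n) for n in (2, 3, 4)}
reads_without_table = {n: comb(bar_count, n - 1) for n in (2, 3, 4)}

for n in (2, 3, 4):
    print(n, rows_per_foo[n], reads_without_table[n])
```

Both numbers grow quickly with n, so the combination table trades extra storage and write work for a cheaper read.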
