titan - Index is not used with order().by() in Titan

Question

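Mixed indexes support ordering natively and efficiently. However, the property key used in the order().by() method must have been previously added to the mixed index for native result ordering support. This is important in cases where the order().by() key is different from the query keys. If the property key is not part of the index, then sorting requires loading all results into memory.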

So I created a mixed index on the prop1 property. When I specify a value, the mixed index on prop1 works well.

gremlin> g.V().has('prop1', gt(1)) /* this gremlin uses the mixed index */
==>v[6017120]
==>v[4907104]
==>v[8667232]
==>v[3854400]
...

However, when I use order().by() on prop1, the mixed index is not used.

gremlin> g.V().order().by('prop1', incr) /* doesn't use the mixed index */
17:46:00 WARN  com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx  - Query requires iterating over all vertices [()]. For better performance, use indexes
Could not execute query since pre-sorting requires fetching more than 1000000 elements. Consider rewriting the query to exploit sort orders

count() also takes a long time.

gremlin> g.V().has('prop1').count()
17:44:47 WARN  com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx  - Query requires iterating over all vertices [()]. For better performance, use indexes

I would be glad to know what I am doing wrong. Here is my Titan setup:

Titan version: 1.0.0-hadoop1
Storage backend: Cassandra 2.1.1
Index backend: ElasticSearch 1.7

Thank you.

score 4 · Accepted Answer

You have to provide a value to filter on for the index to be used. Here:

g.V().order().by('prop1', incr)

you have not provided any filter, so Titan has to iterate over all of V() and then apply the sort.

Here:

g.V().has('prop1').count()

you have provided an indexed key but have not specified a value to filter on, so it is still iterating over all of V(). You could do this:

g.V().has("prop1", textRegex(".*")).count()

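In this case you are tricking Titan a bit into using the index, but the query can still be slow if it returns a large number of results to iterate over.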
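
The same idea can be sketched for the ordering case: give Titan a filter the mixed index can answer, then order by the indexed key. This is only a sketch and has not been run against the setup in the question. It assumes prop1 is a numeric key covered by the mixed index, as the gt(1) query in the question suggests. Since textRegex is a text predicate, a numeric predicate is used here instead. The limit(10) is purely illustrative.

```groovy
// Assumption: 'prop1' is numeric and part of the mixed index, as in the question.
// The has() condition gives Titan a filter it can send to the index backend,
// so the sort on the same indexed key may be handled there as well.
g.V().has('prop1', gt(1)).order().by('prop1', incr).limit(10)
```

If the "iterating over all vertices" warning still appears with a query like this, the index is probably not being picked up for that key at all, which would be a separate problem from the ordering.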
