amazon-dynamodb - Titan - Strange behavior of has() with a mixed index

Question

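I have a Titan graph with an ES index backend and DynamoDB for persistence.

The step has("mykey", "value") never retrieves the vertex. When I query mykey, a property indexed by Elasticsearch, it always returns nothing. The index is up to date and enabled.

When I run these queries: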

gremlin>  graph.indexQuery("verticesIndex2", "v.mykey:myvalue").vertices().asList().size()
==>1  // It works here!! The vertex is retrieved successfully.
gremlin> g.V().has("mykey", "myvalue").hasNext()
==>false // doesn't retrieve anything!!!
gremlin> g.V(16998408).values("mykey")
==>myvalue // the vertex exists in my graph for sure !!

I tried a trick to make it work:

gremlin> g.V().has("mykey").has("mykey", "myvalue").next() 
19:49:44 WARN  com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx  - Query requires iterating over all vertices [()]. For better performance, use indexes
==>v[16998408] // It works !!

This looks like a bug somewhere, but I am not sure exactly where. Any ideas?

score 0 · Accepted Answer

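I had a similar problem with a Lucene index, including the same symptoms around index usage.

Note that the query that retrieves nothing also does not complain about a missing index, whereas the query that works complains that it has to iterate over all vertices.

I suspect the index itself is failing: the bare has("...") step forces a non-indexed search first, so that query succeeds, but whenever the index is used for the lookup, it fails.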
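One way to test that suspicion is to look at the index through the management API. The following is a minimal sketch for the Gremlin console, assuming Titan 1.0 and the index and key names from the question (verticesIndex2, mykey); it only reads schema state and changes nothing.

```groovy
// Sketch only: assumes `graph` is an open TitanGraph, as in the question's console session.
mgmt = graph.openManagement()

index = mgmt.getGraphIndex('verticesIndex2')   // index name taken from the question
key   = mgmt.getPropertyKey('mykey')           // property key taken from the question

index.isMixedIndex()            // true if the index is backed by ES/Lucene
index.getBackingIndex()         // name of the configured index backend
index.getIndexStatus(key)       // per-key status, e.g. INSTALLED, REGISTERED or ENABLED
index.getParametersFor(key)     // mapping parameters the key was added with, if any

// Nothing was modified, so discard the management transaction.
mgmt.rollback()
```

If the status for the key is not ENABLED, or the key was added with a mapping that does not support equality matching, that could explain why the indexed lookup and the full scan disagree, though this is only one possible cause.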

score 0 · Accepted Answer

I am using ES with HBase and I have the same problem.

When I build a mixed index on a String property with ES and run a query like

g.V().has("mykey", "myvalue").hasNext()

it warns me that no index is being used, and the query is slow.

But when I build a mixed index on an Integer property with ES and query like this

g.V().has("myInt", "myIntValue").hasNext()

it gives no warning, and the query is quite fast.

So for now I use a composite index for String properties to avoid this.
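For reference, the composite-index workaround might look roughly like the sketch below in the Gremlin console. It assumes Titan 1.0 and an existing mykey property key; the index name mykeyComposite is hypothetical and chosen only for illustration.

```groovy
// Sketch only: assumes `graph` is an open TitanGraph and 'mykey' already exists.
import com.thinkaurelius.titan.graphdb.database.management.ManagementSystem

// 1. Define a composite index on the String key ('mykeyComposite' is a made-up name).
mgmt = graph.openManagement()
key  = mgmt.getPropertyKey('mykey')
mgmt.buildIndex('mykeyComposite', Vertex.class).addKey(key).buildCompositeIndex()
mgmt.commit()

// 2. Block until the new index has been registered across the cluster.
ManagementSystem.awaitGraphIndexStatus(graph, 'mykeyComposite').call()

// 3. Reindex so that vertices written before the index existed are covered.
mgmt = graph.openManagement()
mgmt.updateIndex(mgmt.getGraphIndex('mykeyComposite'), SchemaAction.REINDEX).get()
mgmt.commit()

// 4. Exact-match lookups on 'mykey' should now be answered by the composite index.
g.V().has('mykey', 'myvalue').hasNext()
```

Composite indexes live in the storage backend rather than in ES and only support equality lookups, which is all that has("mykey", "myvalue") needs.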
