database - Titan index update takes too long

Question

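Creating an index in Titan 1.0 takes several minutes, even on an empty database. The duration is always almost exactly the same, which suggests a fixed, unnecessary delay (a timeout) rather than real work.

My question is: how can I shorten or eliminate the time Titan needs to reindex? Conceptually, since no work is being done, the time should be minimal, and certainly not four minutes.

(Note: I was previously pointed to a solution that simply makes Titan wait out the full delay without timing out. That is the wrong solution; I want to eliminate the delay entirely.)

The code I use to set up the database from scratch is: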

graph = ... a local cassandra instance ...
graph.tx().rollback()

// 1. Check if the index already exists
mgmt = graph.openManagement()
i = mgmt.getGraphIndex('byIdent')
if(! i) {
  // 1a. If the index does not exist, add it
  idKey = mgmt.getPropertyKey('ident')
  idKey = idKey ? idKey : mgmt.makePropertyKey('ident').dataType(String.class).make()
  mgmt.buildIndex('byIdent', Vertex.class).addKey(idKey).buildCompositeIndex()
  mgmt.commit()
  graph.tx().commit()

  mgmt  = graph.openManagement()
  idKey = mgmt.getPropertyKey('ident')
  idx   = mgmt.getGraphIndex('byIdent')
  // 1b. Wait for index availability
  if ( idx.getIndexStatus(idKey).equals(SchemaStatus.INSTALLED) ) {
    mgmt.awaitGraphIndexStatus(graph, 'byIdent').status(SchemaStatus.REGISTERED).call()
  }
  // 1c. Now reindex, even though the DB is usually empty.
  mgmt.updateIndex(mgmt.getGraphIndex('byIdent'), SchemaAction.REINDEX).get()
  mgmt.commit()
  mgmt.awaitGraphIndexStatus(graph, 'byIdent').status(SchemaStatus.ENABLED).call()
} else { mgmt.commit() }

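It appears to be the updateIndex...REINDEX call that blocks until it times out. Is this a known issue, or one that cannot be fixed? Am I doing something wrong?

Edit: Disabling the REINDEX, as discussed in the comments, is not actually a fix, because the index never seems to become active. I now see: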

WARN  com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx  - Query requires iterating over all vertices [(myindexedkey = somevalue)]. For better performance, use indexes

score 3 · Accepted Answer

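The delay is (was) entirely unnecessary and was caused by my misuse of Titan, although the pattern does appear in chapter 28 of the Titan 1.0.0 documentation.

Do not block inside a transaction!

Instead of: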

  mgmt  = graph.openManagement()
  idKey = mgmt.getPropertyKey('ident')
  idx   = mgmt.getGraphIndex('byIdent')
  // 1b. Wait for index availability
  if ( idx.getIndexStatus(idKey).equals(SchemaStatus.INSTALLED) ) {
    mgmt.awaitGraphIndexStatus(graph, 'byIdent').status(SchemaStatus.REGISTERED).call()
  }

Consider committing the management transaction before waiting:

  mgmt  = graph.openManagement()
  idKey = mgmt.getPropertyKey('ident')
  idx   = mgmt.getGraphIndex('byIdent')
  // Wait for index availability
  if ( idx.getIndexStatus(idKey).equals(SchemaStatus.INSTALLED) ) {
    mgmt.commit()
    mgmt.awaitGraphIndexStatus(graph, 'byIdent').status(SchemaStatus.REGISTERED).call()
  } else { mgmt.commit() }

Use ENABLE_INDEX

Not: mgmt.updateIndex(mgmt.getGraphIndex('byIdent'), SchemaAction.REINDEX).get()

But rather: mgmt.updateIndex(mgmt.getGraphIndex('byIdent'), SchemaAction.ENABLE_INDEX).get()
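
Putting both changes together, the whole setup might look roughly like the sketch below. It is untested against a live cluster and rests on a few assumptions: `graph` is an already-opened TitanGraph, the script runs in the Gremlin console or a Groovy script with the Titan 1.0 jars on the classpath, and no other transactions are open on the graph. The explicit status checks around ENABLE_INDEX and the static-call form of awaitGraphIndexStatus are my own additions for clarity, not part of the steps above.

```groovy
import com.thinkaurelius.titan.core.schema.SchemaAction
import com.thinkaurelius.titan.core.schema.SchemaStatus
import com.thinkaurelius.titan.graphdb.database.management.ManagementSystem
import org.apache.tinkerpop.gremlin.structure.Vertex

// Assumption: `graph` is an already-opened TitanGraph.
// Close any implicit transaction so it cannot hold up index registration.
graph.tx().rollback()

mgmt = graph.openManagement()
if (mgmt.getGraphIndex('byIdent')) {
  // Index already defined: nothing to do.
  mgmt.commit()
} else {
  // Define the key (if missing) and the composite index, then commit.
  idKey = mgmt.getPropertyKey('ident') ?:
          mgmt.makePropertyKey('ident').dataType(String.class).make()
  mgmt.buildIndex('byIdent', Vertex.class).addKey(idKey).buildCompositeIndex()
  mgmt.commit()

  // Read the status, then commit BEFORE waiting, so that no management
  // transaction is left open while we block.
  mgmt = graph.openManagement()
  status = mgmt.getGraphIndex('byIdent').getIndexStatus(mgmt.getPropertyKey('ident'))
  mgmt.commit()
  if (status == SchemaStatus.INSTALLED) {
    ManagementSystem.awaitGraphIndexStatus(graph, 'byIdent')
                    .status(SchemaStatus.REGISTERED).call()
  }

  // Enable the index instead of reindexing; the graph holds no data to index.
  mgmt = graph.openManagement()
  idx = mgmt.getGraphIndex('byIdent')
  if (idx.getIndexStatus(mgmt.getPropertyKey('ident')) == SchemaStatus.REGISTERED) {
    mgmt.updateIndex(idx, SchemaAction.ENABLE_INDEX).get()
  }
  mgmt.commit()

  ManagementSystem.awaitGraphIndexStatus(graph, 'byIdent')
                  .status(SchemaStatus.ENABLED).call()
}
```

Note that ENABLE_INDEX skips the reindex job, so this only makes sense when the indexed key has no pre-existing data, as in the empty-database case described in the question.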
