python - py2neo: dependent batch insertion

Question

I use py2neo (v 1.9.2) to write data to a neo4j database.

from py2neo import neo4j

# graph_db is assumed to be an existing neo4j.GraphDatabaseService instance
batch = neo4j.WriteBatch(graph_db)
current_relationship_index = graph_db.get_or_create_index(neo4j.Relationship, "Current_Relationship")
touched_relationship_index = graph_db.get_or_create_index(neo4j.Relationship, "Touched_Relationship")
# This lookup runs immediately, outside the batch
get_rel = current_relationship_index.get(some_key1, some_value1)
if len(get_rel) == 1:
    # Relationship already exists: queue it for the "touched" index
    batch.add_indexed_relationship(touched_relationship_index, some_key2, some_value2, get_rel[0])
elif len(get_rel) == 0:
    # Relationship is missing: create it immediately, then queue it for the "touched" index
    created_rel = current_relationship_index.create(some_key3, some_value3, (my_start_node, "KNOWS", my_end_node))
    batch.add_indexed_relationship(touched_relationship_index, some_key4, "touched", created_rel)
batch.submit()

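Is there a way to replace current_relationship_index.get(..) and current_relationship_index.create(...) with batch commands? I know there is one, but the problem is that I need to act on the return values of these commands. For performance reasons, I would like to put all statements into a single batch.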

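I have read that indexing relationships is rather uncommon, but my reason for doing it is this: I need to parse some (text) files every day and then check whether any relationship has changed since the previous day. That is, if a relationship no longer exists in the text file, I want to mark it with a "replaced" property in neo4j. So I add every "touched" relationship to the appropriate index, which tells me those have not changed. All relationships that are not in touched_relationship_index evidently no longer exist, so I can mark them.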
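The marking step described above amounts to a set difference. Here is a minimal sketch in plain Python, independent of py2neo; the tuples and the find_stale helper are hypothetical stand-ins for the stored relationships and the index lookup:

```python
def find_stale(stored, touched):
    """Return relationships that are stored but were not touched in today's run."""
    return set(stored) - set(touched)


# Hypothetical data: each relationship is a (start, type, end) tuple.
stored = {("alice", "KNOWS", "bob"), ("bob", "KNOWS", "carol")}
touched = {("alice", "KNOWS", "bob")}

# Everything left over would be marked with the "replaced" property.
stale = find_stale(stored, touched)
```

In the real setup, "stored" would come from the database and "touched" from touched_relationship_index.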

I can't think of a simpler way to do this, even though I'm sure py2neo offers one.

EDIT: Taking Nigel's comment into account, I tried this:

my_rel = batch.get_or_create_indexed_relationship(current_relationship_index, some_key, some_value, my_start_node, my_type, my_end_node)
batch.add_indexed_relationship(touched_relationship_index, some_key2, some_value2, my_rel)
batch.submit()

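This obviously doesn't work, because I can't refer to "my_rel" inside the batch. How can I solve this? Should I refer to the result of the previous batch statement with "0"? But keep in mind that the whole thing is supposed to run in a loop, so the number is not fixed. Maybe I could use some variable "batch_counter" that refers to the current batch statement and is incremented whenever a statement is added to the batch?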
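To make the counter idea concrete, here is a rough sketch using a hypothetical CountingBatch class. It is not part of py2neo, and the "{n}" placeholder is only an illustrative notation for "result of job n"; whether the real batch accepts such references is exactly the open question:

```python
class CountingBatch:
    """Hypothetical stand-in for a write batch; not the py2neo API."""

    def __init__(self):
        self.jobs = []

    def append(self, description, *args):
        """Queue a job and return its position so later jobs can refer to it."""
        job_id = len(self.jobs)
        self.jobs.append((description, args))
        return job_id


def build_batch(pairs):
    """Queue a get-or-create job and a dependent 'touched' job for each pair."""
    batch = CountingBatch()
    for start, end in pairs:
        rel_id = batch.append("get_or_create", start, "KNOWS", end)
        # Refer to the earlier job by its position instead of a fixed number.
        batch.append("mark_touched", "{%d}" % rel_id)
    return batch


batch = build_batch([("a", "b"), ("c", "d")])
```

Because append returns the job's position, the loop never needs a hard-coded "0"; each dependent job picks up whatever index its predecessor received.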

score 0 · Accepted Answer

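Have a look at WriteBatch.get_or_create_indexed_relationship. It conditionally creates a relationship depending on whether one already exists, and it operates atomically. Documentation link below: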

http://book.py2neo.org/en/latest/batches/#py2neo.neo4j.WriteBatch.get_or_create_indexed_relationship
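As a rough illustration of what "get or create" means, here is a sketch that uses a plain dict as a hypothetical stand-in for the index. It does not call py2neo and does not model the atomicity the server provides:

```python
def get_or_create(index, key, value, factory):
    """Return the entry stored under (key, value), creating it only if absent."""
    entry_key = (key, value)
    if entry_key not in index:
        index[entry_key] = factory()
    return index[entry_key]


index = {}
# The first call creates the entry; the second call returns the same object.
first = get_or_create(index, "pair", "a-b", lambda: {"type": "KNOWS"})
second = get_or_create(index, "pair", "a-b", lambda: {"type": "KNOWS"})
```

The caller never has to branch on whether the relationship existed, which is what removes the need for the separate get and create calls.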

There are a few similar uniqueness management facilities in py2neo that I recently blogged about; you might want to read up on them.
