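I have a large number of contacts and relationships to insert (millions). To speed things up, I thought I would batch them and have multiple threads insert the batches concurrently. This causes some deadlocks, but since I can retry them, that is not a problem for me.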

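    // Inserts one batch of rows in a single transaction.
    // On a deadlock the whole batch is retried, up to 3 attempts in total.
    public void doBatch(final Collection<Object> rows) throws Exception {
        int retryCount = 3;
        while (retryCount > 0) {
            // Each attempt gets its own fresh transaction.
            Transaction tx = graphdb.beginTx();
            try {
                for (Object row : rows) {
                    String[] fields = ((String) row).split(DELIMITER, -1);
                    if (fields.length < 4) {
                        log.error("Not enough fields to process row:" + row);
                    } else {
                        addLineToGraph(fields[0], fields[1], fields[2], fields[3]);
                    }
                }
                tx.success();
                retryCount = 0; // batch committed, leave the loop
            } catch (DeadlockDetectedException dead) {
                tx.failure();
                retryCount--; // one attempt used up; loop again if any remain
                log.warn("Retry deadlock");
            } catch (Exception e) {
                tx.failure();
                throw e;
            } finally {
                // Always close the transaction, whether it succeeded or failed.
                tx.finish();
            }
        }
    }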
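
For reference, a bounded retry on deadlock of the kind described above can be sketched independently of Neo4j. This is only a minimal sketch under assumptions: `DeadlockRetry`, `DeadlockException`, `withRetry`, `maxAttempts` and `backoffMillis` are hypothetical names made up for illustration (the exception stands in for whatever the database throws), and the linear back-off between attempts is my own addition, not something from the code above.

```java
import java.util.concurrent.Callable;

class DeadlockRetry {

    /** Hypothetical stand-in for a database's deadlock exception. */
    public static class DeadlockException extends RuntimeException {
        public DeadlockException(String message) {
            super(message);
        }
    }

    /**
     * Runs the unit of work and retries it only when it fails with a
     * DeadlockException. maxAttempts includes the first try. Any other
     * exception is propagated immediately without a retry.
     */
    public static <T> T withRetry(Callable<T> work, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        DeadlockException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.call();
            } catch (DeadlockException dead) {
                last = dead;
                if (attempt < maxAttempts) {
                    // Wait a little longer after each failed attempt so that
                    // competing threads are less likely to collide again.
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        // All attempts ended in a deadlock: rethrow the last one.
        throw last;
    }
}
```

With a helper like this, the `Callable` would open a fresh transaction, insert the batch, mark it successful and close the transaction in a `finally` block, so that every attempt starts from a clean transaction.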

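Unfortunately, after running for a few hours with a lot of deadlocks, I run out of memory (GC overhead limit exceeded), even with a 10 GB heap. After analyzing the heap dump I noticed a great many locks: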

One instance of "org.neo4j.kernel.impl.transaction.RWLock" loaded by "sun.misc.Launcher$AppClassLoader @ 0xc0271350" occupies 672.139.928 (84,78%) bytes.
The memory is accumulated in one instance of "java.util.HashMap$Entry[]" loaded by "<system class loader>".

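My impression is that this is caused by failed transactions not releasing their locks, so I restricted my code to a single thread, which guarantees that no more deadlocks occur. With that change I see the normal sawtooth graph caused by garbage collection and no more out-of-memory errors. As I understand it, tx.finish() should clean everything up? Or am I missing something here?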

I'm using Neo4j 2.0.0-M03 in embedded mode.

2 Answers

How about taking a lock whenever you update any property of a node or relationship, and then releasing the lock afterwards?

answered 2013-09-10T07:34:00.033

I upgraded to 2.0.0-M05 and now I get different behavior: a NullPointerException in the PersistenceWindowPool class. At least at the moment, this class is not fully thread safe. I was told it would be resolved in 2.0, but until then I'm using my own synchronized version of this class.

https://github.com/bennies/neo4j/commit/d8a0f4732f347f2038ebace83c14d37d4b1f8691

Thanks for all the ideas for alternative solutions :)

answered 2013-10-18T13:35:12.963