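I am currently migrating some old Neo4j code to the new Neo4j 2.0.0 beta. I think the new schema indexes are a nice feature in many cases, so I would like to change my code to use them wherever possible. Before doing so, I wanted to make sure performance would not get worse, so I wrote a small test. Surprisingly, for lookups the schema index consistently performs worse than the legacy index. Before drawing conclusions, I want to share my test so that you can tell me whether I am doing something wrong, or whether the results are simply due to the simplicity of the test case or something similar. You can also try it yourself and confirm or refute my observations. As things stand, I would rather stick with the legacy indexes.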

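My code is below. I test by commenting out the calls for one type of index (legacy or schema) and then running the whole thing a few times. I tried various values of N, ranging from the 1000 shown here up to 60000, always with the same relative result: the legacy index performs lookups significantly faster. My use case is a large number of nodes, each with a unique ID, where I need to look up a whole range of nodes as quickly as possible and have only the node IDs.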

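My question is: are the legacy indexes really faster, and should I stick with them if lookup speed is a major concern for me? Or am I doing something wrong? Or is this a known issue that will be addressed during the beta and fixed in the release? Thanks!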

import java.io.File;
import java.io.IOException;
import java.util.concurrent.TimeUnit;

import org.apache.commons.io.FileUtils;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Label;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.ResourceIterator;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;
import org.neo4j.graphdb.index.Index;
import org.neo4j.graphdb.schema.IndexDefinition;
import org.neo4j.graphdb.schema.Schema;
import org.neo4j.tooling.GlobalGraphOperations;

enum labels implements Label {
    term
}

public class Neo4jIndexPerformanceTest {
    private static int N = 1000;

    public static void main(String[] args) throws IOException {
        FileUtils.deleteDirectory(new File("tmp/graph.db"));
        GraphDatabaseService graphDb = new GraphDatabaseFactory().newEmbeddedDatabase("tmp/graph.db");
        try (Transaction tx = graphDb.beginTx()) {
            int i = 0;
            for (Node n : GlobalGraphOperations.at(graphDb).getAllNodes())
                i++;
            System.out.println("Number of nodes: " + i);
        }
        // Enable exactly one create/search pair per run (legacy or schema).
//      createLegacyIndex(graphDb);
//      searchLegacyIndex(graphDb);
        createSchemaIndex(graphDb);
        searchSchemaIndex(graphDb);
        graphDb.shutdown();
    }

    private static void searchSchemaIndex(GraphDatabaseService graphDb) {
        try (Transaction tx = graphDb.beginTx()) {
            IndexDefinition index = graphDb.schema().getIndexes(labels.term).iterator().next();
            graphDb.schema().awaitIndexOnline(index, 10, TimeUnit.SECONDS);
        }
        long time = System.currentTimeMillis();
        try (Transaction tx = graphDb.beginTx()) {
            for (int i = 0; i < N; i++) {
                ResourceIterator<Node> iterator = graphDb.findNodesByLabelAndProperty(labels.term, "id", "schema:" + i).iterator();
                if (iterator.hasNext()) {
                    Node n = iterator.next();
                } 
                iterator.close();
            }
        }
        time = System.currentTimeMillis() - time;
        System.out.println("Searching schema index took: " + time + " ms");
    }

    private static void searchLegacyIndex(GraphDatabaseService graphDb) {
        long time = System.currentTimeMillis();
        try (Transaction tx = graphDb.beginTx()) {
            Index<Node> index = graphDb.index().forNodes("terms");
            for (int i = 0; i < N; i++) {
                ResourceIterator<Node> iterator = index.get("id", "legacy:" + i).iterator();
                if (iterator.hasNext()) {
                    Node single = iterator.next();
                }
                iterator.close();
                // if (single == null)
                // throw new IllegalStateException();
            }
        }
        time = System.currentTimeMillis() - time;
        System.out.println("Searching legacy index took: " + time + " ms");

    }

    private static void createSchemaIndex(GraphDatabaseService graphDb) {
        Schema schema = null;
        try (Transaction tx = graphDb.beginTx()) {
            schema = graphDb.schema();
            boolean e = false;
            for (IndexDefinition id : graphDb.schema().getIndexes()) {
                e = true;
            }
            if (!e)
                schema.indexFor(labels.term).on("id").create();
            tx.success();
        }
        try (Transaction tx = graphDb.beginTx()) {
            long time = System.currentTimeMillis();

            for (int i = 0; i < N; i++) {
                Node n = graphDb.createNode(labels.term);
                n.setProperty("id", "schema:" + i);
            }

            time = System.currentTimeMillis() - time;
            schema.awaitIndexesOnline(10, TimeUnit.SECONDS);
            tx.success();
            System.out.println("Creating schema index took: " + time + " ms");
        }
    }

    private static void createLegacyIndex(GraphDatabaseService graphDb) {
        try (Transaction tx = graphDb.beginTx()) {
            Index<Node> index = graphDb.index().forNodes("terms");

            long time = System.currentTimeMillis();

            for (int i = 0; i < N; i++) {
                Node n = graphDb.createNode(labels.term);
                n.setProperty("id", "legacy:" + i);
                index.add(n, "id", n.getProperty("id"));
            }

            time = System.currentTimeMillis() - time;
            tx.success();
            System.out.println("Creating legacy index took: " + time + " ms");
        }
    }
}
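
The test above times a single pass of N lookups with System.currentTimeMillis(), so JIT warm-up and cold caches can influence the numbers. As a rough sketch only (not part of the original test, and assuming Java 8 or later), the timing could be separated from the lookups and repeated, with a few untimed warm-up passes first. The class name LookupTimer, its methods, and the HashMap stand-in workload are all hypothetical; in a real run the lambda would wrap the legacy or schema lookup inside a transaction.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntConsumer;

class LookupTimer {

    /** Returns the median of the samples (the upper middle value for even counts). */
    public static long median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    /**
     * Calls lookup for keys 0..n-1 in each pass: first `warmups` untimed passes,
     * then `runs` timed passes. Returns the elapsed nanoseconds of each timed pass.
     */
    public static long[] timePasses(int n, int warmups, int runs, IntConsumer lookup) {
        for (int w = 0; w < warmups; w++) {
            for (int i = 0; i < n; i++) {
                lookup.accept(i);
            }
        }
        long[] elapsed = new long[runs];
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            for (int i = 0; i < n; i++) {
                lookup.accept(i);
            }
            elapsed[r] = System.nanoTime() - start;
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 1000;

        // Stand-in workload (assumption): a plain HashMap instead of a Neo4j index.
        Map<String, Integer> fakeIndex = new HashMap<>();
        for (int i = 0; i < n; i++) {
            fakeIndex.put("id:" + i, i);
        }

        long[] elapsed = timePasses(n, 5, 11, i -> fakeIndex.get("id:" + i));
        System.out.println("Median pass time: " + median(elapsed) / 1000 + " us");
    }
}
```

Reporting the median of several passes makes a single slow outlier less likely to decide the comparison. It does not by itself explain a consistent gap between the two index types.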

1 Answer

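I tried your code, and the schema index implementation is indeed not as fast as the legacy one. But I found the cause: it is a simple bug in the implementation around the index, not in the index itself. I tried fixing the problem locally, and with that fix the legacy and schema indexes perform exactly the same.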

So it is a matter of fixing this properly, and I can only hope the fix makes it into the 2.0 release.

answered 2013-11-17T23:53:40.407