performance - How can I configure Neo4j to run faster?

Question

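I am trying to use Neo4j for some experiments on social networks (SNS). I created a random graph of 1 million users and 100 thousand items, where each user has about 100 friends and 100 favorite items. So the graph has about 1 million nodes and 200 million relationships, and the graph files occupy 4.8 GB. Every node has only an id property, and I created an index on it. Using the Java API, I set up a small cluster of three VMs to serve this graph. Each VM has 16 GB of RAM and an Intel Xeon 2.00 GHz CPU (8 cores). Here is some of the configuration: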

config.put( "neostore.nodestore.db.mapped_memory", "150M");
config.put("neostore.relationshipstore.db.mapped_memory", "5G");
config.put( "neostore.propertystore.db.mapped_memory", "100M");
config.put( "neostore.propertystore.db.strings.mapped_memory", "130M");
config.put( "neostore.propertystore.db.arrays.mapped_memory", "130M");
config.put( "node_auto_indexing", "true");
config.put( "use_memory_mapped_buffers", "true");
config.put( "neostore.propertystore.db.index.keys.mapped_memory", "150M");
config.put( "neostore.propertystore.db.index.mapped_memory", "150M");

I use the gcr cache_type. I warm up the graph simply by traversing it:

for ( Node n : GlobalGraphOperations.at(db).getAllNodes() ) {
    n.getPropertyKeys();
    for ( Relationship relationship : n.getRelationships() ) {
        start = relationship.getStartNode();
    }
}

The Cypher query:

start user=node:users({key}={value}) match user-[:FRIEND]->(friend)-[:LIKES]->(item) return item, collect(friend), count(0) order by count(0) desc limit 32;

That is, it finds the items that the user's friends like most. I run the jar with the following command:

java -d64 -server -XX:+UseConcMarkSweepGC -XX:+UseNUMA -Xms10752m -Xmx10752m -Xmn2688m -jar Neo4J-1.0-SNAPSHOT.jar

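Now, my experimental results: (1) With a single thread, each query takes about 70 ms on average. (2) With 8 threads, each query takes about 160 ms on average, and many queries take more than 500 ms. Throughput is about 50 requests per second.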

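I want to improve performance but don't know how. It seems that memory is not large enough to hold all the data, right? I also tried the soft and strong cache_type, but RAM filled up very quickly during warm-up.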

Please help me and show me how to improve it. Thanks a lot.

score 0 · Accepted Answer

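If the heap size or available RAM is too small to hold the complete dataset in the object cache, you might use the Enterprise edition. By putting a load balancer in front of your n Neo4j instances that routes all requests for a certain part of the graph to the same instance, you can basically do object cache sharding. Jim Webber blogged about this approach: http://jim.webber.name/2011/02/scaling-neo4j-with-cache-sharding-and-neo4j-ha/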
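To illustrate the routing idea, here is a minimal sketch of what the load balancer's decision could look like. It assumes requests are keyed by a numeric user id and that a simple modulo over the instance list is an acceptable partitioning; the class name `CacheShardRouter` and the host names are hypothetical and do not come from Neo4j or from the blog post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: pick a Neo4j instance so that a given user id always hits the same cache. */
final class CacheShardRouter {
    private final List<String> instances;

    CacheShardRouter(List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("need at least one instance");
        }
        this.instances = new ArrayList<>(instances);
    }

    /** The same user id always maps to the same instance, so that instance's cache stays warm for it. */
    String route(long userId) {
        // floorMod keeps the index non-negative even for negative ids.
        int index = (int) Math.floorMod(userId, (long) instances.size());
        return instances.get(index);
    }

    public static void main(String[] args) {
        // Host names are placeholders (assumption), one per VM in the cluster.
        CacheShardRouter router =
                new CacheShardRouter(Arrays.asList("neo4j-1", "neo4j-2", "neo4j-3"));
        System.out.println(router.route(4L));
        System.out.println(router.route(4L)); // same id, same instance
    }
}
```

A plain modulo ignores graph locality: a user's friends may land on other instances, so each instance still touches parts of the graph outside its "own" users. It only makes it more likely that repeated requests for the same user find their neighborhood already cached.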

For performance-critical queries, it might be an idea to refactor the Cypher query into an equivalent one using the traversal API or even the core API.
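As a rough sketch of what such a hand-written equivalent has to compute, the following shows the two-hop walk and the counting step of the query from the question (FRIEND, then LIKES, count per item, top N). It deliberately uses plain in-memory maps as a stand-in for the graph rather than the Neo4j API, so the names `FriendFavorites`, `topItems`, `friends` and `likes` are assumptions for illustration; in a real rewrite the two loops would iterate over the node's outgoing relationships instead. It also leaves out the `collect(friend)` part of the query.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the graph: adjacency lists keyed by node id. */
final class FriendFavorites {

    /** Counts how many of the user's friends like each item and returns the top `limit` items. */
    static List<Map.Entry<Long, Integer>> topItems(long userId,
                                                   Map<Long, List<Long>> friends,
                                                   Map<Long, List<Long>> likes,
                                                   int limit) {
        Map<Long, Integer> counts = new HashMap<>();
        // First hop: user -[:FRIEND]-> friend
        for (long friend : friends.getOrDefault(userId, Collections.<Long>emptyList())) {
            // Second hop: friend -[:LIKES]-> item
            for (long item : likes.getOrDefault(friend, Collections.<Long>emptyList())) {
                counts.merge(item, 1, Integer::sum);
            }
        }
        // Order by count descending, then keep at most `limit` entries.
        List<Map.Entry<Long, Integer>> ranked = new ArrayList<>(counts.entrySet());
        ranked.sort(Map.Entry.<Long, Integer>comparingByValue().reversed());
        return new ArrayList<>(ranked.subList(0, Math.min(limit, ranked.size())));
    }

    public static void main(String[] args) {
        Map<Long, List<Long>> friends = new HashMap<>();
        friends.put(1L, List.of(2L, 3L, 4L));
        Map<Long, List<Long>> likes = new HashMap<>();
        likes.put(2L, List.of(10L, 11L));
        likes.put(3L, List.of(10L));
        likes.put(4L, List.of(10L, 11L, 12L));
        // The query in the question uses a limit of 32.
        System.out.println(topItems(1L, friends, likes, 32));
    }
}
```

The order of items with equal counts is not defined here, which matches the query's `order by count(0) desc` having no tie-breaker.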
