python - neo4j performance compared to mysql (how can it be improved?)

Question

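This is a follow-up to "can't reproduce/verify the performance claims in graph databases and neo4j in action books". I have updated the setup and tests, and don't want to change the original question too much.

The whole story (including scripts etc.) is at https://baach.de/Members/jhb/neo4j-performance-compared-to-mysql

Short version: while trying to verify the performance claims made in the "Graph Databases" book, I came to the following results (querying a random dataset containing n people, with 50 friends each):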

My results for 100k people

depth    neo4j             mysql       python

1        0.010             0.000        0.000
2        0.018             0.001        0.000
3        0.538             0.072        0.009
4       22.544             3.600        0.330
5     1269.942           180.143        0.758

"*": single run only

My results for 1 million people

depth    neo4j             mysql       python

1        0.010             0.000        0.000
2        0.018             0.002        0.000
3        0.689             0.082        0.012
4       30.057             5.598        1.079
5     1441.397*          300.000        9.791

"*": single run only
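For orientation, the kind of in-memory traversal that a plain Python baseline performs can be sketched as below. This is only an illustration under stated assumptions: the function names `build_random_graph` and `friends_at_depth` are hypothetical, the graph is a dictionary of adjacency lists, and the sketch ignores Cypher's rule that a relationship may not be repeated within one matched path. The scripts actually used for the timings are the ones on the linked page.

```python
import random
import time
from typing import Dict, List, Set


def build_random_graph(n_people: int, n_friends: int, seed: int = 0) -> Dict[int, List[int]]:
    """Give every person n_friends distinct random friends (directed edges)."""
    rng = random.Random(seed)
    people = range(n_people)
    graph: Dict[int, List[int]] = {}
    for person in people:
        # Sample one extra candidate so a self-reference can be dropped.
        candidates = rng.sample(people, n_friends + 1)
        graph[person] = [c for c in candidates if c != person][:n_friends]
    return graph


def friends_at_depth(graph: Dict[int, List[int]], start: int, depth: int) -> Set[int]:
    """Distinct people at the end of any friend path of exactly `depth` hops."""
    frontier = {start}
    for _ in range(depth):
        # Expand every person in the frontier by one hop; the set removes duplicates.
        frontier = {friend for person in frontier for friend in graph[person]}
    return frontier


if __name__ == "__main__":
    # Small demo size (assumption); the question uses 100k and 1 million people.
    graph = build_random_graph(n_people=10_000, n_friends=50)
    for depth in range(1, 5):
        started = time.perf_counter()
        found = friends_at_depth(graph, start=0, depth=depth)
        elapsed = time.perf_counter() - started
        print(f"depth {depth}: {len(found)} distinct friends in {elapsed:.3f} s")
```

Because the frontier is a set, it never holds more than n people however many paths exist, which would be consistent with the python column staying small at depth 5.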

Using 1.9.2 on 64-bit Ubuntu, I have set up neo4j.properties with these values:

neostore.nodestore.db.mapped_memory=250M
neostore.relationshipstore.db.mapped_memory=2048M

and neo4j-wrapper.conf with:

wrapper.java.initmemory=1024
wrapper.java.maxmemory=8192

My query to neo4j looks like this (using the REST API):

start person=node:node_auto_index(noscenda_name="person123") match (person)-[:friend]->()-[:friend]->(friend) return count(distinct friend);

The node_auto_index is in place, obviously.
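A minimal sketch of sending that query from Python over REST, using only the standard library. It assumes a local server on the default port with the legacy Cypher endpoint at `/db/data/cypher`, and that the response has the shape `{"columns": [...], "data": [[value]]}`; the helper names `build_request` and `count_friends` are hypothetical. The person name is passed as a `{name}` parameter rather than being pasted into the query string, which should allow the server to reuse the parsed query across calls.

```python
import json
import urllib.request

# Assumed default endpoint of a local Neo4j 1.9 server; adjust host and port as needed.
CYPHER_URL = "http://localhost:7474/db/data/cypher"

# Same query as above, with the person name turned into a parameter.
QUERY = (
    "start person=node:node_auto_index(noscenda_name={name}) "
    "match (person)-[:friend]->()-[:friend]->(friend) "
    "return count(distinct friend)"
)


def build_request(name: str, url: str = CYPHER_URL) -> urllib.request.Request:
    """Build the POST request carrying the query and its parameters as JSON."""
    body = json.dumps({"query": QUERY, "params": {"name": name}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def count_friends(name: str) -> int:
    """Send the query and return the single count value (needs a running server)."""
    with urllib.request.urlopen(build_request(name)) as response:
        result = json.loads(response.read().decode("utf-8"))
    # Assumed response shape: one row with one column holding the count.
    return result["data"][0][0]
```

Calling `count_friends("person123")` against a running server would return the count for that person. Timing the call from the client includes HTTP and JSON overhead on top of the query itself.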

Is there anything I can do to speed neo4j up (to be faster than mysql)?

There is also another benchmark on Stack Overflow with the same problem.

score 4 · Accepted Answer

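I'm sorry you can't reproduce the results. However, on a MacBook Air (1.8 GHz i7, 4 GB RAM) with a 2 GB heap, GCR cache, but no warming of caches and no other tuning, with a similarly sized dataset (1 million users, 50 friends per person), I repeatedly get approximately 900 ms using the Traversal Framework on 1.9.2: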

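// Imports assume the Neo4j 1.9.x embedded API.
import static org.neo4j.graphdb.DynamicRelationshipType.withName;
import static org.neo4j.helpers.collection.IteratorUtil.count;

import java.util.Iterator;

import org.neo4j.graphdb.Direction;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Path;
import org.neo4j.graphdb.index.Index;
import org.neo4j.graphdb.traversal.Evaluation;
import org.neo4j.graphdb.traversal.Evaluator;
import org.neo4j.graphdb.traversal.TraversalDescription;
import org.neo4j.kernel.Traversal;
import org.neo4j.kernel.Uniqueness;

public class FriendOfAFriendDepth4
{
    // Follows outgoing FRIEND relationships depth first, visiting each node at most once.
    private static final TraversalDescription traversalDescription = 
         Traversal.description()
            .depthFirst()
            .uniqueness( Uniqueness.NODE_GLOBAL )
            .relationships( withName( "FRIEND" ), Direction.OUTGOING )
            .evaluator( new Evaluator()
            {
                @Override
                public Evaluation evaluate( Path path )
                {
                    // Return only paths of length 4 and do not expand beyond them.
                    if ( path.length() >= 4 )
                    {
                        return Evaluation.INCLUDE_AND_PRUNE;
                    }
                    return Evaluation.EXCLUDE_AND_CONTINUE;
                }
            } );

    private final Index<Node> userIndex;

    public FriendOfAFriendDepth4( GraphDatabaseService db )
    {
        this.userIndex = db.index().forNodes( "user" );
    }

    // Looks the start node up by name in the "user" index and returns the matching paths.
    public Iterator<Path> getFriends( String name )
    {
        return traversalDescription.traverse( 
            userIndex.get( "name", name ).getSingle() )
                .iterator();
    }

    // Counts the end nodes of the paths returned by the traversal.
    public int countFriends( String name )
    {
        return count( traversalDescription.traverse( 
            userIndex.get( "name", name ).getSingle() )
                 .nodes().iterator() );
    }
}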

Cypher is slower, but nowhere near as slow as you suggest: approximately 3 seconds:

START person=node:user(name={name})
MATCH (person)-[:FRIEND]->()-[:FRIEND]->()-[:FRIEND]->()-[:FRIEND]->(friend)
RETURN count(friend)

Kind regards

Ian

score 3 · Accepted Answer


Yes, I believe the REST API is much slower than the regular bindings, and that is where your performance problem lies.

answered 2014-07-16T04:05:37.507
