Is there a built-in way to match only the first n relationships, other than filtering with LIMIT n afterwards?

I have this query:

START n=node({id})
MATCH n--u--n2
RETURN u, count(*) as cnt order by cnt desc limit 10;

But assuming the number of n--u relationships is very high, I want to relax this query and take, for example, the first 100 random relationships and then continue with u--n2...

This is for a collaborative filtering task, and assuming the users are more or less similar, I don't want to match all users u but only a random subset. This approach should be faster: the query currently takes ~500 ms and I would like to get it under 50 ms.

I know I could break the above query into two separate ones, but the first query would still go through all users and only limit the output afterwards. I want to limit the maximum number of relationships during the match phase.

2 Answers

You can pipe the current results of the query through WITH, apply LIMIT to those initial results, and then continue in the same query:

START n=node({id})
MATCH n--u
WITH u
LIMIT 10
MATCH u--n2
RETURN u, count(*) as cnt 
ORDER BY cnt desc 
LIMIT 10;

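The query above takes the first 10 u nodes found, continues matching n2 only from those, and returns the top 10 rows ordered by count.

Optionally, you can leave off the second LIMIT. Because RETURN u, count(*) groups the rows by u, you then get one row per distinct u among the first ten, each with the count of all its matching n2s.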
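
To make the effect of the early LIMIT concrete, here is a minimal sketch in plain Java over an in-memory adjacency map. It is not Neo4j API: the class FirstHopLimit, the method rank and the sample graph are hypothetical names made up for illustration. It cuts off the first hop before expanding the second, which is roughly what WITH u LIMIT does. Note that this takes the first neighbours encountered, not a random sample.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the graph; none of this is Neo4j API.
final class FirstHopLimit {

    // Roughly mirrors:
    //   MATCH n--u WITH u LIMIT firstHopLimit
    //   MATCH u--n2
    //   RETURN u, count(*) AS cnt ORDER BY cnt DESC LIMIT topK
    static List<Map.Entry<String, Integer>> rank(
            Map<String, List<String>> adjacency, String start, int firstHopLimit, int topK) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        int taken = 0;
        for (String u : adjacency.getOrDefault(start, List.of())) {
            if (taken++ >= firstHopLimit) {
                break; // stop here: remaining first-hop neighbours are never expanded
            }
            // count(*) per u: every neighbour of u (including the start node) is one row
            counts.merge(u, adjacency.getOrDefault(u, List.of()).size(), Integer::sum);
        }
        List<Map.Entry<String, Integer>> rows = new ArrayList<>(counts.entrySet());
        // ORDER BY cnt DESC; the sort is stable, so ties keep encounter order
        rows.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return new ArrayList<>(rows.subList(0, Math.min(topK, rows.size())));
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = Map.of(
                "n", List.of("a", "b", "c"),
                "a", List.of("n", "x"),
                "b", List.of("n", "x", "y", "z"),
                "c", List.of("n", "x", "y", "z", "w", "v"));
        // Only "a" and "b" are expanded; "c" is skipped by the first-hop limit.
        System.out.println(rank(graph, "n", 2, 10));
    }
}
```

The work done in the second hop is bounded by firstHopLimit, whatever the degree of the start node, which is the property the question asks for.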

answered 2013-04-25T19:18:01.477
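This is not a direct solution to your question, but since I ran into a similar problem, my workaround might be interesting to you.

What I need to do is: get relationships by index (which might yield many thousands) and get the start node of these relationships. Since the start node is always the same for that index query, I only need the start node of the very first relationship.

Since I wasn't able to achieve that with Cypher (the query proposed by ean5533 doesn't perform any better), I am using a simple unmanaged extension (nice template).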

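import java.io.IOException;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.core.Context;
import javax.ws.rs.core.Response;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;
import org.neo4j.graphdb.index.IndexHits;
import org.neo4j.graphdb.index.RelationshipIndex;

// Method of the JAX-RS resource class that is registered as the unmanaged extension.
@GET
@Path("/address/{address}")
public Response getUniqueIDofSenderAddress(@PathParam("address") String addr, @Context GraphDatabaseService graphDB) throws IOException
{
    try {
        RelationshipIndex index = graphDB.index().forRelationships("transactions");
        IndexHits<Relationship> rels = index.get("sender_address", addr);

        int unique_id = -1;
        try {
            // All hits share the same start node, so only the first hit is read.
            if (rels.hasNext()) {
                Node sender = rels.next().getStartNode();
                unique_id = (Integer) sender.getProperty("unique_id");
            }
        } finally {
            rels.close(); // release the index hits without iterating the rest
        }

        return Response.ok().entity("Unique ID: " + unique_id).build();
    } catch (Exception e) {
        return Response.serverError().entity("Could not get unique ID.").build();
    }
}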

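For this case, the speedup is quite nice.

I don't know your exact use case, but since Neo4j even supports HTTP streaming afaik, you should be able to convert your query into an unmanaged extension and still get full performance. For example, "java-query" all your qualifying nodes and emit the partial results to the HTTP stream.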
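
As a rough illustration of emitting partial results, here is a minimal sketch in plain Java that writes each hit as soon as it is found and stops after a fixed number of rows. The class PartialResultStreamer and the method stream are hypothetical names, and a StringWriter stands in for the HTTP response. In a JAX-RS based unmanaged extension the same loop could presumably sit inside a javax.ws.rs.core.StreamingOutput, but that wiring is an assumption and is not shown here.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper; the Writer stands in for the HTTP response stream.
final class PartialResultStreamer {

    // Writes at most maxRows hits, one per line, flushing after each one so a
    // client can consume results before the iteration has finished.
    static int stream(Iterator<String> hits, Writer out, int maxRows) {
        int written = 0;
        try {
            while (written < maxRows && hits.hasNext()) {
                out.write(hits.next());
                out.write('\n');
                out.flush();
                written++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written; // the iterator is not advanced past the limit
    }

    public static void main(String[] args) {
        StringWriter out = new StringWriter();
        int rows = stream(List.of("u1", "u2", "u3").iterator(), out, 2);
        System.out.print(out);
        System.out.println("rows written: " + rows);
    }
}
```

Because the loop checks the limit before calling hasNext(), a lazily evaluated iterator is never asked for more hits than are actually sent.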

answered 2013-04-30T18:41:29.557