
I am trying to run this query with Neo4j, but it is taking too long (over 30 minutes, on roughly 2,500 nodes and 1.8 million relationships):

Match (a:Art)-[r1]->(b:Art)
with collect({start:a.url, end:b.url, score:r1.ed_sc}) as row1

MATCH (a:Art)-[r1]->(b:Art)-[r2]->(c:Art)
Where a.url <> c.url
with row1 + collect({start:a.url, end:c.url, score:r1.ed_sc*r2.ed_sc}) as row2

Match (a:Art)-[r1]->(b:Art)-[r2]->(c:Art)-[r3]->(d:Art)
WHERE a.url <> c.url and b.url <> d.url and a.url <> d.url
with row2 + collect({start:a.url, end:d.url, score:r1.ed_sc*r2.ed_sc*r3.ed_sc}) as allRows

unwind allRows as row
RETURN row.start as start, row.end as end, sum(row.score) as final_score limit 10;

There are 2,500 nodes under the :Art label, with bidirectional relationships between them that carry a property called ed_sc. So essentially I am trying to compute a score between two nodes by traversing first-, second-, and third-degree paths and then summing those scores.

Is there a more optimized way to do this?


1 Answer


For one thing, I would discourage the use of bidirectional relationships. If your graph is densely connected, that kind of modeling will wreak havoc on most queries like this one.
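For illustration (this is not part of the original answer): assuming every relationship between :Art nodes is one of these scored ones, as the question suggests, and that both directions of a pair carry the same ed_sc, a rough sketch of collapsing the reciprocal pairs and then matching without a direction could look like the following. Back up the database first, since the first statement deletes relationships.

// Keep one relationship per pair: for each reciprocal pair, delete the copy
// that points from the higher-id node back to the lower-id one.
MATCH (a:Art)-->(b:Art)<-[dup]-(a)
WHERE id(a) < id(b)
DELETE dup;

// From then on, drop the direction in the pattern; the single stored
// relationship can be traversed from either end.
MATCH (a:Art)-[r]-(b:Art)
RETURN a.url AS start, b.url AS end, r.ed_sc AS score
LIMIT 10;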

Assuming url is unique for each :Art node, it is better to compare the nodes themselves rather than their properties.
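As a small illustration (only the two-hop step from the question, not the full rewrite), the property comparison becomes a node comparison:

MATCH (a:Art)-[r1]->(b:Art)-[r2]->(c:Art)
WHERE a <> c
RETURN a.url AS start, c.url AS end, r1.ed_sc * r2.ed_sc AS score
LIMIT 10;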

We should also be able to use variable-length relationships in place of your current approach:

MATCH p = (start:Art)-[*..3]->(end:Art)
// keep only paths in which no node repeats (this also covers the a<>c, b<>d, a<>d checks)
WHERE all(node IN nodes(p) WHERE single(t IN nodes(p) WHERE node = t))
// multiply ed_sc along each path...
WITH start, end, reduce(score = 1, rel IN relationships(p) | score * rel.ed_sc) AS score
// ...then sum the per-path scores for every start/end pair
WITH start, end, sum(score) AS final_score
LIMIT 10
RETURN start.url AS start, end.url AS end, final_score
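If you want to sanity-check the rewrite before running it over the whole graph, one option (a sketch; the url below is just a placeholder) is to anchor it on a single node and prefix the query with PROFILE to inspect the plan Neo4j chooses:

PROFILE
MATCH p = (start:Art {url: 'http://example.com/some-article'})-[*..3]->(end:Art)
WHERE all(node IN nodes(p) WHERE single(t IN nodes(p) WHERE node = t))
WITH start, end, reduce(score = 1, rel IN relationships(p) | score * rel.ed_sc) AS score
RETURN start.url AS start, end.url AS end, sum(score) AS final_score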
answered 2018-06-05 16:12