11

注意:在#postgresql 上的 RhodiumToad 的帮助下,我找到了一个解决方案,我将其发布为答案。如果有人可以改进这一点,请加入!

我无法使以前的递归查询解决方案适应以下包含多个“根”(无祖先)节点的有向无环图。我正在尝试编写一个查询,其输出通常称为闭包表:一个多对多表,存储从每个节点到其每个后代和自身的每条路径:

1  2  11  8  4  5  7
 \/    |  |   \ | /
  3    |   \    6
   \   |    \  /
    9  |     10
     \/     /
     12    13
       \  /
        14

CREATE TABLE node (
  id        SERIAL PRIMARY KEY,
  node_name VARCHAR(50) NOT NULL
);

CREATE TABLE node_relations (
  id                 SERIAL PRIMARY KEY,
  ancestor_node_id   INT REFERENCES node(id),
  descendant_node_id INT REFERENCES node(id)
);

INSERT into node (node_name)
SELECT 'node ' || g FROM generate_series(1,14) g;

INSERT INTO node_relations(ancestor_node_id, descendant_node_id) VALUES
(1,3),(2,3),(4,6),(5,6),(7,6),(3,9),(6,10),(8,10),(9,12),(11,12),(10,13),(12,14),(13,14);

很难查明问题 - 我是否缺少node_relation行?查询错了吗?

WITH RECURSIVE node_graph AS (
   SELECT ancestor_node_id, ARRAY[descendant_node_id] AS path, 0 AS level
   FROM   node_relations

   UNION  ALL
   SELECT nr.ancestor_node_id,  ng.path || nr.descendant_node_id,ng.level + 1 AS level
   FROM   node_graph ng
   JOIN   node_relations nr ON nr.descendant_node_id = ng.ancestor_node_id 
)
SELECT path[array_upper(path,1)] AS ancestor,
       path[1] AS descendant,
       path, 
       level as depth
FROM   node_graph
ORDER  BY level, ancestor;

预期输出:

ancestor | descendant | path
---------+------------+------------------
1        | 3          | "{1,3}"
1        | 9          | "{1,3,9}"
1        | 12         | "{1,3,9,12}"
1        | 14         | "{1,3,9,12,14}"
2        | 3          | "{2,3}"
2        | 9          | "{2,3,9}"
2        | 12         | "{2,3,9,12}"
2        | 14         | "{2,3,9,12,14}"
3        | 9          | "{3,9}"
3        | 12         | "{3,9,12}"
3        | 14         | "{3,9,12,14}"
4        | 6          | "{4,6}"
4        | 10         | "{4,6,10}"
4        | 13         | "{4,6,10,13}"
4        | 14         | "{4,6,10,13,14}"
5        | 6          | "{5,6}"
5        | 10         | "{5,6,10}"
5        | 13         | "{5,6,10,13}"
5        | 14         | "{5,6,10,13,14}"
6        | 10         | "{6,10}"
6        | 13         | "{6,10,13}"
6        | 14         | "{6,10,13,14}"
7        | 6          | "{7,6}"
7        | 10         | "{7,6,10}"
7        | 13         | "{7,6,10,13}"
7        | 14         | "{7,6,10,13,14}"
8        | 10         | "{8,10}"
8        | 13         | "{8,10,13}"
8        | 14         | "{8,10,13,14}"
9        | 12         | "{9,12}"
9        | 14         | "{9,12,14}"
10       | 13         | "{10,13}"
10       | 14         | "{10,13,14}"
11       | 12         | "{11,12}"
11       | 14         | "{11,12,14}"
12       | 14         | "{12,14}"
13       | 14         | "{13,14}"
4

3 回答 3

10

在#postgresql 上的 RhodiumToad 的帮助下,我找到了这个解决方案:

WITH RECURSIVE node_graph AS (
    SELECT ancestor_node_id as path_start, descendant_node_id as path_end,
           array[ancestor_node_id, descendant_node_id] as path 
    FROM node_relations

    UNION ALL 

    SELECT ng.path_start, nr.descendant_node_id as path_end,
           ng.path || nr.descendant_node_id as path
    FROM node_graph ng
    JOIN node_relations nr ON ng.path_end = nr.ancestor_node_id
) 
SELECT * from node_graph order by path_start, array_length(path,1);

结果完全符合预期。

于 2014-09-10T19:19:52.830 回答
3

另一种方法是以相反的顺序遍历图形:

WITH RECURSIVE cte AS (
   SELECT array[r.ancestor_node_id, r.descendant_node_id] AS path
   FROM   node_relations r
   LEFT   JOIN node_relations r0 ON r0.ancestor_node_id = r.descendant_node_id
   WHERE  r0.ancestor_node_id IS NULL  -- start at the end

   UNION ALL 
   SELECT r.ancestor_node_id || c.path
   FROM   cte c
   JOIN   node_relations r ON r.descendant_node_id = c.path[1]
   ) 
SELECT path
FROM   cte
ORDER  BY path;

这会产生一个子集,其中包含从每个根节点到其最终后代的每条路径。对于也散布很多的深层树,这将需要更少的连接操作。要另外添加每个子路径,您可以将LATERAL连接附加到外部SELECT

WITH RECURSIVE cte AS (
   SELECT array[r.ancestor_node_id, r.descendant_node_id] AS path
   FROM   node_relations r
   LEFT   JOIN node_relations r0 ON r0.ancestor_node_id = r.descendant_node_id
   WHERE  r0.ancestor_node_id IS NULL  -- start at the end

   UNION ALL 
   SELECT r.ancestor_node_id || c.path
   FROM   cte c
   JOIN   node_relations r ON r.descendant_node_id = c.path[1]
   ) 
SELECT l.path
FROM   cte, LATERAL (
   SELECT path[1:g] AS path
   FROM   generate_series(2, array_length(path,1)) g
   ) l
ORDER  BY l.path;

我进行了快速测试,但它的运行速度并不比 RhodiumToad 的解决方案快。对于大表或宽表,它可能仍然更快。尝试使用您的数据。

于 2015-09-28T02:48:24.907 回答
1

我看到查询有两个问题:

  1. 非递归部分没有指定根节点。您需要明确选择使用WHERE descendant_node_id = 14或“动态”使用:

    SELECT ..
    FROM   node_relations nr1
    WHERE  NOT EXISTS (SELECT 1
                       FROM node_relations nr2
                       WHERE nr2.ancestor_node_id = nr1.descendant_node_id)
    
  2. 有了正确的起点,路径是不完整的,因为它会在递归部分的聚合过程中错过最后一个节点。因此,在外部查询中,您需要附加ancestor_node_id到生成的路径。

所以查询看起来像这样:

WITH RECURSIVE node_graph AS (
   SELECT nr1.id, nr1.ancestor_node_id, nr1.descendant_node_id, ARRAY[nr1.descendant_node_id] AS path, 0 as level
   FROM   node_relations nr1
   WHERE  NOT EXISTS (SELECT 1
                      FROM node_relations nr2
                      WHERE nr2.ancestor_node_id = nr1.descendant_node_id)

   UNION  ALL

   SELECT nr.id, nr.ancestor_node_id, nr.descendant_node_id, nr.descendant_node_id || ng.path, ng.level + 1 as level
   FROM node_relations nr
     JOIN node_graph ng ON ng.ancestor_node_id = nr.descendant_node_id

)
SELECT ancestor_node_id||path as path, -- add the last element of the sub-tree to the path
       level as depth
FROM   node_graph
ORDER  BY level

这是 SQLFiddle:http ://sqlfiddle.com/#!15/e646b/3

于 2014-09-10T11:46:41.107 回答