15

我不是 SQL 专家,但如果有人可以帮助我。

我使用递归 CTE 来获取如下值。

子 1 --> 父 1

父 1 --> 父 2

父 2 --> NULL

如果数据填充出错了,那么我会有类似下面的内容,因此 CTE 可能会进入无限递归循环并给出最大递归错误。由于数据量很大,我无法手动检查这些不良数据。请让我知道是否有办法找到它。

子 1 --> 父 1

父 1 --> 子 1

或者

子 1 --> 父 1

父 1 --> 父 2

父 2 --> 子 1

4

6 回答 6

35

使用 Postgres,很容易通过将所有访问过的节点收集到一个数组中来防止这种情况发生。

设置:

create table hierarchy (id integer, parent_id integer);

insert into hierarchy
values
(1, null), -- root element
(2, 1), -- first child
(3, 1), -- second child
(4, 3), 
(5, 4), 
(3, 5); -- endless loop

递归查询:

with recursive tree as (
  select id, 
         parent_id, 
         array[id] as all_parents
  from hierarchy
  where parent_id is null
  
  union all
  
  select c.id, 
         c.parent_id,
         p.all_parents||c.id
  from hierarchy c
     join tree p
      on c.parent_id = p.id 
     and c.id <> ALL (p.all_parents) -- this is the trick to exclude the endless loops
)
select *
from tree;

要同时对多棵树执行此操作,您需要将根节点的 ID 传递给子节点:

with recursive tree as (
  select id, 
         parent_id, 
         array[id] as all_parents, 
         id as root_id
  from hierarchy
  where parent_id is null
  
  union all
  
  select c.id, 
         c.parent_id,
         p.all_parents||c.id, 
         p.root_id
  from hierarchy c
     join tree p
      on c.parent_id = p.id 
     and c.id <> ALL (p.all_parents) -- this is the trick to exclude the endless loops
     and c.root_id = p.root_id
)
select *
from tree;

Postgres 14 的更新

Postgres 14 引入了(符合标准的)CYCLE选项来检测周期:

with recursive tree as (
  select id, 
         parent_id
  from hierarchy
  where parent_id is null

  union all

  select c.id, 
         c.parent_id
  from hierarchy c
     join tree p
      on c.parent_id = p.id 
)
cycle id -- track cycles for this column
   set is_cycle -- adds a boolean column is_cycle
   using path -- adds a column that contains all parents for the id
select *
from tree
where not is_cycle
于 2015-07-31T12:05:01.533 回答
10

您还没有指定方言或列名,所以很难做出完美的例子......

-- Some random data
IF OBJECT_ID('tempdb..#MyTable') IS NOT NULL
    DROP TABLE #MyTable

CREATE TABLE #MyTable (ID INT PRIMARY KEY, ParentID INT NULL, Description VARCHAR(100))
INSERT INTO #MyTable (ID, ParentID, Description) VALUES
(1, NULL, 'Parent'), -- Try changing the second value (NULL) to 1 or 2 or 3
(2, 1, 'Child'), -- Try changing the second value (1) to 2 
(3, 2, 'SubChild')
-- End random data

;WITH RecursiveCTE (StartingID, Level, Parents, Loop, ID, ParentID, Description) AS
(
    SELECT ID, 1, '|' + CAST(ID AS VARCHAR(MAX)) + '|', 0, * FROM #MyTable
    UNION ALL
    SELECT R.StartingID, R.Level + 1, 
        R.Parents + CAST(MT.ID AS VARCHAR(MAX)) + '|',
        CASE WHEN R.Parents LIKE '%|' + CAST(MT.ID AS VARCHAR(MAX)) + '|%' THEN 1 ELSE 0 END,
        MT.*
        FROM #MyTable MT
        INNER JOIN RecursiveCTE R ON R.ParentID = MT.ID AND R.Loop = 0
)

SELECT StartingID, Level, Parents, MAX(Loop) OVER (PARTITION BY StartingID) Loop, ID, ParentID, Description 
    FROM RecursiveCTE 
    ORDER BY StartingID, Level

像这样的东西将显示递归 cte 中是否/在哪里有循环。看专栏Loop。数据保持不变,没有循环。在评论中有关于如何更改值以导致循环的示例。

最后,递归 cteVARCHAR(MAX)以形式|id1|id2|id3|(称为Parents)创建一个 id,然后检查当前ID是否已经在该“列表”中。如果是,则将该Loop列设置为 1。此列在递归连接 ( ABD R.Loop = 0) 中被选中。

结束查询使用 a将整个“块”链MAX() OVER (PARTITION BY ...)的列设置为 1 。Loop

稍微复杂一点,生成一个“更好”的报告:

-- Some random data
IF OBJECT_ID('tempdb..#MyTable') IS NOT NULL
    DROP TABLE #MyTable

CREATE TABLE #MyTable (ID INT PRIMARY KEY, ParentID INT NULL, Description VARCHAR(100))
INSERT INTO #MyTable (ID, ParentID, Description) VALUES
(1, NULL, 'Parent'), -- Try changing the second value (NULL) to 1 or 2 or 3
(2, 1, 'Child'), -- Try changing the second value (1) to 2 
(3, 3, 'SubChild')
-- End random data

-- The "terminal" childrens (that are elements that don't have childrens
-- connected to them)
;WITH WithoutChildren AS
(
    SELECT MT1.* FROM #MyTable MT1
        WHERE NOT EXISTS (SELECT 1 FROM #MyTable MT2 WHERE MT1.ID != MT2.ID AND MT1.ID = MT2.ParentID)
)

, RecursiveCTE (StartingID, Level, Parents, Descriptions, Loop, ParentID) AS
(
    SELECT ID, -- StartingID 
        1, -- Level
        '|' + CAST(ID AS VARCHAR(MAX)) + '|', 
        '|' + CAST(Description AS VARCHAR(MAX)) + '|', 
        0, -- Loop
        ParentID
        FROM WithoutChildren
    UNION ALL
    SELECT R.StartingID, -- StartingID
        R.Level + 1, -- Level
        R.Parents + CAST(MT.ID AS VARCHAR(MAX)) + '|',
        R.Descriptions + CAST(MT.Description AS VARCHAR(MAX)) + '|', 
        CASE WHEN R.Parents LIKE '%|' + CAST(MT.ID AS VARCHAR(MAX)) + '|%' THEN 1 ELSE 0 END,
        MT.ParentID
        FROM #MyTable MT
        INNER JOIN RecursiveCTE R ON R.ParentID = MT.ID AND R.Loop = 0
)

SELECT * FROM RecursiveCTE 
    WHERE ParentID IS NULL OR Loop = 1

此查询应返回所有“最后一个子”行以及完整的父链。该列Loop0如果没有循环,1如果有循环。

于 2015-07-31T07:19:49.220 回答
5

您可以使用 Knuth 描述的相同方法在此处检测链表中的循环。在一个列中,跟踪孩子、孩子的孩子、孩子的孩子的孩子等。在另一列中,跟踪孙辈、孙辈的孙辈、孙辈的孙辈的孙辈等。

对于初始选择,列ChildGrandchild列之间的距离为 1。每个选择 fromunion all的深度增加Child1,深度增加Grandchild2。它们之间的距离增加 1。

如果你有任何循环,因为距离每次只增加 1,在循环中的某个点之后Child,距离将是循环长度的倍数。发生这种情况时,ChildGrandchild列是相同的。使用它作为停止递归的附加条件,并在其余代码中将其检测为错误。

SQL Server 示例:

declare @LinkTable table (Parent int, Child int);
insert into @LinkTable values (1, 2), (1, 3), (2, 4), (2, 5), (3, 6), (3, 7), (7, 1);

with cte as (
    select lt1.Parent, lt1.Child, lt2.Child as Grandchild
    from @LinkTable lt1
    inner join @LinkTable lt2 on lt2.Parent = lt1.Child
    union all
    select cte.Parent, lt1.Child, lt3.Child as Grandchild
    from cte
    inner join @LinkTable lt1 on lt1.Parent = cte.Child
    inner join @LinkTable lt2 on lt2.Parent = cte.Grandchild
    inner join @LinkTable lt3 on lt3.Parent = lt2.Child
    where cte.Child <> cte.Grandchild
)
select Parent, Child
from cte
where Child = Grandchild;

删除LinkTable导致循环的记录之一,您会发现select不再返回任何数据。

于 2015-07-31T07:42:37.323 回答
5

这是检测邻接列表(父/子关系)中的循环的另一种方法,其中节点只能有一个父级,可以通过对子列的唯一约束来强制执行(id在下表中)。这通过递归查询计算邻接列表的闭包表来工作。它首先将每个节点作为其自己的 0 级祖先添加到闭包表中,然后迭代地遍历邻接表以扩展闭包表。当新记录的子级和祖先级在原始级别零 (0) 以外的任何级别上都相同时,就会检测到循环:

-- For PostgreSQL and MySQL 8 use the Recursive key word in the CTE code:
-- with RECURSIVE cte(ancestor, child, lev, cycle) as (

with cte(ancestor, child, lev, cycle) as (
  select id, id, 0, 0 from Table1
  union all
  select cte.ancestor
       , Table1.id
       , case when cte.ancestor = Table1.id then 0 else cte.lev + 1 end
       , case when cte.ancestor = Table1.id then cte.lev + 1 else 0 end
    from Table1
    join cte
      on cte.child = Table1.PARENT_ID
   where cte.cycle = 0
) -- In oracle uncomment the next line
-- cycle child set isCycle to 'Y' default 'N'
select distinct
       ancestor
     , child
     , lev
     , max(cycle) over (partition by ancestor) cycle
  from cte

给定 Table1 的以下邻接列表:

| parent_id | id |
|-----------|----|
|    (null) |  1 |
|    (null) |  2 |
|         1 |  3 |
|         3 |  4 |
|         1 |  5 |
|         2 |  6 |
|         6 |  7 |
|         7 |  8 |
|         9 | 10 |
|        10 | 11 |
|        11 |  9 |

上述查询适用于 SQL Sever(以及按照指示修改后的 Oracle、PostgreSQL 和 MySQL 8)正确地检测到节点 9、10 和 11 参与长度为 3 的循环。

SQL(/DB) Fiddles 在各种 DB 中展示了这一点,可以在下面找到:

于 2019-03-13T19:27:19.743 回答
1

尝试限制递归结果

WITH EMP_CTE AS
( 

    SELECT 
        0 AS [LEVEL],   
        ManagerId, EmployeeId, Name
    FROM Employees
    WHERE ManagerId IS NULL

    UNION ALL

    SELECT 
        [LEVEL] + 1 AS [LEVEL],
        ManagerId, EmployeeId, Name
    FROM Employees e
    INNER JOIN EMP_CTE c ON e.ManagerId = c.EmployeeId 
 AND s.LEVEL < 100 --RECURSION LIMIT
) 

    SELECT  * FROM EMP_CTE WHERE [Level] = 100
于 2017-09-20T09:47:58.933 回答
1

这是 SQL Server 的解决方案:

表格插入脚本:

CREATE TABLE MyTable
(
    [ID] INT,
    [ParentID] INT,
    [Name] NVARCHAR(255)
);

INSERT INTO MyTable
(
    [ID],
    [ParentID],
    [Name]
)
VALUES
(1, NULL, 'A root'),
(2, NULL, 'Another root'),
(3, 1, 'Child of 1'),
(4, 3, 'Grandchild of 1'),
(5, 4, 'Great grandchild of 1'),
(6, 1, 'Child of 1'),
(7, 8, 'Child of 8'),
(8, 7, 'Child of 7'), -- This will cause infinite recursion
(9, 1, 'Child of 1');

查找罪魁祸首的确切记录的脚本:

;WITH RecursiveCTE
AS (
   -- Get all parents: 
   -- Any record in MyTable table could be an Parent
   -- We don't know here yet which record can involve in an infinite recursion.
   SELECT ParentID AS StartID,
          ID,
          CAST(Name AS NVARCHAR(255)) AS [ParentChildRelationPath]
   FROM MyTable
   UNION ALL

   -- Recursively try finding all the childrens of above parents
   -- Keep on finding it until this child become parent of above parent.
   -- This will bring us back in the circle to parent record which is being
   -- keep in the StartID column in recursion
   SELECT RecursiveCTE.StartID,
          t.ID,
          CAST(RecursiveCTE.[ParentChildRelationPath] + ' -> ' + t.Name AS NVARCHAR(255)) AS [ParentChildRelationPath]
   FROM RecursiveCTE
       INNER JOIN MyTable AS t
           ON t.ParentID = RecursiveCTE.ID
   WHERE RecursiveCTE.StartID != RecursiveCTE.ID)

-- FInd the ones which causes the infinite recursion
SELECT StartID,
       [ParentChildRelationPath],
       RecursiveCTE.ID
FROM RecursiveCTE
WHERE StartID = ID
OPTION (MAXRECURSION 0);

上述查询的输出:

在此处输入图像描述

于 2019-02-18T21:54:12.610 回答