更简单的例子
让我们尝试一个更简单的示例,这样人们就可以围绕这些概念进行思考,并有一个可以复制并粘贴到 SQL Query Analizer 中的实际示例:
想象一个具有层次结构的节点表:
A
- B
- C
我们可以在 Query Analizer 中开始测试:
CREATE TABLE ##Nodes
(
NodeID varchar(50) PRIMARY KEY NOT NULL,
ParentNodeID varchar(50) NULL
)
INSERT INTO ##Nodes (NodeID, ParentNodeID) VALUES ('A', null)
INSERT INTO ##Nodes (NodeID, ParentNodeID) VALUES ('B', 'A')
INSERT INTO ##Nodes (NodeID, ParentNodeID) VALUES ('C', 'B')
期望的输出:
ParentNodeID NodeID GenerationsRemoved
============ ====== ==================
NULL A 1
NULL B 2
NULL C 3
A B 1
A C 2
B C 1
现在建议的 CTE 表达式,它的输出不正确:
WITH NodeChildren AS
(
--initialization
SELECT ParentNodeID, NodeID, 1 AS GenerationsRemoved
FROM ##Nodes
WHERE ParentNodeID IS NULL
UNION ALL
--recursive execution
SELECT P.ParentNodeID, N.NodeID, P.GenerationsRemoved + 1
FROM NodeChildren AS P
INNER JOIN ##Nodes AS N
ON P.NodeID = N.ParentNodeID
)
SELECT ParentNodeID, NodeID, GenerationsRemoved
FROM NodeChildren
实际输出:
ParentNodeID NodeID GenerationsRemoved
============ ====== ==================
NULL A 1
NULL B 2
NULL C 3
注意:如果 SQL Server 2005† CTE 无法完成我在 2000 年之前所做的事情‡,那很好,这就是答案。给出“不可能”作为答案的人将赢得赏金。但我会等几天以确保每个人都同意这是不可能的,因为我无法解决我的问题而无法挽回地给予 250 名声望。
吹毛求疵的角落
†不是 2008 年
‡无需诉诸 UDF*,这是已有的解决方案
*除非您可以在原始问题中看到提高 UDF 性能的方法
原始问题
我有一个节点表,每个节点都有一个指向另一个节点(或 null)的父节点。
举例说明:
1 My Computer
2 Drive C
4 Users
5 Program Files
7 Windows
8 System32
3 Drive D
6 mp3
我想要一个返回所有父子关系的表,以及它们之间的代数
对于所有直接父关系:
ParentNodeID ChildNodeID GenerationsRemoved
============ =========== ===================
(null) 1 1
1 2 1
2 4 1
2 5 1
2 7 1
1 3 1
3 6 1
7 8 1
但是还有祖父母关系:
ParentNodeID ChildNodeID GenerationsRemoved
============ =========== ===================
(null) 2 2
(null) 3 2
1 4 2
1 5 2
1 7 2
1 6 2
2 8 2
还有曾祖父母的关系:
ParentNodeID ChildNodeID GenerationsRemoved
============ =========== ===================
(null) 4 3
(null) 5 3
(null) 7 3
(null) 6 3
1 8 3
所以我可以弄清楚基本的 CTE 初始化:
WITH (NodeChildren) AS
{
--initialization
SELECT ParentNodeID, NodeID AS ChildNodeID, 1 AS GenerationsRemoved
FROM Nodes
}
现在的问题是递归部分。当然,显而易见的答案是行不通的:
WITH (NodeChildren) AS
{
--initialization
SELECT ParentNodeID, ChildNodeID, 1 AS GenerationsRemoved
FROM Nodes
UNION ALL
--recursive execution
SELECT parents.ParentNodeID, children.NodeID, parents.Generations+1
FROM NodeChildren parents
INNER JOIN NodeParents children
ON parents.NodeID = children.ParentNodeID
}
Msg 253, Level 16, State 1, Line 1
Recursive member of a common table expression 'NodeChildren' has multiple recursive references.
生成整个递归列表所需的所有信息都存在于初始 CTE 表中。但如果不允许这样做,我会尝试:
WITH (NodeChildren) AS
{
--initialization
SELECT ParentNodeID, NodeID, 1 AS GenerationsRemoved
FROM Nodes
UNION ALL
--recursive execution
SELECT parents.ParentNodeID, Nodes.NodeID, parents.Generations+1
FROM NodeChildren parents
INNER JOIN Nodes
ON parents.NodeID = nodes.ParentNodeID
}
但这失败了,因为它不仅加入了递归元素,而且一遍又一遍地递归地添加相同的行:
Msg 530, Level 16, State 1, Line 1
The statement terminated. The maximum recursion 100 has been exhausted before statement completion.
在 SQL Server 2000 中,我使用用户定义函数 (UDF) 模拟了 CTE:
CREATE FUNCTION [dbo].[fn_NodeChildren] ()
RETURNS @Result TABLE (
ParentNodeID int NULL,
ChildNodeID int NULL,
Generations int NOT NULL)
AS
/*This UDF returns all "ParentNode" - "Child Node" combinations
...even multiple levels separated
BEGIN
DECLARE @Generations int
SET @Generations = 1
--Insert into the Return table all "Self" entries
INSERT INTO @Result
SELECT ParentNodeID, NodeID, @Generations
FROM Nodes
WHILE @@rowcount > 0
BEGIN
SET @Generations = @Generations + 1
--Add to the Children table:
-- children of all nodes just added
-- (i.e. Where @Result.Generation = CurrentGeneration-1)
INSERT @Result
SELECT CurrentParents.ParentNodeID, Nodes.NodeID, @Generations
FROM Nodes
INNER JOIN @Result CurrentParents
ON Nodes.ParentNodeID = CurrentParents.ChildNodeID
WHERE CurrentParents.Generations = @Generations - 1
END
RETURN
END
阻止它爆炸的魔法是限制 where 子句:WHERE CurrentParents.Generations - @Generations-1
如何防止递归 CTE 永远递归?