I have a hierarchical table in MySQL: the parent field of each item points to the id field of its parent item. For each item I can get the list of all its parents [regardless of depth] using the query described here. With GROUP_CONCAT I get the full path as a single string:

SELECT GROUP_CONCAT(_id SEPARATOR ' > ') FROM (
SELECT  @r AS _id,
         (
         SELECT  @r := parent
         FROM    t_hierarchy
         WHERE   id = _id
         ) AS parent,
         @l := @l + 1 AS lvl
 FROM    (
         SELECT  @r := 200,
                 @l := 0
         ) vars,
         t_hierarchy h
WHERE    @r <> 0
ORDER BY lvl DESC
) x

I can make this work only if the id of the item is fixed [it's 200 in this case].
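
To make the expected output concrete, here is a minimal Python model of what the fixed-id query computes. The table contents are made up for illustration, and the loop only specifies the desired result; it is not the kind of solution being asked for.

```python
# Hypothetical sample data: id -> parent id (None marks a root).
PARENTS = {1: None, 2: 1, 3: 2, 200: 3}

def path_for(parents, item_id, separator=" > "):
    """Walk parent pointers upward from item_id, then render the path root-first."""
    chain = []
    current = item_id
    # One step per table row at most, like the join against t_hierarchy h.
    for _ in parents:
        # Like WHERE @r <> 0: stop on 0 or NULL (None here).
        if not current:
            break
        chain.append(current)
        current = parents.get(current)
    # ORDER BY lvl DESC puts the deepest ancestor (the root) first.
    return separator.join(str(i) for i in reversed(chain))

print(path_for(PARENTS, 200))  # 1 > 2 > 3 > 200
```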

I want to do the same for all rows: retrieve the whole table with one additional field (path) which will display the full path. The only solution that comes to my mind is to wrap this query in another select, set a temporary variable @id and use it inside the subquery. But it doesn't work. I get NULLs in the path field.

SELECT @id := id, parent, (
    SELECT GROUP_CONCAT(_id SEPARATOR ' > ') FROM (
    SELECT  @r AS _id,
             (
             SELECT  @r := parent
             FROM    t_hierarchy
             WHERE   id = _id
             ) AS parent,
             @l := @l + 1 AS lvl
     FROM    (
             SELECT  @r := @id,
                     @l := 0
             ) vars,
             t_hierarchy h
    WHERE    @r <> 0
    ORDER BY lvl DESC
    ) x
) as path
 FROM t_hierarchy

P.S. I know I can store the paths in a separate field and update them when inserting/updating, but I need a solution based on the linked list technique.

UPDATE: I would like to see a solution that does not use recursion or constructs like for and while. The above method for finding paths doesn't use any loops or functions. I want a solution along the same lines. Or, if that's impossible, please try to explain why!

2 Answers

Define a getPath function and run the following query:

select id, parent, dbo.getPath(id) as path from t_hierarchy 

The getPath function, written in SQL Server (T-SQL) syntax rather than MySQL:

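create function dbo.getPath(@id int)
returns varchar(400)
as
begin
    declare @path varchar(400)
    declare @term int
    declare @parent int
    -- Start with the item itself so the path matches the question's output.
    set @path = cast(@id as varchar(20))
    set @term = 0
    while (@term <> 1)
    begin
        set @parent = null
        select @parent = parent from t_hierarchy where id = @id
        -- Stop at a root (NULL or 0 parent) or a self-referencing row.
        if (@parent is null or @parent = 0 or @parent = @id)
            set @term = 1
        else
        begin
            -- Prepend the ancestor so the path reads root-first.
            set @path = cast(@parent as varchar(20)) + ' > ' + @path
            set @id = @parent
        end
    end
    return @path
end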
answered 2012-08-29T05:10:41.520

Consider the difference between the following two queries:

SELECT @id := id as id, parent, (
    SELECT concat(id, ': ', @id)
) as path
FROM t_hierarchy;

SELECT @id := id as id, parent, (
    SELECT concat(id, ': ', _id)
    FROM (SELECT @id as _id) as x
) as path
FROM t_hierarchy;

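They look nearly identical, but give dramatically different results. On my version of MySQL, _id in the second query is the same for every row of the result set, and equal to the id of the last row. However, that last part is only true because I executed the two queries in the order given; after SET @id := 1, for example, I can see that _id is always equal to the value from the SET statement.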

So what is going on here? EXPLAIN offers a clue:

mysql>     explain SELECT @id := id as id, parent, (
    ->         SELECT concat(id, ': ', _id)
    ->         FROM (SELECT @id as _id) as x
    ->     ) as path
    ->     FROM t_hierarchy;
+----+--------------------+-------------+--------+---------------+------------------+---------+------+------+----------------+
| id | select_type        | table       | type   | possible_keys | key              | key_len | ref  | rows | Extra          |
+----+--------------------+-------------+--------+---------------+------------------+---------+------+------+----------------+
|  1 | PRIMARY            | t_hierarchy | index  | NULL          | hierarchy_parent | 9       | NULL | 1398 | Using index    |
|  2 | DEPENDENT SUBQUERY | <derived3>  | system | NULL          | NULL             | NULL    | NULL |    1 |                |
|  3 | DERIVED            | NULL        | NULL   | NULL          | NULL             | NULL    | NULL | NULL | No tables used |
+----+--------------------+-------------+--------+---------------+------------------+---------+------+------+----------------+
3 rows in set (0.00 sec)

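The third row, the DERIVED table with "No tables used", tells MySQL that it can be computed exactly once, at any time. The server does not notice that the derived table uses a variable defined elsewhere in the query, and has no idea that you want it to run once per row. You are being bitten by a behavior described in the MySQL documentation on user-defined variables: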

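As a general rule, you should never assign a value to a user variable and read the value within the same statement. You might get the results you expect, but this is not guaranteed. The order of evaluation for expressions involving user variables is undefined and may change based on the elements contained within a given statement; in addition, this order is not guaranteed to be the same between releases of the MySQL Server.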

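In my case, the server chooses to compute that table first, before @id is (re)defined by the outer SELECT. In fact, that is exactly why the original hierarchical-data query works: the definition of @r is evaluated by MySQL before anything else in the query, precisely because it sits in that kind of derived table. Here, however, we need a way to reset @r once per table row, not just once for the whole query. To do that, we need a query that looks like the original one but resets @r by hand.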
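
The effect can be imitated with a toy Python model. This is only a sketch of the observed behavior, not of how MySQL is implemented; the function name and the sample ids are made up for illustration. The derived table reads @id once, before the outer SELECT has assigned anything, so every row sees the value left over from earlier in the session.

```python
def select_with_derived_table(ids, session_id):
    """Toy model: (SELECT @id AS _id) is materialized once, before the outer rows."""
    session = {"id": session_id}
    derived_id = session["id"]  # the derived table reads @id a single time
    result = []
    for row_id in ids:
        session["id"] = row_id  # outer SELECT: @id := id
        result.append((row_id, derived_id))
    return result, session["id"]

# Fresh session: @id is NULL, so every row gets NULL.
rows, leftover = select_with_derived_table([1, 2, 3], session_id=None)
print(rows)  # [(1, None), (2, None), (3, None)]

# Running again in the same session: every row gets the previous statement's last id.
rows, leftover = select_with_derived_table([1, 2, 3], session_id=leftover)
print(rows)  # [(1, 3), (2, 3), (3, 3)]
```

The first run corresponds to the NULLs reported in the question, the second to a constant _id equal to the last id of the previous statement.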

SELECT  @r := if(
          @c = th1.id,
          if(
            @r is null,
            null,
            (
              SELECT  parent
              FROM    t_hierarchy
              WHERE   id = @r
            )
          ),
          th1.id
        ) AS parent,
        @l := if(@c = th1.id, @l + 1, 0) AS lvl,
        @c := th1.id as _id
FROM    (
        SELECT  @c := 0,
                @r := 0,
                @l := 0
        ) vars
        left join t_hierarchy as th1 on 1
        left join t_hierarchy as th2 on 1
HAVING  parent is not null

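This query uses the second t_hierarchy in the same way the original query does: to make sure the result has enough rows for the parent subquery to loop over. It also adds, for each _id, a row that lists the item itself as a parent; without it, root objects (those with NULL in the parent field) would not appear in the results at all.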

Oddly, running the result through GROUP_CONCAT seems to disrupt the ordering. Fortunately, that function has its own ORDER BY clause:

SELECT  _id,
        GROUP_CONCAT(parent ORDER BY lvl desc SEPARATOR ' > ') as path,
        max(lvl) as depth
FROM    (
  SELECT  @r := if(
            @c = th1.id,
            if(
              @r is null,
              null,
              (
                SELECT  parent
                FROM    t_hierarchy
                WHERE   id = @r
              )
            ),
            th1.id
          ) AS parent,
          @l := if(@c = th1.id, @l + 1, 0) AS lvl,
          @c := th1.id as _id
  FROM    (
          SELECT  @c := 0,
                  @r := 0,
                  @l := 0
          ) vars
          left join t_hierarchy as th1 on 1
          left join t_hierarchy as th2 on 1
  HAVING  parent is not null
  ORDER BY th1.id
) as x
GROUP BY _id;

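Fair warning: these queries implicitly rely on the @r and @l updates happening before the @c update. MySQL does not guarantee that order, and it may change with any version of the server.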
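
For readers who want to trace the variable logic step by step, here is a minimal Python model of the final query. It assumes the rows of the cross join arrive grouped by th1.id and that @r and @l are updated before @c on every row, which is exactly the order the queries rely on. The sample data and the function name are hypothetical.

```python
# Hypothetical sample data: id -> parent id (None marks a root).
PARENTS = {1: None, 2: 1, 3: 2, 200: 3}

def all_paths(parents, separator=" > "):
    c, r, l = 0, 0, 0  # the vars derived table: @c, @r, @l
    rows = []
    for th1_id in sorted(parents):      # th1 provides the item
        for _ in parents:               # th2 only provides enough rows to climb
            if c == th1_id:
                # Same item as the previous row: climb one level.
                r = parents.get(r) if r is not None else None
                l += 1
            else:
                # New item: reset @r to the item itself and @l to 0.
                r, l = th1_id, 0
            c = th1_id                  # @c is updated last
            if r is not None:           # HAVING parent IS NOT NULL
                rows.append((th1_id, l, r))
    paths = {}
    # GROUP BY _id with GROUP_CONCAT(parent ORDER BY lvl DESC).
    for item_id, lvl, parent in sorted(rows, key=lambda t: (t[0], -t[1])):
        paths.setdefault(item_id, []).append(str(parent))
    return {item_id: separator.join(p) for item_id, p in paths.items()}

for item_id, path in all_paths(PARENTS).items():
    print(item_id, path)
# 1 1
# 2 1 > 2
# 3 1 > 2 > 3
# 200 1 > 2 > 3 > 200
```

If the assignment to c is moved above the if statement in this model, every row takes the climbing branch, the reset never happens, and no paths are produced. That mirrors the ordering dependency the warning describes.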

answered 2012-08-29T18:36:12.220