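I am using PostgreSQL and have a table whose path column is of type ltree.

The problem I am trying to solve: given the whole tree structure, which parent, other than the root, has the most children?

The sample data looks like this: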

path column = ; has a depth of 0 and has 11 children its id is 1824 # don't want this one because it's the root
path column = ; has a depth of 0 and has 1 children its id is 1823
path column = 1823; has a depth of 1 and has 1 children its id is 1825
path column = 1823.1825; has a depth of 2 and has 1 children its id is 1826
path column = 1823.1825.1826; has a depth of 3 and has 1 children its id is 1827
path column = 1823.1825.1826.1827; has a depth of 4 and has 1 children its id is 1828
path column = 1824.1925.1955.1959.1972.1991; has a depth of 6 and has 5 children its id is 2001
path column = 1824.1925.1955.1959.1972.1991.2001; has a depth of 7 and has 1 children its id is 2141
path column = 1824.1925.1955.1959.1972.1991.2001; has a depth of 7 and has 0 children its id is 2040
path column = 1824.1925.1955.1959.1972.1991.2001; has a depth of 7 and has 1 children its id is 2054
path column = 1824.1925.1955.1959.1972.1991.2001; has a depth of 7 and has 0 children its id is 2253
path column = 1824.1925.1955.1959.1972.1991.2001; has a depth of 7 and has 1 children its id is 2166
path column = 1824.1925.1955.1959.1972.1991.2001.2054; has a depth of 8 and has 0 children its id is 2205
path column = 1824.1925.1955.1959.1972.1991.2001.2141; has a depth of 8 and has 0 children its id is 2161
path column = 1824.1925.1955.1959.1972.1991.2001.2166; has a depth of 8 and has 1 children its id is 2389
path column = 1824.1925.1955.1959.1972.1991.2001.2166.2389; has a depth of 9 and has 0 children its id is 2402
path column = 1824.1925.1983; has a depth of 3 and has 1 children its id is 2135
path column = 1824.1925.1983.2135; has a depth of 4 and has 0 children its id is 2239
path column = 1824.1926; has a depth of 2 and has 5 children its id is 1942
path column = 1824.1926; has a depth of 2 and has 11 children its id is 1928 # this is the row I am after
path column = 1824.1926; has a depth of 2 and has 2 children its id is 1933
path column = 1824.1926; has a depth of 2 and has 2 children its id is 1989
path column = 1824.1926.1928; has a depth of 3 and has 3 children its id is 2051
path column = 1824.1926.1928; has a depth of 3 and has 0 children its id is 2024
path column = 1824.1926.1928; has a depth of 3 and has 2 children its id is 1988

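So, in this example, the row with id 1824 (the root) has 11 children, and the row with id 1928 has 11 children at depth 2; that is the row I am after.

I am new to ltree and SQL.

(This is a revised question, with sample data added after "Ltree find parent with most children postgresql" was closed.)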

1 Answer

Solution

To find the node with the most children:

SELECT subpath(path, -1, 1), count(*) AS children
FROM   tbl
WHERE  path <> ''
GROUP  BY 1
ORDER  BY 2 DESC
LIMIT  1;

... and to exclude root nodes:

SELECT *
FROM  (
   SELECT ltree2text(subpath(path, -1, 1))::int AS tbl_id, count(*) AS children
   FROM   tbl
   WHERE  path <> ''
   GROUP  BY 1
   ) ct
LEFT   JOIN (
   SELECT tbl_id
   FROM   tbl
   WHERE  path = ''
   ) x USING  (tbl_id)
WHERE  x.tbl_id IS NULL
ORDER  BY children DESC
LIMIT  1;

This assumes that root nodes have an empty ltree ('') as path. The path could be NULL instead. In that case, use path IS NULL ...
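A minimal sketch of that NULL variant, assuming root rows store NULL in path and no row has an empty path. It swaps the LEFT JOIN / IS NULL anti-join for NOT EXISTS purely for illustration; table and column names are the ones used in this answer.

```sql
-- Sketch only: assumes root rows have path IS NULL (not '').
SELECT ct.*
FROM  (
   SELECT ltree2text(subpath(path, -1, 1))::int AS tbl_id, count(*) AS children
   FROM   tbl
   WHERE  path IS NOT NULL           -- root rows have no parent to count
   GROUP  BY 1
   ) ct
WHERE  NOT EXISTS (                  -- drop parents that are themselves roots
   SELECT 1
   FROM   tbl r
   WHERE  r.tbl_id = ct.tbl_id
   AND    r.path IS NULL
   )
ORDER  BY ct.children DESC
LIMIT  1;
```

Note that path <> '' also filters out NULL paths, but path = '' never matches them, so the subquery that identifies root rows is the part that has to change.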

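The winner in your example is actually 2001, with 5 children.

-> SQLfiddle

How?

  • Use the function subpath(...) provided by the additional module ltree.

  • Get the last node of the path with a negative offset; it is the direct parent of the element.

  • Count how often each parent appears, exclude root nodes, and take the remaining node with the highest count.

  • Use ltree2text() to extract the value from the ltree.

  • If multiple nodes share the highest number of children, the example picks an arbitrary one.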
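The steps above can be tried in isolation. The first statement below applies subpath() and ltree2text() to a literal path taken from the question. The second is one possible way to return every tied winner instead of an arbitrary one; it uses the window function rank() and NOT EXISTS, which are not part of the queries above, and it assumes the same tbl / tbl_id / path layout with '' as the root path.

```sql
-- Last label of a path = direct parent of the row that stores this path.
SELECT subpath('1824.1926.1928'::ltree, -1, 1)                  AS parent_label  -- 1928 (type ltree)
     , ltree2text(subpath('1824.1926.1928'::ltree, -1, 1))::int AS parent_id;    -- 1928 (type int)

-- Sketch: return all non-root parents that share the highest child count.
SELECT tbl_id, children
FROM  (
   SELECT ct.tbl_id, ct.children
        , rank() OVER (ORDER BY ct.children DESC) AS rnk
   FROM  (
      SELECT ltree2text(subpath(path, -1, 1))::int AS tbl_id, count(*) AS children
      FROM   tbl
      WHERE  path <> ''
      GROUP  BY 1
      ) ct
   WHERE  NOT EXISTS (               -- exclude parents that are root rows
      SELECT 1
      FROM   tbl r
      WHERE  r.tbl_id = ct.tbl_id
      AND    r.path = ''
      )
   ) ranked
WHERE  rnk = 1;
```

The root filter runs before rank() is computed, so ranking only considers non-root parents.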

Test case

This is the work I had to do to arrive at a useful test case (after trimming some noise):

See SQLfiddle
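The fiddle itself is not reproduced here. A reduced setup along the following lines, built from a subset of the rows in the question, should be enough to try the queries. The schema (tbl, tbl_id, path) is inferred from the queries in this answer; it is an assumption, not a copy of the original fiddle.

```sql
-- Hypothetical reduced test case; rows are a subset of the question's sample data.
CREATE EXTENSION IF NOT EXISTS ltree;

CREATE TABLE tbl (
   tbl_id int PRIMARY KEY
 , path   ltree               -- labels of all ancestors; '' for root rows
);

INSERT INTO tbl (tbl_id, path) VALUES
   (1824, '')
 , (1823, '')
 , (1825, '1823')
 , (1942, '1824.1926')
 , (1928, '1824.1926')
 , (1933, '1824.1926')
 , (1989, '1824.1926')
 , (2051, '1824.1926.1928')
 , (2024, '1824.1926.1928')
 , (1988, '1824.1926.1928')
 , (2001, '1824.1925.1955.1959.1972.1991')
 , (2141, '1824.1925.1955.1959.1972.1991.2001')
 , (2040, '1824.1925.1955.1959.1972.1991.2001')
 , (2054, '1824.1925.1955.1959.1972.1991.2001')
 , (2253, '1824.1925.1955.1959.1972.1991.2001')
 , (2166, '1824.1925.1955.1959.1972.1991.2001');
```

With only these rows, the query that excludes root nodes returns tbl_id 2001 with 5 children.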

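In other words: please remember to provide a useful test case next time.

Additional columns

In reply to the comment.
First, extend the test case: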

ALTER TABLE tbl ADD COLUMN postal_code text
              , ADD COLUMN whatever serial;
UPDATE tbl SET postal_code = (1230 + whatever)::text;

Take a look:

SELECT * FROM tbl;

Then simply JOIN the resulting parent back to the base table:

SELECT ct.*, t.postal_code
FROM  (
   SELECT ltree2text(subpath(path, -1, 1))::int AS tbl_id, count(*) AS children
   FROM   tbl
   WHERE  path <> ''
   GROUP  BY 1
   ) ct
LEFT   JOIN (
   SELECT tbl_id
   FROM   tbl
   WHERE  path = ''
   ) x USING  (tbl_id)
JOIN  tbl t USING (tbl_id)
WHERE  x.tbl_id IS NULL
ORDER  BY children DESC
LIMIT  1;

Answered 2013-03-24T22:26:16.917