mysql - MySQL query join: Pages in Sections OR Subsections, is there a better alternative to GROUP BY for eliminating dupes?

Question

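Update: I am using the SQL query shown in my question in production, but if you would like to see another approach, read the whole thread for a version that uses UNION.

I have experimented and produced a result set for a content search, but I want to make sure it performs as well as it can.

I have a table called SECTIONS that holds two levels of sections in an adjacency list model: level 1 (a section) and level 2 (a subsection).

SECTIONS: id, parent_id, name

I query that table twice to get the columns arranged as

sec_id, sec_name, subsec_id, subsec_name

(so that I can create URI links like /section_id/subsection_id)

Now I join a separate table called PAGES, where a page can be associated with either a section or a subsection (not both) through the field section_id.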

-- columns to return
SELECT
s.id as section_id,
s.name as section_name,
ss.id as subsection_id,
ss.parent_id as subsection_parent_id,
ss.name as subsection_name,
p.section_id as page_section_id,
p.name as page_name

-- join SECTIONS into Sections and SubSections
FROM 
( select id, name from sections where parent_id=0 ) as s

LEFT JOIN
( select id, parent_id, name from sections where parent_id!=0 ) as ss

ON
ss.parent_id = s.id

-- now join to PAGES table
JOIN 
( select id, section_id, name from pages where active=1 ) as p

ON
(
p.section_id = s.id
OR
p.section_id = ss.id 
)
-- need to use GROUP BY to eliminate duplicate pages
GROUP BY p.id
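
To see where the duplicates come from, here is a minimal sketch of the two tables with a few sample rows. The column types, the `active` flag's type, and the sample names are assumptions made for illustration; the question only gives the column names.

```sql
-- Assumed schema: only the column names come from the question.
CREATE TABLE sections (
    id        INT UNSIGNED NOT NULL PRIMARY KEY,
    parent_id INT UNSIGNED NOT NULL DEFAULT 0,  -- 0 marks a top-level section
    name      VARCHAR(100) NOT NULL
);

CREATE TABLE pages (
    id         INT UNSIGNED NOT NULL PRIMARY KEY,
    section_id INT UNSIGNED NOT NULL,           -- points at a section OR a subsection
    name       VARCHAR(100) NOT NULL,
    active     TINYINT(1) NOT NULL DEFAULT 1
);

-- One section with two subsections (illustrative data).
INSERT INTO sections (id, parent_id, name) VALUES
    (1, 0, 'News'),
    (2, 1, 'Local'),
    (3, 1, 'World');

-- Page 1 hangs off the section itself, page 2 off a subsection.
INSERT INTO pages (id, section_id, name, active) VALUES
    (1, 1, 'About the news desk', 1),
    (2, 2, 'Local story', 1);
```

With this data, the section/subsection join yields two rows for section 1 (one per subsection). Page 1 satisfies `p.section_id = s.id` on both of them, so it comes back twice unless the rows are grouped.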

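I was getting duplicate pages in the result set, so I used GROUP BY pages.id to remove them, but this degrades performance slightly.

Can you suggest a better way to eliminate the duplicates?

I have considered creating a column in the SECTIONS join that contains either the section ID or the subsection ID (depending on the type of row, section or subsection) and then using it to relate to PAGES section_id, so that there would be no duplicate rows, but I can't figure out how to do that.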
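
One possible way to realize that idea, offered as a sketch rather than a tested replacement: join each page to the single SECTIONS row it points at, then look up that row's parent. It assumes no section has id 0, so a top-level row finds no parent and the LEFT JOIN leaves those columns NULL. The aliases `sec`, `parent`, `page_id`, and `page_name` are illustrative names.

```sql
SELECT
    -- if a parent exists, the page's row is a subsection and the parent is the section
    COALESCE(parent.id, sec.id)     AS section_id,
    COALESCE(parent.name, sec.name) AS section_name,
    -- subsection columns stay NULL for pages attached to a top-level section
    CASE WHEN parent.id IS NULL THEN NULL ELSE sec.id   END AS subsection_id,
    CASE WHEN parent.id IS NULL THEN NULL ELSE sec.name END AS subsection_name,
    p.id   AS page_id,
    p.name AS page_name
FROM pages AS p
JOIN sections AS sec
    ON sec.id = p.section_id
LEFT JOIN sections AS parent
    ON parent.id = sec.parent_id
WHERE p.active = 1;
```

Each page matches exactly one `sec` row and at most one `parent` row, so no GROUP BY is needed to remove duplicates.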

Thanks

score 1 · Accepted Answer

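You get duplicate pages because you do not distinguish pages related to a level-1 section from pages related to a level-2 section. Instead, split the pages into two distinct groups: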

-- pages related to a level-2 section
SELECT
    p.id, p.section_id, p.name,
    l1.id AS section_id, l1.name AS section_name,
    l2.id AS subsection_id, l2.name AS subsection_name
FROM pages AS p
JOIN sections AS l2 ON (
    l2.id = p.section_id AND
    l2.parent_id <> 0
)
JOIN sections AS l1 ON (
    l1.id = l2.parent_id
)
WHERE p.active = 1

UNION

-- pages related to a level-1 section
SELECT
    p.id, p.section_id, p.name,
    l1.id AS section_id, l1.name AS section_name,
    NULL, NULL -- do not join with sub-sections, so as to avoid duplicates
FROM pages AS p
JOIN sections AS l1 ON (
    l1.id = p.section_id AND
    l1.parent_id = 0
)
WHERE p.active = 1
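
A possible refinement, sketched here with a shortened column list: a SECTIONS row has either `parent_id = 0` or `parent_id <> 0`, so the two branches can never return the same page. Plain UNION still runs a duplicate-elimination pass over the combined result; UNION ALL skips it. The alias `page_id` is an illustrative name.

```sql
-- pages attached to a level-2 section
SELECT p.id AS page_id, p.name AS page_name,
       l1.id AS section_id, l2.id AS subsection_id
FROM pages AS p
JOIN sections AS l2 ON l2.id = p.section_id AND l2.parent_id <> 0
JOIN sections AS l1 ON l1.id = l2.parent_id
WHERE p.active = 1

UNION ALL  -- the branches cannot overlap, so no de-duplication is needed

-- pages attached to a level-1 section
SELECT p.id, p.name, l1.id, NULL
FROM pages AS p
JOIN sections AS l1 ON l1.id = p.section_id AND l1.parent_id = 0
WHERE p.active = 1;
```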

score 0 · Accepted Answer

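This is going to be long :(

Note that I did not end up using this approach, because it performed worse than my original attempt with GROUP BY.

I had to modify the design of the PAGES table to include a new column holding the id of the subsection the page belongs to, so the PAGES table now has a column indicating its section as well as one for its subsection. This structural change was for testing only; I did not use it in the final version.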
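
The exact statements used for that change are not shown in the thread. A sketch of what it could look like, assuming the new column is named `subsection_id` (the name the query below joins on), that 0 means "no subsection", and that existing subsection pages need their `section_id` moved up to the parent:

```sql
-- Assumed column type and default; 0 = page belongs to a section only.
ALTER TABLE pages
    ADD COLUMN subsection_id INT UNSIGNED NOT NULL DEFAULT 0;

-- For pages that currently point at a subsection, store the subsection id
-- in the new column and point section_id at the parent section instead.
UPDATE pages AS p
JOIN sections AS ss
    ON ss.id = p.section_id
   AND ss.parent_id <> 0
SET p.subsection_id = ss.id,
    p.section_id    = ss.parent_id;
```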

Here is the query I created using the concept of a UNION between two queries.

SELECT
* 
FROM
  pages AS p
JOIN
-- create derived table of sections and subsections
  ( -- separate query to get sections (parent id = 0 )
    SELECT 
        s.id AS page_sec_id,
        s.id AS sec_id,
        s.name AS sec_name,
        NULL AS subsec_id,
        NULL AS subsec_name,
        s.parent_id AS parent_id
    FROM
        sections AS s
    WHERE
        s.parent_id = 0
   UNION
    -- separate query to get subsection (parent id != 0)
    SELECT
        ss.id AS page_sec_id,
        ss.parent_id AS sec_id,
        -- need to get section name, so had to use weird subquery
        (SELECT name FROM sections WHERE parent_id =0 AND id = ss.parent_id) AS sec_name,
        ss.id AS subsec_id,
        ss.name AS subsec_name,
        ss.parent_id AS parent_id
    FROM
        sections AS ss
    WHERE
        ss.parent_id != 0
   )  AS sss

ON
    -- specify how PAGES table is joined to this derived table of sections and subsections

    -- pages linked to sections only
        ( p.section_id = sss.sec_id AND p.subsection_id = 0 AND sss.parent_id = 0)
        OR
    -- pages linked to subsections only
        ( p.section_id = sss.sec_id AND p.subsection_id = sss.subsec_id )

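This UNION query took 0.0388 seconds to process 5 rows of pages and 4 rows of sections/subsections, whereas the original query took 0.0017 seconds, so I stuck with the original query from my question. By the way, in my development environment MySQL runs on a P3 Katmai 450 MHz with 256 RAM, to force me to write efficient queries :)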
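
Timings on 5 and 4 rows may not say much about how either query behaves on larger tables. A sketch for inspecting the query plan instead, assuming the join columns are not indexed yet; the index names are hypothetical, and whether the optimizer actually uses them for a given query is not guaranteed.

```sql
-- Hypothetical index names; skip any index that already exists.
CREATE INDEX idx_pages_section_id   ON pages (section_id);
CREATE INDEX idx_sections_parent_id ON sections (parent_id);

-- EXPLAIN shows the join order and which indexes MySQL chooses.
EXPLAIN
SELECT p.id, p.name, s.id AS section_id
FROM pages AS p
JOIN sections AS s ON s.id = p.section_id
WHERE p.active = 1;
```

Prefixing any of the queries in this thread with EXPLAIN in the same way allows their plans to be compared side by side.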

Thanks for reading; if you have other ideas and comments, please add them.
