1

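I have two tables, products and categories, related many-to-many through a third junction table. Each product can belong to several categories. It is a typical many-to-many relationship: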

products
-------------
id
product_name


categories
-------------
id
category_name


products_to_categories
-------------
product_id
category_id

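I want to let users search for products that are in some selected categories and, at the same time, not in other selected categories.

Example: find all products that belong to the "Computers" and "Software" categories but are not in the "Games", "Programming", or "Education" categories.

Here is the query I designed for this: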

SELECT product_name
FROM products
WHERE
    EXISTS (SELECT product_id FROM products_to_categories WHERE category_id = 1 AND product_id = products.id) 
    AND EXISTS (SELECT product_id FROM products_to_categories WHERE category_id = 2 AND product_id = products.id) 
    AND NOT EXISTS (SELECT product_id FROM products_to_categories WHERE category_id = 3 AND product_id = products.id)
    AND NOT EXISTS (SELECT product_id FROM products_to_categories WHERE category_id = 4 AND product_id = products.id) 
    AND NOT EXISTS (SELECT product_id FROM products_to_categories WHERE category_id = 5 AND product_id = products.id)
ORDER BY id

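It works. But it is so slow that I cannot use it in production. All indexes are in place, but this query results in 5 dependent subqueries, and the tables are huge.

Is there a way to solve the same task without dependent subqueries, or to optimize this query in some other way?

UPDATE

The indexes are: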

products: PRIMARY KEY (id)
categories: PRIMARY KEY (id)
products_to_categories: PRIMARY KEY (product_id, category_id)

All tables are InnoDB.

4 Answers

2

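Please post the tables' definitions (so that the engine used and the indexes defined are shown).

You could also post the query's execution plan (using the EXPLAIN statement).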
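
For reference, both pieces of information can be obtained in MySQL roughly as sketched below. The query is a shortened version of the one in the question, used only for illustration. In the EXPLAIN output, dependent subqueries typically show up as DEPENDENT SUBQUERY in the select_type column.

```sql
-- Table definition, including the storage engine and all indexes
SHOW CREATE TABLE products_to_categories;

-- Execution plan: prefix the query with EXPLAIN
-- (shortened form of the question's query, for illustration only)
EXPLAIN
SELECT product_name
FROM products
WHERE
    EXISTS (SELECT product_id FROM products_to_categories
            WHERE category_id = 1 AND product_id = products.id)
    AND NOT EXISTS (SELECT product_id FROM products_to_categories
                    WHERE category_id = 3 AND product_id = products.id)
ORDER BY id;
```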

You could also try rewriting the query in various ways. Here is one:

SELECT p.product_name
FROM products  AS p
  JOIN products_to_categories  AS pc1
    ON pc1.category_id = 1 
    AND pc1.product_id = p.id
  JOIN products_to_categories  AS pc2
    ON  pc2.category_id = 2 
    AND pc2.product_id = p.id
WHERE
    NOT EXISTS 
    ( SELECT * 
      FROM products_to_categories  AS pc 
      WHERE pc.category_id IN (3, 4, 5)
        AND pc.product_id = p.id
    )

Update: You don't have a (category_id, product_id) index. Try adding it.
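
A minimal sketch of adding that index in MySQL follows. The index name idx_category_product is a placeholder chosen here, not something from the question. The existing primary key starts with product_id, so a lookup that filters on category_id first may not be able to use it efficiently; an index with the columns in the opposite order covers that case.

```sql
-- idx_category_product is a hypothetical name; pick any name you like
ALTER TABLE products_to_categories
    ADD INDEX idx_category_product (category_id, product_id);
```

Re-running EXPLAIN afterwards should show whether the optimizer actually picks the new index.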

answered 2012-02-11T18:02:18.440
0
SELECT products.product_name
FROM products
-- we can use an inner join as an optimization, as some categories MUST exist
INNER JOIN products_to_categories AS pc ON pc.product_id = products.id
WHERE
  pc.category_id = 1 -- one of the required category IDs
  AND EXISTS (SELECT product_id FROM products_to_categories WHERE category_id = 2 AND product_id = products.id)
  -- filtering joined rows with NOT IN (3,4,5) would not exclude the product itself, so test for absence instead
  AND NOT EXISTS (SELECT product_id FROM products_to_categories WHERE category_id IN (3,4,5) AND product_id = products.id) -- substitute unwanted category IDs
answered 2012-02-11T17:33:54.547
0

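I deleted my answer because the other answers are more comprehensive. Just a general tip: to reduce the number of ANDs in the statement, you can use the IN operator to check several categories at once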

where category_id IN(1,2)

or

where category_id NOT IN(1,2)
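
Note that IN (1, 2) on its own matches a product that is in either category, not necessarily both. One way to keep the IN lists and still require all wanted categories is to group the junction rows per product and count the matches. The following is only a sketch under the assumption that (product_id, category_id) is unique, as the primary key in the question implies; it has not been measured against the real data.

```sql
SELECT p.product_name
FROM products AS p
  JOIN products_to_categories AS pc
    ON pc.product_id = p.id
-- only rows for categories we care about (wanted or unwanted)
WHERE pc.category_id IN (1, 2, 3, 4, 5)
GROUP BY p.id, p.product_name
-- both wanted categories present ...
HAVING SUM(CASE WHEN pc.category_id IN (1, 2) THEN 1 ELSE 0 END) = 2
-- ... and none of the unwanted ones
   AND SUM(CASE WHEN pc.category_id IN (3, 4, 5) THEN 1 ELSE 0 END) = 0
ORDER BY p.id;
```

The constant 2 must equal the number of wanted categories, so it changes with the user's selection.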
answered 2012-02-11T17:35:07.777
0

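I think you want to avoid those IN clauses, because SQL Server will either execute multiple queries or perform an "or", which will be less efficient than what I pasted below, since it may not be able to take advantage of the indexes.

You could also get rid of the #product_categories_filtered temp table and do everything in one big query, using aliased subqueries as needed. You may want to try different configurations to see which works best, but temp tables have never been a performance problem in my applications unless someone tried to query something with tens of millions of records. I used #product_categories_filtered because in some cases SQL Server queries run better when you break them up to use fewer joins, especially on larger tables like your product table.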

create table #includes (category_id int not null primary key)
create table #excludes (category_id int not null primary key)

insert #includes (category_id) 
    select 1
    union all select 2
insert #excludes (category_id) 
    select 3
    union all select 4
    union all select 5

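-- products that have every included category and no excluded one
select 
  pc.product_id
into #product_categories_filtered
from 
  products_to_categories pc
  join #includes i 
    on pc.category_id = i.category_id
  -- ex: products that have at least one excluded category
  left join (
    select distinct x.product_id
    from products_to_categories x
      join #excludes e 
        on x.category_id = e.category_id
  ) ex
    on ex.product_id = pc.product_id
where 
  ex.product_id is null
group by 
  pc.product_id
having 
  count(*) = (select count(*) from #includes)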


select distinct
  p.product_name
from 
  #product_categories_filtered pc
  join products p
    on pc.product_id = p.id
order by 
  p.id
answered 2012-02-11T17:45:53.093