2

Here is a simplified version of my data:

products:
+----+-----------+
| id | name      |
+----+-----------+
|  1 | Product X |
|  2 | Product Y |
|  3 | Product Z |
+----+-----------+

categories:
+----+---------------+
| id | name          |
+----+---------------+
|  1 | Hotel         |
|  2 | Accommodation |
+----+---------------+

category_product:
+----+------------+-------------+
| id | product_id | category_id |
+----+------------+-------------+
|  1 |          1 |           1 |
|  2 |          1 |           2 |
|  3 |          2 |           1 |
|  4 |          3 |           2 |
+----+------------+-------------+

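How can I build an efficient query that retrieves only the products associated with both the "Hotel" and "Accommodation" categories (e.g. Product X)?

I first tried a join approach: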

SELECT *
FROM products p
JOIN category_product cp
ON p.id = cp.product_id
WHERE cp.category_id = 1 OR cp.category_id = 2

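^ This does not work, because it does not restrict the results to products that have both categories: a product linked to only one of them still matches.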
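
To reproduce the problem on the sample data, here is a minimal sketch. It assumes an in-memory SQLite database driven from Python's built-in sqlite3 module, chosen only to keep the demo self-contained; the actual database engine is not stated above. The variable name or_join_rows is illustrative.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE category_product (
        id INTEGER PRIMARY KEY,
        product_id INTEGER,
        category_id INTEGER
    );
    INSERT INTO products VALUES
        (1, 'Product X'), (2, 'Product Y'), (3, 'Product Z');
    INSERT INTO category_product VALUES
        (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 3, 2);
""")

# The OR filter keeps any link row whose category is 1 or 2, so
# products with only one of the two categories are returned as well.
or_join_rows = conn.execute("""
    SELECT p.name, cp.category_id
    FROM products p
    JOIN category_product cp ON p.id = cp.product_id
    WHERE cp.category_id = 1 OR cp.category_id = 2
    ORDER BY p.id, cp.category_id
""").fetchall()

for row in or_join_rows:
    print(row)
```

All three products come back (Product X twice, once per category), which is why the OR join cannot answer the "both categories" question on its own.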

I found an approach that works using subqueries... but I have been warned against using subqueries for performance reasons:

SELECT *
FROM products p
WHERE
(
    SELECT id
    FROM category_product
    WHERE product_id = p.id
    AND category_id = 1
)
AND
(
    SELECT id
    FROM category_product
    WHERE product_id = p.id
    AND category_id = 2
)

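Is there a better solution (or what would an alternative look like)? I have considered denormalizing the categories into extra columns on the products table, but would ideally like to avoid that. Hoping for a magic-bullet solution!

Update

I have run some of the (great) solutions provided in the answers. My data is 235,000 category_product rows and 58,000 products. Obviously, benchmarks always depend on the environment, indexes, and so on.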

"Relational division" @podiluska

2 categories: 2826 rows ~ 20 ms
5 categories: 46 rows ~ 25-30 ms
8 categories: 1 row ~ 25-30 ms

"Where exists" @Tim Schmelter

2 categories: 2826 rows ~ 5-7 ms
5 categories: 46 rows ~ 30 ms
8 categories: 1 row ~ 300 ms

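One can see the results start to diverge as the number of categories increases. I will look into using "relational division", as it gives consistent timings, but the implementation may also lead me to look at "where exists" (long form: http://pastebin.com/6NRX0QbJ).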

5 Answers

4
SELECT p.*
FROM products p
    INNER JOIN
(
    SELECT product_id
    FROM category_product
    WHERE category_id IN (1,2)
    GROUP BY product_id
    HAVING COUNT(DISTINCT category_id) = 2
) pc
    ON p.id = pc.product_id

This technique is called "relational division".
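
As a rough sketch of how this generalizes to any number of categories: the IN list and the expected count are built from the same de-duplicated list of ids. The helper name products_in_all_categories is hypothetical, and the in-memory SQLite database (via Python's sqlite3 module) is only an assumption to make the example runnable against the question's sample data.

```python
import sqlite3

def products_in_all_categories(conn, category_ids):
    """Return (id, name) of products linked to every id in category_ids."""
    wanted = sorted(set(category_ids))  # de-duplicate so the count matches
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT p.id, p.name
        FROM products p
        INNER JOIN (
            SELECT product_id
            FROM category_product
            WHERE category_id IN ({placeholders})
            GROUP BY product_id
            HAVING COUNT(DISTINCT category_id) = ?
        ) pc ON p.id = pc.product_id
        ORDER BY p.id
    """
    # Bind the category ids first, then the number of distinct ids required.
    return conn.execute(sql, (*wanted, len(wanted))).fetchall()

# Assumption: in-memory SQLite loaded with the question's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE category_product (
        id INTEGER PRIMARY KEY,
        product_id INTEGER,
        category_id INTEGER
    );
    INSERT INTO products VALUES
        (1, 'Product X'), (2, 'Product Y'), (3, 'Product Z');
    INSERT INTO category_product VALUES
        (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 3, 2);
""")

both = products_in_all_categories(conn, [1, 2])
hotel_only = products_in_all_categories(conn, [1])
print(both)
print(hotel_only)
```

Only Product X survives when both ids are required, while asking for category 1 alone returns Product X and Product Y. The query shape stays the same regardless of how many categories are passed in.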

answered 2012-10-26T08:50:48.627
0
select *
from products p
where
    (
        select
            count(distinct cp.category_id)
        from category_product as cp
        where
            cp.product_id = p.id and
            cp.category_id in (1, 2)
    ) = 2

Or you can use EXISTS:

select *
from products p
where
    exists
    (
        select
            count(distinct cp.category_id)
        from category_product as cp
        where
            cp.product_id = p.id and
            cp.category_id in (1, 2)
        having count(distinct cp.category_id) = 2
    )
answered 2012-10-26T08:50:57.203
0

I would use EXISTS:

SELECT P.* FROM Products P
WHERE EXISTS
(
    SELECT 1 FROM category_product cp
    WHERE cp.product_id = p.id
    AND category_id = 1
)
AND EXISTS
(
    SELECT 1 FROM category_product cp
    WHERE cp.product_id = p.id
    AND category_id = 2
)
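
For more than two categories, the same pattern needs one EXISTS clause per category. A minimal sketch of generating that "long form" programmatically follows; the helper name products_with_exists and the constant EXISTS_CLAUSE are hypothetical, and the in-memory SQLite database (via Python's sqlite3 module) is an assumption used only to run it against the question's sample data.

```python
import sqlite3

# One correlated EXISTS check; repeated once per required category.
EXISTS_CLAUSE = """EXISTS (
    SELECT 1 FROM category_product cp
    WHERE cp.product_id = p.id AND cp.category_id = ?
)"""

def products_with_exists(conn, category_ids):
    """Return (id, name) of products that have a link row for every id."""
    wanted = sorted(set(category_ids))
    if not wanted:
        return []
    where = "\nAND ".join(EXISTS_CLAUSE for _ in wanted)
    sql = f"SELECT p.id, p.name FROM products p WHERE {where} ORDER BY p.id"
    return conn.execute(sql, wanted).fetchall()

# Assumption: in-memory SQLite loaded with the question's sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE category_product (
        id INTEGER PRIMARY KEY,
        product_id INTEGER,
        category_id INTEGER
    );
    INSERT INTO products VALUES
        (1, 'Product X'), (2, 'Product Y'), (3, 'Product Z');
    INSERT INTO category_product VALUES
        (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 3, 2);
""")

exists_both = products_with_exists(conn, [1, 2])
exists_accommodation = products_with_exists(conn, [2])
print(exists_both)
print(exists_accommodation)
```

The statement grows by one subquery per category, so its cost may scale with the length of the list; how it compares with a grouped query depends on the engine and the available indexes.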
answered 2012-10-26T08:51:51.163
0
SELECT categories.name, products.name
FROM
category_product, categories, products
WHERE
    category_product.product_id = products.id
AND
    category_product.category_id = categories.id
AND
    category_product.product_id IN
    (
      -- products linked to both category 1 and category 2
      -- (COUNT(1) assumes no duplicate product/category pairs)
      SELECT cp.product_id FROM category_product cp
      WHERE
      cp.category_id = 1
      OR
      cp.category_id = 2
      GROUP BY cp.product_id HAVING COUNT(1) = 2
    )
answered 2012-10-26T09:09:30.443
-1
SELECT p.id
FROM products p
JOIN category_product cp
ON p.id = cp.product_id
WHERE cp.category_id IN (1,2)
GROUP BY p.id
HAVING COUNT(DISTINCT cp.category_id) = 2
answered 2012-10-26T08:53:41.937