1

鉴于此表结构和示例数据(查询中不应使用 t3,此处仅显示 t1 和 t2 之间的关系):

      t1                 t2                         t3
--------------   -----------------   --------------------------------
| id | value |   | t1key | t3key |   | id | value                   |
|  1 |  2008 |   |     3 |     1 |   |  1 | "New intel cpu in 2010" |
|  2 |  2009 |   |     4 |     1 |   |  2 | "New amd cpu in 2008"   |
|  3 |  2010 |   |     6 |     1 |   |    |                     ... |
|  4 | intel |   |     1 |     2 |   --------------------------------
|  5 |   amd |   |     5 |     2 |
|  6 |   cpu |   |     6 |     2 |
|    |   ... |   |       |   ... |
--------------   -----------------

您将如何构建满足以下条件的 SQL 查询:

Given the input for t1.id is the set {6} returns t1.id set {3,4,6,1,5}
Given the input for t1.id is the set {6,4} returns t1.id set {3,4,6}
Given the input for t1.id is the set {5,4} returns t1.id set {}

当桌子更大时不会影响性能......?

4

5 回答 5

2

这是我的杰出贡献(至少让我们假设它现在很棒:)

SELECT DISTINCT a2.t1key, COUNT( * ) AS cnt
FROM t2 AS a1
    LEFT JOIN t2 AS a2 ON a2.t3key = a1.t3key
WHERE a1.t1key IN ( 6, 4 ) 
GROUP BY a2.t3key, a2.t1key
HAVING cnt >=2

IN (6,4)部分真的是不言自明的。cnt >=22 是子句中id-s的数量。IN例如:您正在使用,IN (6)那么您应该使用cnt >=1.

我不确定>是否需要,但我懒得不要创建更大的数据集来测试:)

于 2011-03-30T09:27:01.170 回答
1

It's not very clear what you want.

I will call table t1 word, call table t3 phrase and call table t2 word is in phrase.

Then I guess you want to find all word.ids that are in a same phrase as a specific set of word.ids. Is that correct?

SELECT DISTINCT t1.id
FROM t1 
  JOIN t2
    ON t1.id = t2.t1key
  JOIN t2 copyt2
    ON copyt2.t3key = t2.t3key 
WHERE copyt2.t1key IN
  (6,4)       --what you want to check here

CORRECTION

Reading Joe's comment and re-reading the question details, I guess you want to find all words that appear in same phrase with ALL words in your specified list.

This looks like a relational division problem:

SELECT DISTINCT t2a.t1key
FROM t2 AS t2a
WHERE NOT EXISTS
  ( SELECT *
    FROM t2 AS t2b
    WHERE t2b.t1key IN (6,4)
      AND NOT EXISTS
      ( SELECT *
        FROM t2 AS t2c
        WHERE t2a.t3key = t2c.t3key
          AND t2c.t1key = t2b.t1key
      )
  )

2nd solution:

SELECT a.t1key
FROM t2 AS a
  JOIN t2 as b
    ON  a.t3key = b.t3key
WHERE b.t1key IN (6,4)       --list you want to check
GROUP BY a.t1key, a.t3key
HAVING COUNT(*) = 2          --size of list
;

3rd solution:

SELECT DISTINCT t1key
FROM t2
WHERE t3key IN
  ( SELECT t3key
    FROM t2
    WHERE t1key IN (6,4)
    GROUP BY t3key
    HAVING COUNT(*) = 2
  )
;

Note: The first (with NON EXISTS) solution has a great difference with the other two:

If you try it with a list that its members do not appear in table t2, say (2) or (2,7), it will show ALL t1key's from t2.

The 2nd and 3rd solutions will show NO keys at all in such a case.

于 2011-03-30T06:52:52.807 回答
1
select distinct t1key
from t2
where t3key in
(
    select t3key from t2 where t1key = 6
    intersect
    select t3key from t2 where t1key = 4
)

==> 3, 4, 6

You would need to add more "intersect" clauses depending on how many items are in your input set.

Tested on SQL Server.

于 2011-03-30T06:52:59.277 回答
1
select distinct t2b.t1key
from 
  t2 t2a
  inner join t2 t2b on t2a.t3key = t2b.t3key
where t2a.t1key in (6, 5) /* or whatever */

Starting on t1 (the keyword), you get all the t3 (expressions) which contain "cpu" (or whatever). You don't need to join t3 directly, you don't need any data from there. Joining t2 a second time you get all other keywords which are contained in the found expressions. You only need to return the t1key's of them.


Correction: If you don't want subqueries, you could create a join for each keyword to search for:

select distinct t2b.t1key
from 
  t2 t2a
  inner join t2 t2b on t2a.t3key = t2b.t3key and t2a.t1key = 6
  inner join t2 t2c on t2a.t3key = t2c.t3key and t2a.t1key = 5
于 2011-03-30T06:53:14.407 回答
0

He there, Are you sure you have chosen the right table structure? It doesn't seem to be normalized - though I don't know exactly what entity each table could represent.

Its important to keep your database design at least in the third normal form (see Wikipedia article

Your queries will be much more natural and easily formulated

于 2011-03-30T06:52:29.547 回答