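I have a database with the following tables:
books (primary key: bookID)
characterNames (foreign key: books.bookID)
locations (foreign key: books.bookID)
The in-text positions of the character names and locations are stored in the corresponding tables.
I am writing a Python script using psycopg2 that finds all occurrences of a given character name and a given location in books. I only want the occurrences from books in which both the character name AND the location are found.
I already have a solution for searching one location and one character: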
WITH b AS (
    -- books that contain both the character and the location
    SELECT bookid
    FROM characternames
    WHERE name = 'XXX'
    GROUP BY 1
    INTERSECT
    SELECT bookid
    FROM locations
    WHERE locname = 'YYY'
    GROUP BY 1
)
SELECT bookid, position, 'char' AS what
FROM b
JOIN characternames USING (bookid)
WHERE name = 'XXX'
UNION ALL
SELECT bookid, position, 'loc' AS what
FROM b
JOIN locations USING (bookid)
WHERE locname = 'YYY'
ORDER BY bookid, position;
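The CTE 'b' contains all bookids in which both the character name 'XXX' and the location 'YYY' appear.
Now I am also wondering how to search for 2 locations and one name (or 2 names and one location, respectively). This is simple if all searched entities must occur in the same book, but what about this:
Search for: Tim, Al, Toolshop. Result: books that include
(Tim, Al, Toolshop) or
(Tim, Al) or
(Tim, Toolshop) or
(Al, Toolshop)
The same problem comes up again with 4, 5, 6... conditions.
I thought about INTERSECTing more subqueries, but that would not work, because INTERSECT only keeps books that match every condition.
Instead, I would UNION the bookIDs that were found, group them, and select the bookids that occur more than once: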
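Since the query is run from a Python script, the literals 'XXX' and 'YYY' could be passed as bound parameters instead of being pasted into the SQL text. The following is only a sketch under assumptions: `cur` is taken to be an open psycopg2 cursor, the function name `find_occurrences` is hypothetical, and the `GROUP BY 1` clauses are left out because INTERSECT already returns distinct rows.

```python
# Sketch only: `cur` is assumed to be an open psycopg2 cursor and
# find_occurrences is a hypothetical helper name.
QUERY = """
WITH b AS (
    SELECT bookid FROM characternames WHERE name = %(name)s
    INTERSECT
    SELECT bookid FROM locations WHERE locname = %(loc)s
)
SELECT bookid, position, 'char' AS what
FROM b
JOIN characternames USING (bookid)
WHERE name = %(name)s
UNION ALL
SELECT bookid, position, 'loc' AS what
FROM b
JOIN locations USING (bookid)
WHERE locname = %(loc)s
ORDER BY bookid, position
"""


def find_occurrences(cur, name, loc):
    """Run the query with bound parameters and return all result rows."""
    # The driver substitutes the named placeholders, so no manual quoting.
    cur.execute(QUERY, {"name": name, "loc": loc})
    return cur.fetchall()
```

A call would look like `find_occurrences(cur, "Tim", "Toolshop")`, with `cur` obtained from a psycopg2 connection.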
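WITH b AS (
    SELECT bookid, count(*) AS occurrences
    FROM (
        -- UNION ALL keeps one row per matching condition so they can be counted
        SELECT DISTINCT bookid
        FROM characternames
        WHERE name = 'XXX'
        UNION ALL
        SELECT DISTINCT bookid
        FROM characternames
        WHERE name = 'YYY'
        UNION ALL
        SELECT DISTINCT bookid
        FROM locations
        WHERE locname = 'ZZZ'
    ) AS hits
    GROUP BY bookid
    -- filter on the aggregate: at least two of the conditions matched
    HAVING count(*) > 1
)
SELECT bookid, occurrences
FROM b;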
I think this works, although I cannot test it at the moment. But is this the best way to do it?
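To try the counting idea with 4, 5, 6... conditions without a PostgreSQL server, here is a minimal sketch that builds the query for any number of names and locations. It rests on assumptions: it uses an in-memory sqlite3 database as a stand-in (with psycopg2 the placeholders would be `%s` instead of `?`), the function name `books_matching_at_least` and the sample rows are made up for illustration, and the table layout is reduced to the three columns used above.

```python
import sqlite3


def books_matching_at_least(conn, names, locs, minimum=2):
    """Return {bookid: hits} for books containing at least `minimum`
    of the searched names and locations (hypothetical helper)."""
    # One DISTINCT subquery per condition, so each book counts once per condition.
    parts = ["SELECT DISTINCT bookid FROM characternames WHERE name = ?"] * len(names)
    parts += ["SELECT DISTINCT bookid FROM locations WHERE locname = ?"] * len(locs)
    if not parts:
        return {}
    sql = (
        "SELECT bookid, count(*) AS occurrences FROM ("
        + " UNION ALL ".join(parts)
        + ") AS hits GROUP BY bookid HAVING count(*) >= ? ORDER BY bookid"
    )
    rows = conn.execute(sql, [*names, *locs, minimum]).fetchall()
    return dict(rows)


# Made-up sample data, only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE characternames (bookid INTEGER, name TEXT, position INTEGER);
CREATE TABLE locations (bookid INTEGER, locname TEXT, position INTEGER);
INSERT INTO characternames VALUES
    (1, 'Tim', 5), (1, 'Al', 9), (1, 'Tim', 40),
    (2, 'Tim', 3), (3, 'Al', 7), (4, 'Tim', 1);
INSERT INTO locations VALUES
    (1, 'Toolshop', 12), (2, 'Toolshop', 8),
    (3, 'Garage', 2), (4, 'Garage', 6);
""")

result = books_matching_at_least(conn, ["Tim", "Al"], ["Toolshop"])
print(result)  # {1: 3, 2: 2}
```

Book 1 matches all three entities and book 2 matches two of them, while books 3 and 4 match only one and are dropped. Raising `minimum` to the number of conditions gives the same books as the INTERSECT approach.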