
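My problem domain is adverts, and to that end I have a database containing a table called ADVERT. Adverts can have facets (i.e. quasi-categorical descriptive terms). So there is a FACET table that defines the facets and a FACETTERM table that holds the values for each facet. ADVERTFACETTERMASSIGNMENT is the link table that says which facet term values are assigned to which advert.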

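So you might have an advert for a car that has the value "Honda" for the facet "Make" and "Sussex" for the facet "Location". If the advert is advert {PK = 14}, Honda is facet-term {PK = 1} and Sussex is facet-term {PK = 2}, then you would expect these rows in ADVERTFACETTERMASSIGNMENT { AdvertId, FacetTermId }: 14, 1 and 14, 2.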

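Given this arrangement, how can I find all the other adverts for Hondas in Sussex? In other words, how do I find the set of rows in ADVERTFACETTERMASSIGNMENT that match the rows of a given advert but do not belong to that advert?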

I am using SQL Server 2008. I tried an IN clause, but that returns partial matches: all Hondas outside Sussex, all non-Hondas in Sussex, and so on.

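To restate the requirement: I need to find all rows in ADVERTFACETTERMASSIGNMENT whose advert has at least the same facet-term ids as another, given advert. It does not matter if a matching advert has more facet terms, as long as it has at least every facet term of the selected comparator advert.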


3 Answers


This is basically an EAV (entity, attribute, value) model with a fixed set of value choices.

WITH FLATTENED AS (
    SELECT a.ADVERT_ID, ft.FACETTERM_ID
    FROM ADVERT a
    INNER JOIN ADVERTFACETTERMASSIGNMENT afta
        ON afta.ADVERT_ID = a.ADVERT_ID
    INNER JOIN FACETTERM ft
        ON ft.FACETTERM_ID = afta.FACETTERM_ID
    INNER JOIN FACET f
        ON f.FACET_ID = ft.FACET_ID
)
SELECT rhs.ADVERT_ID, COUNT(*)
FROM FLATTENED lhs
INNER JOIN FLATTENED rhs
    ON lhs.ADVERT_ID = @SOME_ID
    AND rhs.ADVERT_ID <> lhs.ADVERT_ID
    AND rhs.FACETTERM_ID = lhs.FACETTERM_ID
GROUP BY rhs.ADVERT_ID
HAVING COUNT(*) = (SELECT COUNT(*) FROM FLATTENED WHERE ADVERT_ID = @SOME_ID)

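The technique here is that the number of facet terms matched by the inner join between two adverts must equal the number of facet terms on the left-hand (comparator) advert.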
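As a rough check of that counting idea, here is a small Python sketch that runs a query of the same shape against SQLite's in-memory engine rather than SQL Server. The table name `assignment`, the helper `ads_with_all_facets`, and adverts 15 to 17 are made up for illustration; only advert 14 with facet terms 1 and 2 comes from the question. The primary key matters: comparing counts is only valid if an advert cannot carry the same facet term twice.

```python
import sqlite3


def ads_with_all_facets(rows, advert_id):
    """Return ids of other adverts that carry every facet term of advert_id."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in for ADVERTFACETTERMASSIGNMENT. The primary key
    # guarantees each (advert, term) pair appears once, which the count
    # comparison below relies on.
    conn.execute(
        "CREATE TABLE assignment ("
        " advert_id INTEGER, facetterm_id INTEGER,"
        " PRIMARY KEY (advert_id, facetterm_id))"
    )
    conn.executemany("INSERT INTO assignment VALUES (?, ?)", rows)
    query = """
        SELECT rhs.advert_id
        FROM assignment AS lhs
        JOIN assignment AS rhs
          ON rhs.facetterm_id = lhs.facetterm_id
         AND rhs.advert_id <> lhs.advert_id
        WHERE lhs.advert_id = :id
        GROUP BY rhs.advert_id
        HAVING COUNT(*) = (SELECT COUNT(*) FROM assignment WHERE advert_id = :id)
        ORDER BY rhs.advert_id
    """
    result = [row[0] for row in conn.execute(query, {"id": advert_id})]
    conn.close()
    return result


rows = [
    (14, 1), (14, 2),           # comparator: Honda (1) in Sussex (2)
    (15, 1), (15, 2), (15, 3),  # Honda, Sussex, plus one extra term
    (16, 1),                    # Honda only
    (17, 2), (17, 3),           # Sussex plus one extra term, not a Honda
]
print(ads_with_all_facets(rows, 14))  # prints [15]
```

Advert 15 is returned because it matches both of advert 14's terms; adverts 16 and 17 each match only one, so their counts fall short.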

answered 2012-06-06T15:13:21.007

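One way is to inner join the advertfacettermassignment table to itself on facettermid and group by advertid, which counts the number of matching facet terms between each pair of adverts.

You can then compare that with the total number of facet terms on the first advert. If the number of matches equals that total, the candidate has at least every facet term of the selected comparator.

In SQL Server 2008 you can use CTEs to simplify this. Something like this: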

;WITH m AS
    -- matching facet terms for every ordered pair (advertid, candidateid)
    (SELECT advertid, candidateid, COUNT(*) AS matchingfacets FROM (
        SELECT a.advertid, b.advertid AS candidateid FROM advertfacettermassignment a
        INNER JOIN advertfacettermassignment b ON a.facettermid = b.facettermid) sub
    GROUP BY advertid, candidateid)
,t AS
    -- total facet terms per advert
    (SELECT advertid, COUNT(*) AS totalfacets FROM advertfacettermassignment GROUP BY advertid)
SELECT
    m.advertid,
    m.candidateid,
    t.totalfacets,
    m.matchingfacets
FROM m INNER JOIN t
ON m.advertid = t.advertid
WHERE m.matchingfacets = t.totalfacets
  AND m.candidateid <> m.advertid  -- skip each advert's match with itself
answered 2012-06-06T15:13:48.077

OK, where ADVERTFACETTERMASSIGNMENT is {AdvertId, FacetTermId}, for a two-term search:

select fta1.AdvertID 
from ADVERTFACETTERMASSIGNMENT fta1
join ADVERTFACETTERMASSIGNMENT fta2 on fta1.AdvertID = fta2.AdvertID
where fta1.FacetTermId = @searchFacet1 and fta2.FacetTermID = @searchFacet2
and fta1.AdvertID <> @searchAdvertId

A general answer, with a working example:

declare @AdvertFacetTermAssignment table (AdvertId int, FacetTermId int)

insert into @AdvertFacetTermAssignment values
(1,10), (1,11), (2,10), (3,11), (4,10), (4,11), (5,10), (5,11), (5,12), (6,10), (6,12), (6,13), (7, 10), (7, 11), (8, 12), (9,10), (9,11), (9,12), (10, 10), (10,12)

declare @searchAdvertId int = 1
declare @targetMatch int = (select COUNT(*) from @AdvertFacetTermAssignment where AdvertId = @searchAdvertId)

select aft2.AdvertId from @AdvertFacetTermAssignment aft1 
join @AdvertFacetTermAssignment aft2
on aft1.FacetTermId = aft2.FacetTermId and aft1.AdvertId <> aft2.AdvertId
where aft1.AdvertId = @searchAdvertId
group by aft2.AdvertId 
having COUNT(*) = @targetMatch

Result: 4, 5, 7, 9
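As a cross-check of that result, the requirement can also be read as set containment: an advert qualifies when its set of facet terms is a superset of the comparator's. The following Python sketch applies that reading to the same sample rows. The helper name `supersets_of` is hypothetical, and this is an illustration of the logic, not a replacement for the SQL.

```python
from collections import defaultdict

# Same (AdvertId, FacetTermId) rows as the table variable above.
rows = [(1, 10), (1, 11), (2, 10), (3, 11), (4, 10), (4, 11),
        (5, 10), (5, 11), (5, 12), (6, 10), (6, 12), (6, 13),
        (7, 10), (7, 11), (8, 12), (9, 10), (9, 11), (9, 12),
        (10, 10), (10, 12)]


def supersets_of(rows, advert_id):
    """Ids of other adverts whose facet terms include all of advert_id's."""
    facets = defaultdict(set)
    for advert, term in rows:
        facets[advert].add(term)
    wanted = facets.get(advert_id, set())
    if not wanted:
        # Mirror the SQL: an advert with no facet terms matches nothing.
        return []
    return sorted(advert for advert, terms in facets.items()
                  if advert != advert_id and wanted <= terms)


print(supersets_of(rows, 1))  # prints [4, 5, 7, 9]
```

Because sets ignore duplicates, this version does not depend on each (AdvertId, FacetTermId) pair being unique, whereas the COUNT(*) comparison does.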

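This last one is not what was asked for, but it grabs everything that is similar (at least one matching facet term) and orders it by how similar it is.
(All matches are treated equally.)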

 select aft2.AdvertId, COUNT(aft1.AdvertId) as matches, ABS(COUNT(*)-@targetMatch) as nonMatches 
 from @AdvertFacetTermAssignment aft1
 right outer join @AdvertFacetTermAssignment aft2
 on aft2.FacetTermId = aft1.FacetTermId 
    and aft1.AdvertId = @searchAdvertId
    and aft2.AdvertId <> @searchAdvertId
 group by aft2.AdvertId 
 having COUNT(aft1.AdvertId) > 0
 order by COUNT(aft1.AdvertId) DESC, ABS(COUNT(*)-@targetMatch) ASC 

Results:

 AdvertId matches nonMatches
 4        2       0
 7        2       0
 9        2       1
 5        2       1
 10       1       0
 6        1       1
 2        1       1
 3        1       1
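To make the two columns easier to interpret, here is a hedged Python sketch of the same ranking on the same sample rows. The helper `rank_similar` and the extra `differing` column are my own additions, not part of the query above. `size_gap` reproduces `nonMatches`, which is the absolute difference in the number of facet terms; `differing` counts the terms that appear on only one of the two adverts. Ties are broken by advert id here, whereas the SQL leaves tie order unspecified.

```python
from collections import defaultdict

# Same (AdvertId, FacetTermId) rows as the table variable above.
rows = [(1, 10), (1, 11), (2, 10), (3, 11), (4, 10), (4, 11),
        (5, 10), (5, 11), (5, 12), (6, 10), (6, 12), (6, 13),
        (7, 10), (7, 11), (8, 12), (9, 10), (9, 11), (9, 12),
        (10, 10), (10, 12)]


def rank_similar(rows, advert_id):
    """Return (advert, shared, size_gap, differing) tuples, most similar first."""
    facets = defaultdict(set)
    for advert, term in rows:
        facets[advert].add(term)
    wanted = facets[advert_id]
    ranked = []
    for advert, terms in facets.items():
        shared = len(terms & wanted)
        if advert == advert_id or shared == 0:
            continue  # skip the comparator itself and adverts with no overlap
        size_gap = abs(len(terms) - len(wanted))  # same measure as nonMatches
        differing = len(terms ^ wanted)           # terms on exactly one advert
        ranked.append((advert, shared, size_gap, differing))
    # Most shared terms first, then smallest size gap, then advert id.
    ranked.sort(key=lambda r: (-r[1], r[2], r[0]))
    return ranked


for row in rank_similar(rows, 1):
    print(row)
```

Advert 10 illustrates the difference between the two measures: it has two facet terms like advert 1, so its size gap is 0, yet only term 10 is shared, so two terms differ.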

(By the way, I am posting from Sussex, Wisconsin.)

answered 2012-06-06T14:35:18.490