I have the following data in a table, and I want to report on it without having to delete any rows.

ActiveSearchID---SearchDate----------------SearchPhrase
1----------------2010-12-15 12:01:11.587---argos
2----------------2010-12-15 12:03:40.193---muji
3----------------2010-12-15 12:03:42.370---muji
4----------------2010-12-15 12:04:29.167---Office supplies
5----------------2010-12-15 12:05:11.590---lava
9----------------2010-12-15 12:08:38.920---sony vaio
10---------------2010-12-15 12:08:41.170---sony vaio
12---------------2010-12-15 12:09:09.920---sony vaio battery
13---------------2010-12-15 12:09:17.487---sony vaio battery
14---------------2010-12-15 12:17:10.980---sony vaio battery
15---------------2010-12-15 12:17:12.170---argos

The report I'm trying to produce selects the first instance of each search phrase that was searched for within a 5-minute interval. For example, a query on the data above should return the following:
SearchDate----------------SearchPhrase
2010-12-15 12:01:11.587---argos
2010-12-15 12:03:40.193---muji
2010-12-15 12:04:29.167---Office supplies
2010-12-15 12:05:11.590---lava
2010-12-15 12:08:38.920---sony vaio
2010-12-15 12:09:09.920---sony vaio battery
2010-12-15 12:17:12.170---argos


I have tried the following query, but I still get duplicates:

select t1.searchdate, t1.searchphrase
from activesearches t1
inner join activesearches t2
    on t1.searchphrase = t2.searchphrase
    and t1.searchdate < t2.searchdate
where datediff(s, t1.searchdate, t2.searchdate) <= 300
order by searchdate


I'd like to use a "WITH SearchPhrases AS ()" type of query, but I can't get my head around it.

Thanks

1 Answer

I believe that, given your test data, "sony vaio battery" should be returned twice. I came up with two options.

-- Populate test data
if(OBJECT_ID('tempdb..#Search') IS NOT NULL)
    DROP TABLE #Search
create table #Search (
    ActiveSearchID int primary key, 
    SearchDate datetime not null, 
    SearchPhrase nvarchar(30))

insert into #Search(ActiveSearchID, SearchDate, SearchPhrase)
select 1, '2010-12-15 12:01:11.587', 'argos'
union all select 2, '2010-12-15 12:03:40.193', 'muji'
union all select 3, '2010-12-15 12:03:42.370', 'muji'
union all select 4, '2010-12-15 12:04:29.167', 'Office supplies'
union all select 5, '2010-12-15 12:05:11.590', 'lava'
union all select 9, '2010-12-15 12:08:38.920', 'sony vaio'
union all select 10, '2010-12-15 12:08:41.170', 'sony vaio'
union all select 12, '2010-12-15 12:09:09.920', 'sony vaio battery'
union all select 13, '2010-12-15 12:09:17.487', 'sony vaio battery'
union all select 14, '2010-12-15 12:17:10.980', 'sony vaio battery'
union all select 15, '2010-12-15 12:17:12.170', 'argos'

I think you're looking for something like this query. I'm not sure how well it will perform:

select *
from #Search as S
where not exists(
    select *
    from #Search as N
    where N.SearchPhrase = S.SearchPhrase
    and N.SearchDate between
        dateadd(minute, -5, S.SearchDate) and S.SearchDate
    and N.ActiveSearchID <> S.ActiveSearchID)
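
Since the question asks for a "WITH SearchPhrases AS ()" form, the same sliding-window idea could be sketched as a CTE. This is only a sketch: it assumes the #Search temp table populated above, and the column name PreviousSearchDate is introduced here for illustration. It keeps a row when the most recent earlier search for the same phrase is missing or more than five minutes old. Unlike the NOT EXISTS query, it keeps both rows if two searches for the same phrase share an identical timestamp.

```sql
-- Sketch only: assumes #Search exists and is populated as above.
-- PreviousSearchDate is an illustrative name, not from the original post.
-- The leading semicolon terminates any preceding statement in the batch,
-- which T-SQL requires before a WITH clause.
;WITH SearchPhrases AS (
    SELECT
        S.ActiveSearchID,
        S.SearchDate,
        S.SearchPhrase,
        -- Most recent earlier search for the same phrase (NULL if none)
        (SELECT MAX(N.SearchDate)
         FROM #Search AS N
         WHERE N.SearchPhrase = S.SearchPhrase
           AND N.SearchDate < S.SearchDate) AS PreviousSearchDate
    FROM #Search AS S
)
SELECT ActiveSearchID, SearchDate, SearchPhrase
FROM SearchPhrases
WHERE PreviousSearchDate IS NULL
   OR PreviousSearchDate < DATEADD(minute, -5, SearchDate)
ORDER BY SearchDate;
```

On the test data this should return ActiveSearchID 1, 2, 4, 5, 9, 12, 14 and 15, so "sony vaio battery" appears twice (12:09:09.920 and 12:17:10.980).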

Alternatively, if you can use discrete 5-minute intervals on the clock, this may perform better. I haven't tested it with a large amount of data:

select
    ActiveSearchID, SearchDate, SearchPhrase
from
(
    select 
        *,
        ROW_NUMBER() over (
            partition by SearchPhrase,
                         DATEDIFF(minute, '2000-01-01', SearchDate) / 5
            order by SearchDate, ActiveSearchID) as rn,
        -- index of the 5-minute clock window (minutes since 2000-01-01, integer-divided by 5)
        DATEDIFF(minute, '2000-01-01', SearchDate) / 5 as five_minute_window
    from #Search
) as X
where
    rn = 1
order by
    ActiveSearchID
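
One caveat of fixed clock windows, sketched here with two hypothetical rows that are not part of the original data: searches only two seconds apart can fall on either side of a window boundary, in which case the windowed query returns both of them. The alias five_minute_bucket is illustrative.

```sql
-- Hypothetical timestamps (not from the original data) that straddle a
-- 5-minute clock boundary while being only two seconds apart.
SELECT
    D.SearchDate,
    -- Same bucketing expression as in the partition by clause above
    DATEDIFF(minute, '2000-01-01', D.SearchDate) / 5 AS five_minute_bucket
FROM (VALUES
    (CAST('2010-12-15T12:04:59' AS datetime)),
    (CAST('2010-12-15T12:05:01' AS datetime))
) AS D(SearchDate);
```

The two rows land in consecutive buckets, so ROW_NUMBER() would assign rn = 1 to each. If that matters, the NOT EXISTS query, which compares every row with the five minutes preceding it, does not have this boundary effect.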
answered 2011-03-05T05:56:19.267