I have a table that stores Bluetooth detection records. For example:

MACaddress         | DetectorID | PollingIntervalStart     | PollingIntervalEnd
00:00:00:00:00:01  |    3       | 2012-03-26 16:51:09.000  | 2012-03-26 16:51:19.000
00:00:00:00:00:01  |    3       | 2012-03-26 16:51:24.000  | 2012-03-26 16:51:28.000
00:00:00:00:00:01  |    3       | 2012-03-26 16:51:35.000  | 2012-03-26 16:51:49.000
00:00:00:00:00:01  |    3       | 2012-03-26 16:51:55.000  | 2012-03-26 16:52:09.000
00:00:00:00:32:11  |    3       | 2012-03-26 17:00:43.000  | 2012-03-26 17:01:19.000
00:00:00:00:20:F1  |    1       | 2012-03-26 17:02:52.000  | 2012-03-26 16:53:02.000
...

00:00:00:00:00:01  |    3       | 2012-03-26 19:21:19.000  | 2012-03-26 19:21:48.000
00:00:00:00:00:01  |    3       | 2012-03-26 19:21:59.000  | 2012-03-26 19:22:51.000
00:00:00:00:00:01  |    3       | 2012-03-26 19:22:19.000  | 2012-03-26 19:22:31.000
00:00:00:00:20:F1  |    1       | 2012-03-26 20:23:49.000  | 2012-03-26 19:50:30.000

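DetectorID is the ID of the Bluetooth detector that polled the device. As you can see, a device sometimes lingers within a detector's polling radius, so we get a cluster of detections for the same device. What I want to do is group each cluster and take its first detection (meaning min(DetectionTime)), where a cluster is defined as, say, the same device being polled multiple times within three minutes. Note that the length of a detector's polling interval is not constant. For example, for the cluster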

00:00:00:00:00:01  |    3       | 2012-03-26 16:51:09.000  | 2012-03-26 16:51:19.000 -- take this record
00:00:00:00:00:01  |    3       | 2012-03-26 16:51:24.000  | 2012-03-26 16:51:28.000
00:00:00:00:00:01  |    3       | 2012-03-26 16:51:35.000  | 2012-03-26 16:51:49.000
00:00:00:00:00:01  |    3       | 2012-03-26 16:51:55.000  | 2012-03-26 16:52:09.000

I want to get only the first record. After grouping as described above, the table should look like this:

MACaddress         | DetectorID | PollingIntervalStart     | PollingIntervalEnd
00:00:00:00:00:01  |    3       | 2012-03-26 16:51:09.000  | 2012-03-26 16:51:19.000
00:00:00:00:32:11  |    3       | 2012-03-26 17:00:43.000  | 2012-03-26 17:01:19.000
00:00:00:00:20:F1  |    1       | 2012-03-26 17:02:52.000  | 2012-03-26 16:53:02.000
...

00:00:00:00:00:01  |    3       | 2012-03-26 19:21:19.000  | 2012-03-26 19:21:48.000
00:00:00:00:20:F1  |    1       | 2012-03-26 20:23:49.000  | 2012-03-26 19:50:30.000

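I tried GROUP BY, ROW_NUMBER, RANK, and DENSE_RANK, but I can't seem to figure it out. I also tried using a tally table to build time intervals and joining on those intervals, but that didn't work. Any help is appreciated. Thanks.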

Edit

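By "clump" (cluster) I mean that if the same device is detected multiple times within a short period, those detections are treated as one clump. I defined that period as 3 minutes. The length is arbitrary and could be any number of minutes; I just picked 3. So if a MAC address is detected at 3:00:22, 3:00:34, and 3:01:44, all three detections count as one clump. If it is detected at 3:00:22 and 3:07:32, that is not a clump.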

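It has to be the first detection of the clump. If you have code that returns the last detection of a clump, feel free to post that too. Perhaps I can then use ROW_NUMBER with a descending order to get the desired output.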

Edit 2

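I changed Aaron's code so that the cluster length is no longer constant. The code now checks only the separation between clusters, so detections separated by more than 3 minutes are not considered part of the same cluster. This new definition of a cluster makes the code simpler.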

2 Answers

Given this sample data (I've corrected a row where the start time was later than the end time, which didn't seem right):

DECLARE @d TABLE
(
  MACaddress VARCHAR(32), 
  DetectorID INT, 
  PollingIntervalStart DATETIME2(0), 
  PollingIntervalEnd DATETIME2(0)
);

INSERT @d VALUES
('00:00:00:00:00:01',3,'2012-03-26 16:51:09.000','2012-03-26 16:51:19.000'),
('00:00:00:00:00:01',3,'2012-03-26 16:51:24.000','2012-03-26 16:51:28.000'),
('00:00:00:00:00:01',3,'2012-03-26 16:51:35.000','2012-03-26 16:51:49.000'),
('00:00:00:00:00:01',3,'2012-03-26 16:51:55.000','2012-03-26 16:52:09.000'),
('00:00:00:00:32:11',3,'2012-03-26 17:00:43.000','2012-03-26 17:01:19.000'),
('00:00:00:00:20:F1',1,'2012-03-26 17:02:52.000','2012-03-26 16:53:02.000'),
('00:00:00:00:00:01',3,'2012-03-26 19:21:19.000','2012-03-26 19:21:48.000'),
('00:00:00:00:00:01',3,'2012-03-26 19:21:59.000','2012-03-26 19:22:51.000'),
('00:00:00:00:00:01',3,'2012-03-26 19:22:19.000','2012-03-26 19:22:31.000'),
('00:00:00:00:20:F1',1,'2012-03-26 19:49:49.000','2012-03-26 19:50:30.000');

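This idea gets the last row of each cluster. As I said, I think it must be possible to get the first row this way as well, but I have to move on. This would definitely be easier in SQL Server 2012, which added a number of new window functions.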

;WITH x AS 
(
  SELECT *, rn = ROW_NUMBER() OVER 
    (PARTITION BY MacAddress, DetectorID ORDER BY PollingIntervalStart)
  FROM @d
)
SELECT * FROM x 
WHERE NOT EXISTS 
(
  -- x2 is the next detection for the same device and detector
  SELECT 1 FROM x AS x2 
  WHERE x2.MACaddress = x.MacAddress
  AND x2.DetectorID = x.DetectorID
  AND x2.rn = x.rn + 1
  AND x2.PollingIntervalStart <= DATEADD(MINUTE, 3, x.PollingIntervalStart)
)
ORDER BY x.PollingIntervalStart;

Results:

MACaddress         DetectorID  PollingIntervalStart  PollingIntervalEnd   rn
-----------------  ----------  --------------------  -------------------  --
00:00:00:00:00:01  3           2012-03-26 16:51:55   2012-03-26 16:52:09  4
00:00:00:00:32:11  3           2012-03-26 17:00:43   2012-03-26 17:01:19  1
00:00:00:00:20:F1  1           2012-03-26 17:02:52   2012-03-26 16:53:02  1
00:00:00:00:00:01  3           2012-03-26 19:22:19   2012-03-26 19:22:31  7
00:00:00:00:20:F1  1           2012-03-26 19:49:49   2012-03-26 19:50:30  2
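
The remark about SQL Server 2012 can be made concrete. The following is only a sketch, assuming SQL Server 2012 or later (for LAG) and the separation-based cluster definition from the question's second edit: a row starts a new cluster when there is no earlier detection for the same device and detector, or when the previous detection started more than 3 minutes before it. The column alias PrevStart is introduced here for illustration and is not part of the original code.

```sql
-- Sample data copied from the answer above, so the batch runs on its own.
DECLARE @d TABLE
(
  MACaddress VARCHAR(32),
  DetectorID INT,
  PollingIntervalStart DATETIME2(0),
  PollingIntervalEnd DATETIME2(0)
);

INSERT @d VALUES
('00:00:00:00:00:01',3,'2012-03-26 16:51:09.000','2012-03-26 16:51:19.000'),
('00:00:00:00:00:01',3,'2012-03-26 16:51:24.000','2012-03-26 16:51:28.000'),
('00:00:00:00:00:01',3,'2012-03-26 16:51:35.000','2012-03-26 16:51:49.000'),
('00:00:00:00:00:01',3,'2012-03-26 16:51:55.000','2012-03-26 16:52:09.000'),
('00:00:00:00:32:11',3,'2012-03-26 17:00:43.000','2012-03-26 17:01:19.000'),
('00:00:00:00:20:F1',1,'2012-03-26 17:02:52.000','2012-03-26 16:53:02.000'),
('00:00:00:00:00:01',3,'2012-03-26 19:21:19.000','2012-03-26 19:21:48.000'),
('00:00:00:00:00:01',3,'2012-03-26 19:21:59.000','2012-03-26 19:22:51.000'),
('00:00:00:00:00:01',3,'2012-03-26 19:22:19.000','2012-03-26 19:22:31.000'),
('00:00:00:00:20:F1',1,'2012-03-26 19:49:49.000','2012-03-26 19:50:30.000');

;WITH x AS
(
  -- PrevStart (hypothetical alias): start time of the previous detection
  -- for the same device and detector, or NULL if there is none.
  SELECT *,
    PrevStart = LAG(PollingIntervalStart) OVER
      (PARTITION BY MACaddress, DetectorID ORDER BY PollingIntervalStart)
  FROM @d
)
SELECT MACaddress, DetectorID, PollingIntervalStart, PollingIntervalEnd
FROM x
WHERE PrevStart IS NULL                                      -- very first detection
   OR PollingIntervalStart > DATEADD(MINUTE, 3, PrevStart)   -- gap of more than 3 minutes
ORDER BY PollingIntervalStart;
```

Against the sample data this should keep the five rows the question asks for (the detections starting at 16:51:09, 17:00:43, 17:02:52, 19:21:19 and 19:49:49), without a self-join or a cursor. Whether it performs better on a large table is something to test.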

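Another idea gets the result you want, but uses a cursor. Personally, I think cursors are perfectly acceptable in some cases (see also the discussions about running totals before SQL Server 2012, and keep in mind the caveat that you should use the right cursor options), but others refuse to even look at them. Whether this is practical depends on the size of your data; you should test.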

DECLARE @newTable TABLE
(
  MACaddress VARCHAR(32), 
  DetectorID INT, 
  PollingIntervalStart DATETIME2(0), 
  PollingIntervalEnd DATETIME2(0)
);

DECLARE @PreviousTime DATETIME2(0) = NULL, @ma VARCHAR(32), @de INT, 
  @st DATETIME2(0), @et DATETIME2(0), @rn INT;

DECLARE c CURSOR LOCAL FAST_FORWARD FOR 
  SELECT *, rn = ROW_NUMBER() OVER 
    (PARTITION BY MacAddress, DetectorID ORDER BY PollingIntervalStart)
    FROM @d ORDER BY MacAddress, DetectorID, rn;

OPEN c;

FETCH c INTO @ma, @de, @st, @et, @rn;

WHILE @@FETCH_STATUS = 0
BEGIN
  IF @rn = 1 OR (@rn > 1 AND DATEDIFF(MINUTE, @PreviousTime, @st) > 3)
  BEGIN
    INSERT @newTable SELECT @ma, @de, @st, @et;
  END

  SELECT @PreviousTime = @st;

  FETCH c INTO @ma, @de, @st, @et, @rn;
END

SELECT * FROM @newTable ORDER BY PollingIntervalStart;

CLOSE c; DEALLOCATE c;

Results:

MACaddress         DetectorID  PollingIntervalStart  PollingIntervalEnd
-----------------  ----------  --------------------  -------------------
00:00:00:00:00:01  3           2012-03-26 16:51:09   2012-03-26 16:51:19
00:00:00:00:32:11  3           2012-03-26 17:00:43   2012-03-26 17:01:19
00:00:00:00:20:F1  1           2012-03-26 17:02:52   2012-03-26 16:53:02
00:00:00:00:00:01  3           2012-03-26 19:21:19   2012-03-26 19:21:48
00:00:00:00:20:F1  1           2012-03-26 19:49:49   2012-03-26 19:50:30
answered 2013-11-13T17:16:18.890

I found the answer by slightly modifying Aaron Bertrand's answer.

Table setup:

DECLARE @d TABLE
(
  MACaddress VARCHAR(32), 
  DetectorID INT, 
  PollingIntervalStart DATETIME2(0), 
  PollingIntervalEnd DATETIME2(0)
);

INSERT @d VALUES
('00:00:00:00:00:01',3,'2012-03-26 16:51:09.000','2012-03-26 16:51:19.000'),
('00:00:00:00:00:01',3,'2012-03-26 16:51:24.000','2012-03-26 16:51:28.000'),
('00:00:00:00:00:01',3,'2012-03-26 16:51:35.000','2012-03-26 16:51:49.000'),
('00:00:00:00:00:01',3,'2012-03-26 16:51:55.000','2012-03-26 16:52:09.000'),
('00:00:00:00:32:11',3,'2012-03-26 17:00:43.000','2012-03-26 17:01:19.000'),
('00:00:00:00:20:F1',1,'2012-03-26 17:02:52.000','2012-03-26 16:53:02.000'),
('00:00:00:00:00:01',3,'2012-03-26 19:21:19.000','2012-03-26 19:21:48.000'),
('00:00:00:00:00:01',3,'2012-03-26 19:21:59.000','2012-03-26 19:22:51.000'),
('00:00:00:00:00:01',3,'2012-03-26 19:22:19.000','2012-03-26 19:22:31.000'),
('00:00:00:00:20:F1',1,'2012-03-26 19:49:49.000','2012-03-26 19:50:30.000');

I made two changes to Aaron's code. I ordered the subquery in descending order. In the WHERE NOT EXISTS condition, I replaced the DATEADD check with DATEDIFF(MINUTE, x2.PollingIntervalStart, x.PollingIntervalStart) < 3.

;WITH x AS 
(
    SELECT 
    *, 
    ROW_NUMBER() OVER 
        (PARTITION BY MacAddress, DetectorID ORDER BY PollingIntervalStart DESC) AS RN
    FROM @d
)
select * from x
WHERE NOT EXISTS 
(
  SELECT 1 FROM x AS x2 
  WHERE x2.MACaddress = x.MacAddress
  AND x2.DetectorID = x.DetectorID
  AND x2.rn = x.rn + 1
  -- x2.PollingIntervalStart is never later than x.PollingIntervalStart because of the x2.rn = x.rn + 1 condition;
  -- this works because the CTE numbers rows in descending order
  AND DATEDIFF(MINUTE, x2.PollingIntervalStart, x.PollingIntervalStart) < 3 
)
ORDER BY x.PollingIntervalStart;
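
One thing to keep in mind with the DATEDIFF(MINUTE, ...) check: DATEDIFF counts the minute boundaries crossed between the two values, not the full minutes elapsed, so results near the 3-minute threshold can differ from a DATEADD-based comparison. The short sketch below shows the difference; the two timestamps and the variable names @a and @b are made up for illustration and do not come from the sample data.

```sql
-- Hypothetical timestamps, chosen to straddle several minute boundaries.
DECLARE @a DATETIME2(0) = '2012-03-26 16:51:59',
        @b DATETIME2(0) = '2012-03-26 16:54:00';

SELECT
  MinuteBoundaries = DATEDIFF(MINUTE, @a, @b),  -- 3: boundaries at 16:52, 16:53 and 16:54
  ElapsedSeconds   = DATEDIFF(SECOND, @a, @b);  -- 121: just over 2 minutes actually elapsed
```

With these values, DATEDIFF(MINUTE, @a, @b) < 3 is false even though only 2 minutes and 1 second separate the detections. If exact elapsed time matters, comparing seconds (for example DATEDIFF(SECOND, ...) < 180) or comparing against DATEADD(MINUTE, 3, ...) is one way to avoid that.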

Thanks, Aaron.

answered 2013-11-13T19:16:22.633