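I have a SQL table containing the GPS coordinates of devices, updated every n minutes (the devices are installed in vehicles). Given the nature of GPS, many of the entries are very similar, but as far as the server is concerned they are completely distinct. I can easily match things approximately (to within ~3.6' or ~36') with CAST(lat as decimal(7,4)).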
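As a small illustration of the rounding approach, the sketch below casts a few literal latitudes to four decimal places. The literals and the alias `v` are made up for the example and stand in for the real lat column. The first two values should land in the same bucket, while the third, only 0.00001 away from the first, should fall into a neighboring one, so rounding alone can still split readings that are practically identical.

```sql
-- Hypothetical literal values standing in for the lat column; no table is needed.
SELECT v.lat,
       CAST(v.lat AS decimal(7,4)) AS lat_4dp   -- four decimal places, roughly 36 ft
FROM (VALUES (31.12345), (31.12346), (31.12344)) AS v(lat);
```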
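I would like to be able to take the result set and collapse the near-duplicate entries while still preserving the time-based order. Here is an example: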
Row Lat Lng vel Hdg Time
01 31.12345 -88.12345 00 00 12-4-21 01:45:00
02 31.12346 -88.12345 00 00 12-4-21 01:46:00
03 31.12455 -88.12410 10 01 12-4-21 01:47:00
04 31.12495 -88.12480 17 01 12-4-21 01:48:00
05 31.12532 -88.12560 22 01 12-4-21 01:49:00
06 31.12567 -88.12608 25 02 12-4-21 01:50:00
07 31.12638 -88.12672 24 02 12-4-21 01:51:00
08 31.12689 -88.12722 19 02 12-4-21 01:52:00
09 31.12345 -88.12345 00 00 12-4-21 01:53:00
10 31.12346 -88.12346 00 00 12-4-21 01:54:00
11 31.12347 -88.12345 00 00 12-4-21 01:55:00
12 31.12346 -88.12346 00 00 12-4-21 01:56:00
13 31.12689 -88.12788 10 40 12-4-21 01:57:00
14 31.12604 -88.12691 13 39 12-4-21 01:58:00
15 31.12572 -88.12603 15 39 12-4-21 01:59:00
The end result I want is rows 1 and 2 collapsed into a single row, and rows 9 through 12 collapsed into a single row, each containing AVG(Lat), AVG(Lng), and MIN(Time).
Given the data above, this is the result set I would like to receive:
Row Lat Lng vel Hdg Time
01 31.123455 -88.12345 00 00 12-4-21 01:45:00
02 31.12455 -88.12410 10 01 12-4-21 01:47:00
03 31.12495 -88.12480 17 01 12-4-21 01:48:00
04 31.12532 -88.12560 22 01 12-4-21 01:49:00
05 31.12567 -88.12608 25 02 12-4-21 01:50:00
06 31.12638 -88.12672 24 02 12-4-21 01:51:00
07 31.12689 -88.12722 19 02 12-4-21 01:52:00
08 31.12346 -88.123455 00 00 12-4-21 01:53:00
09 31.12689 -88.12788 10 40 12-4-21 01:57:00
10 31.12604 -88.12691 13 39 12-4-21 01:58:00
11 31.12572 -88.12603 15 39 12-4-21 01:59:00
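The boundary between groups would be movement: a velocity > 0, or a change in the GPS coordinates of more than some amount x. In this case, x is 0.0001. As noted below, the problem is that multiple stops at a given coordinate (at different times) get lumped into a single stop. If I visit coordinate x today at 4 PM, tomorrow at 8 AM, and then again tomorrow at 6 PM, the only one I see is tomorrow at 6 PM (in the case of MAX(Time)) or today at 4 PM (in the case of MIN(Time)).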
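If the velocity is 0, the heading is 0 as well. It is important, however, that rows 1 and 2, and rows 9 through 12, are each treated as one stop when their coordinates are similar enough to count as the same (i.e., equal when rounded to 4 decimal places).
I have a query that does this: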
SELECT Geography::Point(AVG(dbo.GPSEntries.Latitude),
                        AVG(dbo.GPSEntries.Longitude),
                        4326) AS Location,
       dbo.GPSEntries.Velocity,
       dbo.GPSEntries.Heading,
       MAX(dbo.GPSEntries.Time) AS maxTime,
       MIN(dbo.GPSEntries.Time) AS minTime,
       AVG(dbo.RFDatas.RSSI) AS avgRSSI,
       COUNT(1) AS samples
FROM dbo.GPSEntries
INNER JOIN dbo.Reports
    ON dbo.GPSEntries.Report_Id = dbo.Reports.Id
INNER JOIN dbo.RFDatas
    ON dbo.GPSEntries.Report_Id = dbo.RFDatas.Report_Id
GROUP BY CAST(Latitude AS decimal(7,4)),
         CAST(Longitude AS decimal(7,4)),
         Velocity,
         Heading
ORDER BY MAX(Time)
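In other words, if I travel from point A to point B and stop for 30 minutes (30 reports at 1 report per minute), then go to point C and stop for 20 minutes, then return to point B and stop for another 20 minutes before heading to point D, I want to be able to see two separate stops at point B.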
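The effect can be reproduced on a toy data set. The sketch below uses a hypothetical table variable `@gps` with four made-up rows: two reports at a point B, one at a point C, and a later report back at B. Grouping on the rounded coordinates has no notion of adjacency in time, so B should come back as a single row with three samples spanning 08:00 to 10:00 rather than as two separate stops.

```sql
-- Hypothetical table variable and made-up rows, only to reproduce the effect.
DECLARE @gps TABLE (Latitude decimal(9,6), Longitude decimal(9,6), [Time] datetime2(0));

INSERT INTO @gps (Latitude, Longitude, [Time]) VALUES
    (31.12345, -88.12345, '2012-04-21T08:00:00'),  -- first stop at B
    (31.12346, -88.12345, '2012-04-21T08:01:00'),  -- still at B
    (31.20000, -88.20000, '2012-04-21T09:00:00'),  -- stop at C
    (31.12345, -88.12345, '2012-04-21T10:00:00');  -- second stop at B

-- Same grouping idea as the query above, reduced to the coordinates.
SELECT CAST(Latitude  AS decimal(7,4)) AS Lat4,
       CAST(Longitude AS decimal(7,4)) AS Lng4,
       MIN([Time]) AS minTime,
       MAX([Time]) AS maxTime,
       COUNT(*)    AS samples
FROM @gps
GROUP BY CAST(Latitude  AS decimal(7,4)),
         CAST(Longitude AS decimal(7,4))
ORDER BY MAX([Time]);
```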
Here is some actual data from my database, sanitized to protect the innocent or to blame someone in northeast Alabama.
Latitude Longitude Vel Hdg MAX(Time) MIN(Time) sig RowCount
34.747420 -86.302580 68 157 2012-06-13 01:31:37.000 2012-06-13 01:31:37.000 -91 1
34.759140 -86.307620 61 134 2012-06-13 01:33:06.000 2012-06-13 01:33:06.000 -91 2
34.763237 -86.307264 0 0 2012-06-13 01:34:36.000 2012-06-12 01:27:21.000 -97 7
34.763288 -86.307280 0 0 2012-06-13 14:30:44.000 2012-06-12 01:30:21.000 -98 527
34.760220 -86.308200 38 110 2012-06-13 14:33:44.000 2012-06-13 14:33:44.000 -98 1
34.750350 -86.305750 5 90 2012-06-13 14:35:13.000 2012-06-13 14:35:13.000 -83 2
34.737160 -86.298040 70 88 2012-06-13 14:36:43.000 2012-06-13 14:36:43.000 -80 1
34.736420 -86.277270 120 33 2012-06-13 14:38:13.000 2012-06-13 14:38:13.000 -87 2
34.747090 -86.248370 120 37 2012-06-13 14:39:43.000 2012-06-13 14:39:43.000 -93 2
34.755620 -86.240640 70 179 2012-06-13 14:41:13.000 2012-06-13 14:41:13.000 -81 1
34.771240 -86.242760 70 0 2012-06-13 14:42:42.000 2012-06-13 14:42:42.000 -88 2
34.785510 -86.245710 70 6 2012-06-13 14:44:12.000 2012-06-13 14:44:12.000 -99 2
34.800220 -86.239400 70 1 2012-06-13 14:45:42.000 2012-06-13 14:45:42.000 -86 1
34.815070 -86.232180 70 16 2012-06-13 14:47:12.000 2012-06-13 14:47:12.000 -98 2
34.824540 -86.226198 0 0 2012-06-13 14:51:41.000 2012-06-13 00:13:48.000 -101 9
34.824579 -86.226171 0 0 2012-06-14 00:26:19.000 2012-06-12 00:46:57.000 -99 168
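You will notice that row 4 and the last row have 527 and 168 entries respectively, and that they span 2 days. These entries come from a single device, and from places where the device stopped at the same spot for hours on several separate occasions.
Here is some zipped CSV data: sample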
What I ended up doing
I made a few minor modifications to the query provided by Aaron Bertrand, as shown here:
WITH d AS
(
    -- De-duplicate the raw entries and number them in time order.
    SELECT Time
          ,Latitude
          ,Longitude
          ,Velocity
          ,Heading
          ,TimeRN = ROW_NUMBER() OVER (ORDER BY [Time])
    FROM dbo.GPSEntries
    GROUP BY Time, Latitude, Longitude, Velocity, Heading
),
y AS (
    SELECT BeginTime = MIN(Time)
          ,EndTime = MAX(Time)
          ,Latitude = AVG(Latitude)
          ,Longitude = AVG(Longitude)
       -- ,[RowCount] = COUNT(*)
          ,GroupNumber
    FROM (
        SELECT Time
              ,Latitude
              ,Longitude
              -- GroupNumber is the TimeRN of the first row at or after this one
              -- that differs in position (beyond .0007) or in velocity, so
              -- consecutive similar rows end up with the same value.
              ,GroupNumber = (
                    SELECT MIN(d2.TimeRN)
                    FROM d AS d2
                    WHERE d2.TimeRN >= d.TimeRN AND
                          NOT EXISTS (
                              SELECT 1
                              FROM d AS d3 -- Between 250 and 337 feet
                              WHERE ABS(d2.Latitude - d.Latitude) <= .0007 AND
                                    ABS(d2.Longitude - d.Longitude) <= .0007 AND
                                    d2.Velocity = d.Velocity ) )
        FROM d ) AS x
    GROUP BY GroupNumber
)
SELECT y.Latitude
      ,y.Longitude
      ,d.Velocity
      ,d.Heading
      ,y.BeginTime
   -- ,y.EndTime
   -- ,y.[RowCount]
   -- ,Duration = CONVERT(time(0),DATEADD(SS,DATEDIFF(SS,y.BeginTime, y.EndTime), '0:00:00'), 108)
FROM y INNER JOIN d ON y.BeginTime = d.[Time]
-- FOR STOPS (5 minute):
-- WHERE DATEDIFF(MI, Y.BeginTime, y.EndTime) + 1 > 5
ORDER BY y.BeginTime;
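For comparison, the same "consecutive run" grouping can also be sketched with window functions. This is not the query above but an alternative sketch, assuming SQL Server 2012 or later (for LAG and a running SUM). The table variable `@gps` and the names `flagged`, `grouped`, and `IsNewGroup` are hypothetical. The rows are the 15 example rows from the question, with the date "12-4-21" assumed to mean 2012-04-21. A row starts a new group when it is moving, when the previous row was moving, or when it is more than 0.0001 away from the previous row.

```sql
-- Hypothetical table variable loaded with the 15 example rows (date assumed 2012-04-21).
DECLARE @gps TABLE (Latitude decimal(9,6), Longitude decimal(9,6),
                    Velocity int, Heading int, [Time] datetime2(0));

INSERT INTO @gps (Latitude, Longitude, Velocity, Heading, [Time]) VALUES
    (31.12345, -88.12345,  0,  0, '2012-04-21T01:45:00'),
    (31.12346, -88.12345,  0,  0, '2012-04-21T01:46:00'),
    (31.12455, -88.12410, 10,  1, '2012-04-21T01:47:00'),
    (31.12495, -88.12480, 17,  1, '2012-04-21T01:48:00'),
    (31.12532, -88.12560, 22,  1, '2012-04-21T01:49:00'),
    (31.12567, -88.12608, 25,  2, '2012-04-21T01:50:00'),
    (31.12638, -88.12672, 24,  2, '2012-04-21T01:51:00'),
    (31.12689, -88.12722, 19,  2, '2012-04-21T01:52:00'),
    (31.12345, -88.12345,  0,  0, '2012-04-21T01:53:00'),
    (31.12346, -88.12346,  0,  0, '2012-04-21T01:54:00'),
    (31.12347, -88.12345,  0,  0, '2012-04-21T01:55:00'),
    (31.12346, -88.12346,  0,  0, '2012-04-21T01:56:00'),
    (31.12689, -88.12788, 10, 40, '2012-04-21T01:57:00'),
    (31.12604, -88.12691, 13, 39, '2012-04-21T01:58:00'),
    (31.12572, -88.12603, 15, 39, '2012-04-21T01:59:00');

WITH flagged AS (
    -- Mark every row that begins a new group, comparing it to the previous row in time.
    SELECT *,
           IsNewGroup = CASE
               WHEN LAG([Time]) OVER (ORDER BY [Time]) IS NULL THEN 1       -- very first row
               WHEN Velocity > 0 THEN 1                                      -- moving rows stand alone
               WHEN LAG(Velocity) OVER (ORDER BY [Time]) > 0 THEN 1          -- first row of a stop
               WHEN ABS(Latitude  - LAG(Latitude)  OVER (ORDER BY [Time])) > 0.0001 THEN 1
               WHEN ABS(Longitude - LAG(Longitude) OVER (ORDER BY [Time])) > 0.0001 THEN 1
               ELSE 0
           END
    FROM @gps
),
grouped AS (
    -- A running total of the flags gives each consecutive run its own number.
    SELECT *,
           GroupNumber = SUM(IsNewGroup) OVER (ORDER BY [Time] ROWS UNBOUNDED PRECEDING)
    FROM flagged
)
SELECT Latitude  = AVG(Latitude),
       Longitude = AVG(Longitude),
       Velocity  = MIN(Velocity),
       Heading   = MIN(Heading),
       BeginTime = MIN([Time]),
       EndTime   = MAX([Time]),
       Samples   = COUNT(*)
FROM grouped
GROUP BY GroupNumber
ORDER BY BeginTime;
```

On the example rows this should collapse rows 1 and 2 into one row and rows 9 through 12 into another, leaving 11 rows, with the two stops at the same coordinates kept apart. Because each row is compared only to the row before it, a slow drift could keep extending a group. With several devices in one table, each OVER clause would also need a PARTITION BY on whatever column identifies the device.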