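I have a data frame containing data about drivers and the routes they followed. I want to work out the total distance travelled. I am using the geosphere package, but I can't figure out the correct way to apply it and get an answer in miles.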

> head(df1)
  id       routeDateTime driverId      lat       lon
1  1 2012-11-12 02:08:41      123 76.57169 -110.8070
2  2 2012-11-12 02:09:41      123 76.44325 -110.7525
3  3 2012-11-12 02:10:41      123 76.90897 -110.8613
4  4 2012-11-12 03:18:41      123 76.11152 -110.2037
5  5 2012-11-12 03:19:41      123 76.29013 -110.3838
6  6 2012-11-12 03:20:41      123 76.15544 -110.4506

So far I have tried

spDists(cbind(df1$lon,df1$lat))

and a few other functions, but I can't seem to get a reasonable answer.

Any suggestions?

> dput(df1)
structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 
29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40), routeDateTime = c("2012-11-12 02:08:41", 
"2012-11-12 02:09:41", "2012-11-12 02:10:41", "2012-11-12 03:18:41", 
"2012-11-12 03:19:41", "2012-11-12 03:20:41", "2012-11-12 03:21:41", 
"2012-11-12 12:08:41", "2012-11-12 12:09:41", "2012-11-12 12:10:41", 
"2012-11-12 02:08:41", "2012-11-12 02:09:41", "2012-11-12 02:10:41", 
"2012-11-12 03:18:41", "2012-11-12 03:19:41", "2012-11-12 03:20:41", 
"2012-11-12 03:21:41", "2012-11-12 12:08:41", "2012-11-12 12:09:41", 
"2012-11-12 12:10:41", "2012-11-12 02:08:41", "2012-11-12 02:09:41", 
"2012-11-12 02:10:41", "2012-11-12 03:18:41", "2012-11-12 03:19:41", 
"2012-11-12 03:20:41", "2012-11-12 03:21:41", "2012-11-12 12:08:41", 
"2012-11-12 12:09:41", "2012-11-12 12:10:41", "2012-11-12 02:08:41", 
"2012-11-12 02:09:41", "2012-11-12 02:10:41", "2012-11-12 03:18:41", 
"2012-11-12 03:19:41", "2012-11-12 03:20:41", "2012-11-12 03:21:41", 
"2012-11-12 12:08:41", "2012-11-12 12:09:41", "2012-11-12 12:10:41"
), driverId = c(123, 123, 123, 123, 123, 123, 123, 123, 123, 
123, 456, 456, 456, 456, 456, 456, 456, 456, 456, 456, 789, 789, 
789, 789, 789, 789, 789, 789, 789, 789, 246, 246, 246, 246, 246, 
246, 246, 246, 246, 246), lat = c(76.5716897079255, 76.4432530414779, 
76.9089707506355, 76.1115217276383, 76.2901271982118, 76.155437662499, 
76.4115052509587, 76.8397977722343, 76.3357809444424, 76.032417796785, 
76.5716897079255, 76.4432530414779, 76.9089707506355, 76.1115217276383, 
76.2901271982118, 76.155437662499, 76.4115052509587, 76.8397977722343, 
76.3357809444424, 76.032417796785, 76.5716897079255, 76.4432530414779, 
76.9089707506355, 76.1115217276383, 76.2901271982118, 76.155437662499, 
76.4115052509587, 76.8397977722343, 76.3357809444424, 76.032417796785, 
76.5716897079255, 76.4432530414779, 76.9089707506355, 76.1115217276383, 
76.2901271982118, 76.155437662499, 76.4115052509587, 76.8397977722343, 
76.3357809444424, 76.032417796785), lon = c(-110.80701574916, 
-110.75247172825, -110.861284852726, -110.203674311982, -110.383751512505, 
-110.450569844106, -110.22185564111, -110.556956546381, -110.24483308522, 
-110.217355202651, -110.80701574916, -110.75247172825, -110.861284852726, 
-110.203674311982, -110.383751512505, -110.450569844106, -110.22185564111, 
-110.556956546381, -110.24483308522, -110.217355202651, -110.80701574916, 
-110.75247172825, -110.861284852726, -110.203674311982, -110.383751512505, 
-110.450569844106, -110.22185564111, -110.556956546381, -110.24483308522, 
-110.217355202651, -110.80701574916, -110.75247172825, -110.861284852726, 
-110.203674311982, -110.383751512505, -110.450569844106, -110.22185564111, 
-110.556956546381, -110.24483308522, -110.217355202651)), .Names = c("id", 
"routeDateTime", "driverId", "lat", "lon"), row.names = c(NA, 
-40L), class = "data.frame")
3 Answers

How about this?

## Setup
library(geosphere)
metersPerMile <- 1609.34
pts <- df1[c("lon", "lat")]

## Pass in two derived data.frames that are lagged by one point
## (p1 drops the last point, p2 drops the first, so row i pairs point i with point i + 1)
segDists <- distVincentyEllipsoid(p1 = pts[-nrow(pts), ],
                                  p2 = pts[-1, ])
sum(segDists)/metersPerMile
# [1] 1013.919

(To use one of the faster distance-calculating algorithms, just substitute distCosine, distVincentySphere, or distHaversine for distVincentyEllipsoid in the call above.)
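Note that the call above, as written, sums over every consecutive pair of rows, so it also counts the jump from one driver's last point to the next driver's first point. If per-driver totals are what is wanted, here is a minimal sketch. To stay dependency-free it uses a hand-written haversine formula (a spherical approximation, not the ellipsoidal Vincenty method), with an assumed Earth radius of 6378137 m. The names `haversine_m`, `route_miles`, `miles_by_driver` and the `toy` data are made up for illustration, and rows are assumed to already be in travel order within each driver.

```r
# Great-circle (haversine) distance in metres between paired lon/lat points.
# Assumption: spherical Earth with radius r = 6378137 m.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) slightly above 1
  2 * r * asin(pmin(1, sqrt(a)))
}

# Total route length in miles for one ordered sequence of points.
route_miles <- function(lon, lat, meters_per_mile = 1609.344) {
  n <- length(lon)
  if (n < 2) return(0)
  # Pair point i with point i + 1, then sum the segment lengths
  seg <- haversine_m(lon[-n], lat[-n], lon[-1], lat[-1])
  sum(seg) / meters_per_mile
}

# One total per driver; distances never cross from one driver to another.
miles_by_driver <- function(d) {
  sapply(split(d, d$driverId), function(g) route_miles(g$lon, g$lat))
}

# Tiny made-up example (not the question's data)
toy <- data.frame(
  driverId = c(1, 1, 1, 2, 2),
  lon      = c(0, 0, 0, 10, 10),
  lat      = c(0, 1, 2, 50, 51)
)
res <- miles_by_driver(toy)
res
```

With the question's data this would be called as `miles_by_driver(df1)`. The haversine totals should be close to, but not identical to, the ellipsoidal figures.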

answered 2012-12-21T19:46:47.200

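Be very careful with missing data, because distVincentyEllipsoid() returns 0 for the distance between any two points that both have missing coordinates, i.e. c(NA, NA) and c(NA, NA).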
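One way to guard against this, as a sketch only: drop rows with missing coordinates before computing any distances, and report how many were dropped. Keep in mind that dropping a point makes its two neighbours adjacent, so the resulting segment skips straight over the gap; whether that is acceptable depends on your data. The `toy` data frame is made up for illustration.

```r
# Made-up track with two incomplete fixes
toy <- data.frame(
  lon = c(-110.81, NA, -110.86, -110.20),
  lat = c(76.57, 76.44, NA, 76.11)
)

# Keep only rows where both coordinates are present
ok  <- complete.cases(toy[c("lon", "lat")])
pts <- toy[ok, c("lon", "lat")]

# Report how many points were discarded so the gap is not silent
n_dropped <- sum(!ok)
message(n_dropped, " of ", nrow(toy), " points dropped for missing coordinates")
```

The cleaned `pts` can then be passed to whichever distance function you are using.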

answered 2013-03-29T03:40:09.180

library(geodist)
geodist(df1, sequential = TRUE, measure = "geodesic") # sequence of distance increments
sum(geodist(df1, sequential = TRUE, measure = "geodesic")) # total distance in metres
sum(geodist(df1, sequential = TRUE, measure = "geodesic")) * 0.00062137 # total distance in miles

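Geodesic distances are needed here because of the long distances involved. The result is 1013.915, which differs slightly from the less accurate Vincenty distances from geosphere. Street-network distances can also be calculated with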

library(dodgr)
dodgr_dists(from = df1)

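...but that requires a street network, and there is none at (lat = 76, lon = -110). If there were one, this would by default give you all pairwise distances routed through the street network, of which the sequential increments would be the first off-diagonal.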
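To illustrate that last point with base R only: given any square matrix of pairwise distances whose rows and columns follow route order, the sequential increments are the entries just above the main diagonal. The sketch below uses planar `dist()` on made-up points as a stand-in for a routed distance matrix; it does not call dodgr at all.

```r
# Made-up points on a plane, in travel order
xy <- cbind(x = c(0, 3, 3, 7), y = c(0, 4, 10, 13))

# Full pairwise distance matrix (stand-in for a routed distance matrix)
d <- as.matrix(dist(xy))

# Entries d[i, i + 1]: distance from each point to the next one
n <- nrow(d)
increments <- d[cbind(seq_len(n - 1), seq_len(n - 1) + 1)]

increments       # 5 6 5
sum(increments)  # 16
```

Indexing with a two-column matrix picks one cell per row of the index, which avoids looping over `i`.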

answered 2018-12-14T09:52:05.873