I have the following dataframe (df) and would like to interpolate Lat, Lon coordinates at an equidistant interval (e.g. every 250 m) or time interval (e.g. every 2 min).

> head(df)
   ID Latitude Longitude  trip   date.time
1  1 10.30447 -109.2323    1 2005-01-07 11:25:26
2  1 10.30425 -109.2321    1 2005-01-07 11:25:36
3  1 10.30314 -109.2326    1 2005-01-07 11:25:46
4  1 10.30199 -109.2328    1 2005-01-07 11:25:56
5  1 10.30079 -109.2334    1 2005-01-07 11:26:06
6  1 10.30006 -109.2331    1 2005-01-07 11:26:16

I tried to do this using R package zoo and the following code I found in a similar question posted:

full.time    <- with(df,seq(date.time[1],tail(date.time,1),by=1))
library(zoo)
df.zoo <- zoo(df[,3:4],df$date.time)        # convert to zoo object
result <- na.approx(df.zoo,xout=full.time)  # interpolate; result is also a zoo object
head(result)

However, as my dataframe includes multiple trips (df$trip) of multiple individuals (df$ID), the timestamps are not unique and I get the following warning:

> df.zoo <- zoo(df[,3:4],df$date.time)        # convert to zoo object
Warning message:
In zoo(df[, 3:4], df$datetime) :
some methods for “zoo” objects do not work if the index entries in ‘order.by’ are not unique

How can I run the above code (in a loop?) so that each trip of each individual is interpolated separately?

1 Answer

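Your sample is not representative: you ask for interpolation at 2 min increments, but the dataset spans less than 2 min. So in this example I use 30 sec increments. Also, you provide only one ID/trip combination, so there is no way to verify that this does what you want. Nevertheless, it should.

There are many ways to do this; I find data.table the most convenient, and certainly the fastest.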

df$date.time <- as.POSIXct(df$date.time)  # make sure date.time is POSIXct
library(data.table)
interp.time <- function(var,dt) approx(dt,var,xout=seq(min(dt),max(dt),by="30 sec"))$y
result <- setDT(df)[,lapply(.SD,interp.time,dt=date.time), 
                     by=list(ID,trip), 
                     .SDcols=c("Latitude","Longitude","date.time")]
result[,date.time:=as.POSIXct(date.time, origin="1970-01-01")]
result
#    ID trip Latitude Longitude           date.time
# 1:  1    1 10.30447 -109.2323 2005-01-07 11:25:26
# 2:  1    1 10.30199 -109.2328 2005-01-07 11:25:56
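
For readers who want to see what happens inside each ID/trip group, here is a minimal base R sketch of the same idea, not a replacement for the data.table call above. The names df.demo, interp.group and result.base are made up for this illustration, the sample rows are copied from the question, and the UTC time zone is an assumption.

```r
# Sample rows copied from the question (one ID, one trip); tz = "UTC" is assumed.
df.demo <- data.frame(
  ID        = 1,
  trip      = 1,
  Latitude  = c(10.30447, 10.30425, 10.30314, 10.30199, 10.30079, 10.30006),
  Longitude = c(-109.2323, -109.2321, -109.2326, -109.2328, -109.2334, -109.2331),
  date.time = as.POSIXct("2005-01-07 11:25:26", tz = "UTC") + seq(0, 50, by = 10)
)

# Hypothetical helper: interpolate one ID/trip group onto a regular time grid.
# approx() needs at least two fixes, so single-row groups must be filtered out first.
interp.group <- function(g, step = 30) {
  t.in  <- as.numeric(g$date.time)               # seconds since the epoch
  t.out <- seq(min(t.in), max(t.in), by = step)  # regular grid, in seconds
  data.frame(
    ID        = g$ID[1],
    trip      = g$trip[1],
    Latitude  = approx(t.in, g$Latitude,  xout = t.out)$y,
    Longitude = approx(t.in, g$Longitude, xout = t.out)$y,
    date.time = as.POSIXct(t.out, origin = "1970-01-01", tz = "UTC")
  )
}

# One data frame per ID/trip combination, interpolated separately, then stacked.
groups      <- split(df.demo, list(df.demo$ID, df.demo$trip), drop = TRUE)
result.base <- do.call(rbind, lapply(groups, interp.group))
rownames(result.base) <- NULL
```

Because every group gets its own time grid, duplicate timestamps across trips or individuals no longer matter, which is what caused the zoo warning in the question.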

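Doing this for distance is a bit more complicated, since we can't, of course, use Euclidean distance on lon/lat data. The solution below uses distHaversine(...) from the geosphere package to calculate the cumulative Haversine distance along each trip, and then interpolates on that. Here we use 50 m instead of 250 m.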

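library(geosphere)    # for distHaversine
# distance (in metres) between each point and the previous one
get.dist <- function(lon, lat) distHaversine(tail(cbind(lon,lat),-1),head(cbind(lon,lat),-1))
# cumulative distance along each trip, starting at 0 (df is already a data.table after setDT above)
df[,dist:=c(0,cumsum(get.dist(Longitude,Latitude))),by=list(ID,trip)]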

interp.dist <- function(var,dist) approx(dist,var,xout=seq(min(dist),max(dist),by=50))$y
result <- setDT(df)[,lapply(.SD,interp.dist,dist=dist), 
                    by=list(ID,trip), 
                    .SDcols=c("Latitude","Longitude","dist")]

# plot the result
plot(Latitude~Longitude,df, pch=20, asp=1)
lines(Latitude~Longitude,df, col="blue")
points(Latitude~Longitude,result, col="red")
lines(Latitude~Longitude,result, col="red")

Note that you must set the aspect ratio of the plot to 1:1, otherwise the distances will be distorted.
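
To make the cumulative-distance step less of a black box, here is a rough base R sketch of the haversine formula and the 50 m interpolation for a single trip. The function name haversine and the variables below are made up for this illustration; the radius of 6378137 m is, as far as I know, the default that distHaversine uses, so treat that as an assumption.

```r
# Great-circle distance in metres between point pairs (haversine formula).
# r = 6378137 is assumed to match the default radius of geosphere::distHaversine.
haversine <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to.rad <- pi / 180
  dlat <- (lat2 - lat1) * to.rad
  dlon <- (lon2 - lon1) * to.rad
  a <- sin(dlat / 2)^2 + cos(lat1 * to.rad) * cos(lat2 * to.rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))   # pmin guards against rounding slightly above 1
}

# Coordinates of the single trip shown in the question
lat <- c(10.30447, 10.30425, 10.30314, 10.30199, 10.30079, 10.30006)
lon <- c(-109.2323, -109.2321, -109.2326, -109.2328, -109.2334, -109.2331)

# Length of each step, then the running total starting at 0
step.m <- haversine(head(lon, -1), head(lat, -1), tail(lon, -1), tail(lat, -1))
cum.m  <- c(0, cumsum(step.m))

# Interpolated positions every 50 m along the track
d.out   <- seq(0, max(cum.m), by = 50)
lat.out <- approx(cum.m, lat, xout = d.out)$y
lon.out <- approx(cum.m, lon, xout = d.out)$y
```

The interpolation is linear in lon/lat between consecutive fixes, so the 50 m spacing is measured along the recorded track, not along a smoothed path.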

Answered 2015-09-20T20:10:24.547