I am trying to use the sunriset function from maptools inside dplyr to calculate sunrise times for a set of lon/lat/timestamp coordinates. Here is a reproducible example.

library(maptools)
library(dplyr)

pts <- tbl_df(data.frame(
  lon=c(12.08752,12.08748,12.08754,12.08760,12.08746,12.08748),
  lat=c(52.11760,52.11760,52.11747,52.11755,52.11778,52.11753),
  timestamp=as.POSIXct(
    c("2011-08-12 02:00:56 UTC","2011-08-12 02:20:22 UTC",
      "2011-08-12 02:40:15 UTC","2011-08-12 03:00:29 UTC",
      "2011-08-12 03:20:26 UTC","2011-08-12 03:40:30 UTC"))
))

pts %>% mutate(sunrise=sunriset(as.matrix(lon,lat),
                                timestamp,POSIXct.out=T,
                                direction='sunrise')$time)

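When I run this code, I get the error

"Error: invalid subscript type 'closure'"

I guess this means I am not passing the variables to sunriset correctly.

This approach does work if I do it without dplyr: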

pts$sunrise<-sunriset(as.matrix(select(pts,lon,lat)), 
                    pts$timestamp, POSIXct.out=T, 
                    direction='sunrise')$time

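However, I have a lot of rows (about 65 million), and the approach above is slow even on a small fraction of them. I was hoping dplyr would be faster. If anyone has other suggestions for the fastest way to do this, I would love to hear them.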

1 Answer

sunr <- function(lon, lat, ts, dir='sunrise') {
  # can also do matrix(c(pts$lon, pts$lat), ncol=2) (column-major fill) vs 
  # as.matrix(data.frame…
  sunriset(as.matrix(data.frame(lon, lat)), ts, POSIXct.out=TRUE, direction=dir)$time
}

pts %>% mutate(sunrise = sunr(lon, lat, timestamp))

is one way to handle it (with the side effect of cleaner mutate pipelines), but I'm not sure why you think it will be faster. The bottleneck is most likely the creation of the matrix for the call to sunriset, and that happens either way.
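As a side note on the matrix construction itself: as.matrix(lon, lat), as written in the question, converts only its first argument, so the result has a single column rather than a lon/lat pair. The sketch below uses base R only and a few coordinates from the question's data to show the difference; the variable names m_wrong, m_cbind, m_matrix and m_df are illustrative, not taken from maptools or dplyr.

```r
# First three coordinates from the question's example data
lon <- c(12.08752, 12.08748, 12.08754)
lat <- c(52.11760, 52.11760, 52.11747)

# as.matrix() is as.matrix(x, ...): `lat` falls into `...` and is ignored,
# so this is a one-column matrix holding only the longitudes
m_wrong <- as.matrix(lon, lat)
dim(m_wrong)   # 3 1

# cbind() builds the two-column (lon, lat) matrix directly
m_cbind <- cbind(lon, lat)
dim(m_cbind)   # 3 2

# matrix() fills column by column by default, so c(lon, lat) with ncol = 2
# also puts lon in column 1 and lat in column 2
m_matrix <- matrix(c(lon, lat), ncol = 2)

# The data.frame route used in the wrapper gives the same values
m_df <- as.matrix(data.frame(lon, lat))
```

Any of the three two-column forms holds the same values; whether one is measurably cheaper than the others at 65 million rows would need to be benchmarked.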

The maptools source is pretty easy to go through and has a non-exported function maptools:::.sunrisetUTC() that does:

".sunrisetUTC" <- function(jd, lon, lat, direction=c("sunrise", "sunset")) {
## Value: Numeric, UTC time of sunrise or sunset, in minutes from zero
## Z.
## --------------------------------------------------------------------
## Arguments: jd=julian day (real);
## lon=lat=longitude and latitude, respectively, of the observer in
## degrees;
## sunrise=logical indicating whether sunrise or sunset UTC should be
## returned.

You could try passing the julian day, lon, lat and direction to it directly, rather than going through the exported functions, to avoid the data copying. However, if performance is critical, I'd use Rcpp to write an inline, vectorized C/C++ function based on this.
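A minimal sketch of the first step, getting a Julian day out of the timestamps, using base R only. The helper to_julian_day is hypothetical (it is not part of maptools), and it is an assumption that the internal function expects an astronomical Julian day in this form; check the maptools source before relying on it. Per the source comments above, the internal function returns minutes from zero UTC, so the result would still need converting back to a timestamp.

```r
# Hypothetical helper, not part of maptools: convert POSIXct timestamps to
# astronomical Julian days. The Unix epoch (1970-01-01 00:00:00 UTC)
# corresponds to Julian day 2440587.5, and a day has 86400 seconds.
to_julian_day <- function(ts) {
  as.numeric(ts) / 86400 + 2440587.5
}

# Two timestamps from the question, parsed explicitly as UTC
ts <- as.POSIXct(c("2011-08-12 02:00:56", "2011-08-12 02:20:22"), tz = "UTC")
jd <- to_julian_day(ts)

# Untested assumption: the internal function could then be called on plain
# vectors, skipping the matrix construction, along the lines of
#   maptools:::.sunrisetUTC(jd, lon, lat, direction = "sunrise")
```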

answered 2015-09-27T16:06:27.287