
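I am building an R script that needs to query a database many times (once for each combination of the elements of three vectors), but I am having a hard time figuring out how to use ldply to achieve this.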

library(plyr)  # provides ldply

tags <- c("tag1","tag2","tag3")
times <- c("2012-08-01 13:00:00","2012-08-07 21:00:00")
timesteps <- c("2m", "10m","60m", "90m")


query <- function(tag, time, timestep) {

  sql <- paste("select tag, time, timestep, value from mydb where tag = '",tag,"' and time = '",time,"' and timestep = '",timestep,"'", sep="")

  # pretend the line below is actually querying a database and returning a DF with one row
  data.frame(tag = tag, time = time, timestep = timestep, value = rnorm(1))

}
# function works correctly!  
query(time = times[1], tag = tags[1], timestep = timesteps[1])

# causes an error! (Error in FUN(X[[1L]], ...) : unused argument(s) (X[[1]]))
ldply(times, query, time = times, tag = tags, timestep = timesteps)

I thought I could nest ldply three times, once per vector, but I can't even get past the first level!

Any ideas on what I can do?


2 Answers


I think this becomes much simpler if you use mdply (or, equivalently, mapply):

library(plyr)  # provides mdply

tags <- c("tag1","tag2","tag3")
times <- c("2012-08-01 13:00:00","2012-08-07 21:00:00")
timesteps <- c("2m", "10m","60m", "90m")


query <- function(tags, times, timesteps) {

  sql <- paste("select tag, time, timestep, value from mydb where 
            tag = '",tags,"' and time = '",times,"' and timestep = '",timesteps,"'", sep="")
  # pretend the line below is actually querying a database and returning a DF with one row
  data.frame(tag = tags, time = times, timestep = timesteps, value = rnorm(1))

}

dat <- expand.grid(tags, times, timesteps)
colnames(dat) <- c('tags','times','timesteps')

mdply(dat,query)

Note the slight change in variable names, so that the column names of the data match the argument names of the function.
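The mapply route mentioned above is not spelled out, so here is a minimal sketch of how it might look in base R, without plyr. It reuses the placeholder query (no real database is touched). Naming the columns directly in expand.grid and setting stringsAsFactors = FALSE are my own choices rather than part of the answer.

```r
# Sketch only: a base-R counterpart of mdply(dat, query).
tags <- c("tag1", "tag2", "tag3")
times <- c("2012-08-01 13:00:00", "2012-08-07 21:00:00")
timesteps <- c("2m", "10m", "60m", "90m")

# Stand-in for the real database call, as in the answer above.
query <- function(tags, times, timesteps) {
  data.frame(tag = tags, time = times, timestep = timesteps, value = rnorm(1))
}

# One row per combination; strings kept as character (an assumption, not required).
dat <- expand.grid(tags = tags, times = times, timesteps = timesteps,
                   stringsAsFactors = FALSE)

# mapply calls query() once per row, walking the three columns in parallel.
# SIMPLIFY = FALSE keeps each result as a data frame inside a list.
pieces <- mapply(query, dat$tags, dat$times, dat$timesteps,
                 SIMPLIFY = FALSE, USE.NAMES = FALSE)

# Stack the one-row results into a single data frame.
result <- do.call(rbind, pieces)
```

Unlike mdply, this version does not prepend the input columns to the output, so result contains only what query returns.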

answered 2012-08-08T15:00:24.550

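This will do the job, but it only uses apply. First, I create an object holding the combinations of interest; then I rewrite query to take one row of that object instead of three separate inputs.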

tags <- c("tag1","tag2","tag3")
times <- c("2012-08-01 13:00:00","2012-08-07 21:00:00")
timesteps <- c("2m", "10m","60m", "90m")

# Use expand.grid to create an object with all the combinations
dat <- expand.grid(tags, times, timesteps)

# Rewrite query to take in a row of dat
query <- function(row) {
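    # extract the pieces of interest; [[ ]] returns the bare value whether
    # row is a one-row data frame or a named character vector
    tag <- row[[1]]
    time <- row[[2]]
    timestep <- row[[3]]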

    sql <- paste("select tag, time, timestep, value from mydb where tag = '",tag,"' and time = '",time,"' and timestep = '",timestep,"'", sep="")

    # pretend the line below is actually querying a database and returning a DF with one row
    data.frame(tag = tag, time = time, timestep = timestep, value = rnorm(1))

}

# function works correctly on a single row  
query(dat[1,])

# apply the function to each row
j <- apply(dat, 1, query)
# bind all the output together
do.call(rbind, j)
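
One thing to keep in mind with the approach above: apply() converts the data frame to a matrix first, so every value reaches query as a character string. That is harmless here because all three vectors are strings, but it could matter with numeric or date columns. A possible variation, sketched below, loops over row indices instead. The column names tag, time and timestep are my own choice, and query is still the placeholder, not a real database call.

```r
# Sketch only: row-wise iteration that keeps each column's original type.
tags <- c("tag1", "tag2", "tag3")
times <- c("2012-08-01 13:00:00", "2012-08-07 21:00:00")
timesteps <- c("2m", "10m", "60m", "90m")

# Named columns (assumed names) so rows can be accessed by name.
dat <- expand.grid(tag = tags, time = times, timestep = timesteps,
                   stringsAsFactors = FALSE)

# Placeholder query: takes a one-row data frame and returns a one-row data frame.
query <- function(row) {
  data.frame(tag = row[["tag"]], time = row[["time"]],
             timestep = row[["timestep"]], value = rnorm(1))
}

# Call query() once per row index; dat[i, ] is a one-row data frame.
rows <- lapply(seq_len(nrow(dat)), function(i) query(dat[i, ]))

# Bind all the output together.
result <- do.call(rbind, rows)
```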
answered 2012-08-08T14:48:50.493