r - How do I write a loop in R?

Question

How can I write a loop so that all eight tables are computed one after another?

Code:

dt_M1_I <- M1_I
dt_M1_I <- data.table(dt_M1_I)
dt_M1_I[,I:=as.numeric(gsub(",",".",I))]
dt_M1_I[,day:=substr(t,1,10)]
dt_M1_I[,hour:=substr(t,12,16)]
dt_M1_I_median <- dt_M1_I[,list(median_I=median(I,na.rm = TRUE)),by=.(day,hour)]

This should be computed for:

M1_I
M2_I
M3_I
M4_I
M1_U
M2_U
M3_U
M4_U

Thank you very much for your help!

score 5 · Accepted Answer

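Whenever you have multiple variables of the same type, and especially when you find yourself numbering them as you did here, step back and replace them with a single list variable. I do not recommend doing what the other answer suggests.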

That is, instead of M1_I…M4_I and M1_U…M4_U, have two variables, m_i and m_u (lower case is conventional in variable names), each of which is a list of four data.tables.
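As a minimal sketch of that restructuring, the two lists could be built from the existing variables as shown here. The `toy` helper and its data are hypothetical stand-ins for the real tables, and the element names `M1`…`M4` are an illustrative choice, not something the question prescribes.

```r
# Hypothetical stand-in for one of the original tables: a timestamp column `t`
# and a measurement column stored as text with decimal commas.
toy <- function(col) {
  df <- data.frame(t = c("2020-01-01 10:00:05", "2020-01-01 10:00:35"),
                   stringsAsFactors = FALSE)
  df[[col]] <- c("1,5", "2,5")
  df
}

M1_I <- toy("I"); M2_I <- toy("I"); M3_I <- toy("I"); M4_I <- toy("I")
M1_U <- toy("U"); M2_U <- toy("U"); M3_U <- toy("U"); M4_U <- toy("U")

# Option 1: list the variables explicitly and give each element a short name.
m_i <- list(M1 = M1_I, M2 = M2_I, M3 = M3_I, M4 = M4_I)

# Option 2: look the variables up by name; mget() returns a list named
# after the variables it found.
m_u <- mget(paste0("M", 1:4, "_U"))

names(m_i)
names(m_u)
```

If the tables come from files or a database, it is usually simpler to load them straight into a list and never create the numbered variables at all.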

Alternatively, you might want to use a single variable, m, which contains a nested list of data.tables (m = list(list(i = …, u = …), …)).
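For illustration, a nested list of this shape might look as follows. This is only a sketch: the `toy` helper, its data, and the outer names `M1` and `M2` are assumptions made for the example.

```r
# Hypothetical stand-in for one table (timestamp plus one measurement column).
toy <- function(col) {
  df <- data.frame(t = c("2020-01-01 10:00:05", "2020-01-01 10:00:35"),
                   stringsAsFactors = FALSE)
  df[[col]] <- c("1,5", "2,5")
  df
}

# One outer element per numbered group, each holding its `i` and `u` tables.
m <- list(
  M1 = list(i = toy("I"), u = toy("U")),
  M2 = list(i = toy("I"), u = toy("U"))
)

# Individual tables are reached by name.
first_i <- m$M1$i
second_u <- m[["M2"]][["u"]]

# Nested iteration: the outer lapply() walks the groups,
# the inner vapply() walks the tables inside each group.
row_counts <- lapply(m, function(entry) vapply(entry, nrow, integer(1)))
```

The nested form keeps the `i` and `u` tables of one group together, which helps when they are later processed as a pair.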

Assuming the first option (two lists, m_i and m_u), you can iterate over them as follows:

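library(data.table)

# Compute the median of I per day and hour for a single table.
give_this_a_meaningful_name = function (df) {
    dt <- data.table(df)                        # work on a data.table copy
    dt[, I := as.numeric(gsub(",", ".", I))]    # decimal comma -> decimal point
    dt[, day := substr(t, 1, 10)]               # characters 1-10 of the timestamp
    dt[, hour := substr(t, 12, 16)]             # characters 12-16 of the timestamp
    dt[, list(median_I = median(I, na.rm = TRUE)), by = .(day, hour)]
}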

m_i_median = lapply(m_i, give_this_a_meaningful_name)

(Also note the consistent spacing introduced around operators; good readability is essential for writing bug-free code.)
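The function above hard-codes the column `I`, while the question also lists four `*_U` tables. Assuming those tables store their values in a column named `U` (the question does not say), one possible generalisation passes the column name as an argument. The name `median_by_day_hour`, the `value_col` parameter, and the toy data are all hypothetical.

```r
library(data.table)

# Sketch: median of `value_col` per day and hour for one table.
median_by_day_hour <- function(df, value_col) {
  dt <- data.table(df)  # data.table() copies, so the input is left untouched
  # Replace decimal commas with decimal points, then convert to numeric.
  set(dt, j = value_col,
      value = as.numeric(gsub(",", ".", dt[[value_col]], fixed = TRUE)))
  dt[, day := substr(t, 1, 10)]
  dt[, hour := substr(t, 12, 16)]
  out <- dt[, lapply(.SD, median, na.rm = TRUE),
            by = .(day, hour), .SDcols = value_col]
  # Rename the result column to median_I / median_U, as in the original code.
  setnames(out, value_col, paste0("median_", value_col))
  out
}

# Hypothetical toy data standing in for the real lists m_i and m_u.
stamps <- c("2020-01-01 10:00:05", "2020-01-01 10:00:35", "2020-01-01 10:01:05")
m_i <- list(M1 = data.frame(t = stamps, I = c("1,0", "3,0", "5,5"),
                            stringsAsFactors = FALSE))
m_u <- list(M1 = data.frame(t = stamps, U = c("230,0", "232,0", "231,5"),
                            stringsAsFactors = FALSE))

# Extra arguments to lapply() are passed on to the function.
m_i_median <- lapply(m_i, median_by_day_hour, value_col = "I")
m_u_median <- lapply(m_u, median_by_day_hour, value_col = "U")
```

Because `lapply` keeps the names of its input, each result can be looked up under the same name as its source table, for example `m_i_median$M1`.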

score 0 · Accepted Answer

You can use a combination of a for loop and the get/assign functions like this:

library(data.table)

# create a vector of the data.frame names
dts <- c('M1_I', 'M2_I', 'M3_I', 'M4_I', 'M1_U', 'M2_U', 'M3_U', 'M4_U')

# iterate over each dataframe
for (dt in dts){

  # get the actual dataframe (not the string name of it)
  tmp <- get(dt)
  tmp <- data.table(tmp)
  tmp[, I:=as.numeric(gsub(",",".",I))]
  tmp[, day:=substr(t,1,10)]
  tmp[, hour:=substr(t,12,16)]
  tmp <- tmp[,list(median_I=median(I,na.rm = TRUE)),by=.(day,hour)]

  # assign the modified dataframe to the name you want (the paste adds the 'dt_' to the front)
  assign(paste0('dt_', dt), tmp)

}
