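R newbie(ish). I have written some code in R that uses a for() loop. I would like to rewrite it in vectorised form, but it isn't working.
Simplified example to illustrate: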
library(dplyr)
x <- data.frame(name = c("John", "John", "John", "John", "John", "John", "John", "John", "Fred", "Fred"),
                year = c(1, NA, 2, 3, NA, NA, 4, NA, 1, NA))
## if year is blank and name is same as name from previous row
## take year from previous row
## else
## stick with the year you already have
# 1. Run as a loop
x$year_2 <- NA
x$year_2[1] <- x$year[1]
for (row_idx in 2:nrow(x)) {
  # scalar condition, so use && rather than the element-wise &
  if (is.na(x$year[row_idx]) && x$name[row_idx] == x$name[row_idx - 1]) {
    x$year_2[row_idx] <- x$year_2[row_idx - 1]
  } else {
    x$year_2[row_idx] <- x$year[row_idx]
  }
}
# 2. Attempt to vectorise
x <- data.frame(name = c("John", "John", "John", "John", "John", "John", "John", "John", "Fred", "Fred"),
                year = c(1, NA, 2, 3, NA, NA, 4, NA, 1, NA))
x$year_2 <- ifelse(is.na(x$year) & x$name == lead(x$name),
                   lead(x$year_2),
                   x$year)
I think the vectorised version goes wrong because it is circular (i.e. x$year_2 appears on both sides of the <-). Is there a way around this?
Thank you.
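For reference, one possible way to avoid the circularity, sketched in base R. It assumes the goal is simply to carry the last non-missing year forward within each name, which is what the loop appears to do. The helper name fill_forward is made up for this illustration and is not part of dplyr or base R. (As a side note, lead() looks at the next row; lag() is the one that looks at the previous row.)

```r
x <- data.frame(name = c("John", "John", "John", "John", "John", "John", "John", "John", "Fred", "Fred"),
                year = c(1, NA, 2, 3, NA, NA, 4, NA, 1, NA))

# Hypothetical helper: carry the last non-NA value forward within one vector.
fill_forward <- function(v) {
  # idx[i] counts the non-NA values seen up to position i,
  # i.e. the position (among non-NA values) of the most recent one.
  idx <- cumsum(!is.na(v))
  # Prepend NA so that idx == 0 (nothing seen yet) maps to NA.
  c(NA, v[!is.na(v)])[idx + 1]
}

# ave() applies the helper separately within each name and
# returns the results in the original row order.
x$year_2 <- ave(x$year, x$name, FUN = fill_forward)
```

Because the fill depends only on x$year and never on x$year_2 itself, nothing appears on both sides of the assignment. If tidyr is available, grouping with dplyr::group_by(name) and then calling tidyr::fill() on a copy of year should be a comparable alternative, though that is not tested here.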