
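I need to add up the values of columns that share the same name across four different data frames in R. The problem is that the four data frames have different numbers of columns, and only one of them contains all the columns. The others contain a subset of the first data frame's column names. All four data frames have the same number of rows.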

A minimal reproducible example:

Suppose there are 4 data frames with the following structure:

df1 <- setNames(data.frame(matrix(ncol = 10, nrow = 900)), c("Red", "Blue", "Yellow", "Green", "Orange", "Pink", "Brown", "Black", "Grey", "Purple"))
df2 <- setNames(data.frame(matrix(ncol = 9, nrow = 900)), c("Red", "Blue", "Yellow", "Orange", "Pink", "Brown", "Black", "Grey", "Purple"))
df3 <- setNames(data.frame(matrix(ncol = 8, nrow = 900)), c("Red", "Blue", "Yellow", "Orange", "Pink", "Brown", "Black", "Purple"))
df4 <- setNames(data.frame(matrix(ncol = 6, nrow = 900)), c("Red", "Yellow", "Green", "Orange", "Brown", "Purple"))

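Assume every column in the four data frames holds integer values across all 900 rows. How can I return a data frame that is essentially the sum of the same-named columns across the four data frames? In other words, df.sum[1:10] <- df1[1:10] + df2[1:9] + df3[1:8] + df4[1:6], but with the matching columns identified when adding.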


1 Answer


If there are no NA elements, we can use + after making the dimensions of all the data frames the same:

lst <- mget(paste0("df", 1:4)) # get the datasets in a list
nm1 <- Reduce(union, lapply(lst, names)) # find all the column names
# assign missing columns in each of the dataset with value 0
# get the `+` of all list elements with Reduce
dfout <- Reduce(`+`, lapply(lst, function(x) {
        x[setdiff(nm1, names(x))] <- 0
        x[nm1]}))
dim(dfout)
#[1] 900  10
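
The approach above relies on there being no NA values, since NA + x is NA in R. If some cells may be NA, one possible variant is sketched below. It assumes that an NA should count as 0 in the sum, which may or may not suit your data. The toy data frames a and b and the names lst_na, all_names, padded, and sum_na are hypothetical and used only for illustration.

```r
# Toy data (hypothetical): same number of rows, different columns, one NA
a <- data.frame(Red = c(1, NA, 3), Blue = c(10, 20, 30))
b <- data.frame(Red = c(100, 200, 300))
lst_na <- list(a, b)

# All column names that occur in any data frame
all_names <- Reduce(union, lapply(lst_na, names))

padded <- lapply(lst_na, function(x) {
  absent <- setdiff(all_names, names(x))
  if (length(absent) > 0) x[absent] <- 0  # add the missing columns as 0
  x <- x[all_names]                       # put columns in a common order
  x[is.na(x)] <- 0                        # assumption: treat NA as 0
  x
})

# Element-wise sum of the padded data frames
sum_na <- Reduce(`+`, padded)
```

If an NA should instead propagate to the result, drop the `x[is.na(x)] <- 0` line and the code reduces to the same idea as the answer above.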

Data

set.seed(24)
df1 <- setNames(data.frame(matrix(rnorm(900 * 10), ncol = 10, nrow = 900)),
    c("Red", "Blue", "Yellow", "Green", "Orange", "Pink",
      "Brown", "Black", "Grey", "Purple"))
df2 <- setNames(data.frame(matrix(rnorm(900 * 9), ncol = 9, nrow = 900)),
    c("Red", "Blue", "Yellow", "Orange", "Pink", "Brown",
      "Black", "Grey", "Purple"))
df3 <- setNames(data.frame(matrix(rnorm(900 * 8), ncol = 8, nrow = 900)),
    c("Red", "Blue", "Yellow", "Orange", "Pink", "Brown", "Black", "Purple"))
df4 <- setNames(data.frame(matrix(rnorm(900 * 6), ncol = 6, nrow = 900)),
    c("Red", "Yellow", "Green", "Orange", "Brown", "Purple"))
Answered 2018-08-16T22:06:34.160