3

I have three different data frames, as follows:

V1.x<-c(1,2,3,4,5)
V2.x<-c(2,2,7,3,1)
V3.x<-c(2,4,3,2,9)
D1<-data.frame(ID=c("A","B","C","D","E"),V1.x=V1.x,V2.x=V2.x,V3.x=V3.x)

V1.y<-c(2,3,3,3,5)
V2.y<-c(1,2,3,3,5)
V3.y<-c(6,4,3,2,2)
D2<-data.frame(ID=c("A","B","C","D","E"),V1.y=V1.y,V2.y=V2.y,V3.y=V3.y)

V1<-c(3,2,4,4,5)
V2<-c(3,7,3,4,5)
V3<-c(5,4,3,6,3)
D3<-data.frame(ID=c("A","B","C","D","E"),V1=V1,V2=V2,V3=V3)

I want to add up all the V1 columns, all the V2 columns, and all the V3 columns:

V1_Add<-D1$V1.x+D2$V1.y+D3$V1
V2_Add<-D1$V2.x+D2$V2.y+D3$V2
V3_Add<-D1$V3.x+D2$V3.y+D3$V3

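This works fine for getting the individual column sums, but in my real data the columns run from V1 to V80, so it would be great not to have to type out each column individually. In addition, I would prefer to end up with a single data frame containing all the final sums, like this: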

  ID V1 V2 V3
1  A  6  6 13
2  B  7 11 12
3  C 10 13  9
4  D 11 10 10
5  E 15 11 14

4 Answers

2

Is this what you want?

D.Add <- data.frame(D1[,1],(D1[,-1]+D2[,-1]+D3[,-1]))
colnames(D.Add)<-colnames(D3)
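
The addition above is positional, so it silently relies on every data frame listing the same IDs in the same row order and having the same shape. As a minimal sketch (the toy data below is a shortened version of the question's, and the checks are an addition, not part of the original answer), one could fail early when that assumption does not hold:

```r
# Shortened toy data modelled on the question (three rows, two value columns)
D1 <- data.frame(ID = c("A", "B", "C"), V1.x = c(1, 2, 3), V2.x = c(2, 2, 7))
D2 <- data.frame(ID = c("A", "B", "C"), V1.y = c(2, 3, 3), V2.y = c(1, 2, 3))
D3 <- data.frame(ID = c("A", "B", "C"), V1 = c(3, 2, 4), V2 = c(3, 7, 3))

# Positional addition only makes sense if IDs and dimensions agree,
# so stop with an error before adding if they do not.
stopifnot(
  identical(as.character(D1$ID), as.character(D2$ID)),
  identical(as.character(D1$ID), as.character(D3$ID)),
  identical(dim(D1), dim(D2)),
  identical(dim(D1), dim(D3))
)

# Same approach as above: add everything except the ID column
D.Add <- data.frame(ID = D1$ID, D1[, -1] + D2[, -1] + D3[, -1])
colnames(D.Add) <- colnames(D3)
```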
answered 2012-06-08T00:26:04.147
2
library(reshape2)
library(plyr)

# First let's standardize column names after ID so they become V1 through Vx. 
# I turned it into a function to make this easy to do for multiple data.frames
standardize_col_names <- function(df) {
  # First column remains ID, then I name the remaining V1 through Vn-1
  # (since first column is taken up by the ID)
  names(df) <- c("ID", paste("V", 1:(dim(df)[2]-1), sep=""))
  return(df)
}

D1 <- standardize_col_names(D1)
D2 <- standardize_col_names(D2)
D3 <- standardize_col_names(D3)

# Next, we melt the data and bind them into the same data.frame
# See one example with melt(D1, id.vars=1). I just used rbind to combine those
melted_data <- rbind(melt(D1, id.vars=1), melt(D2, id.vars=1), melt(D3, id.vars=1))
# note that the above step can be folded into the function as well. 
# Then you throw all the data.frames into a list and ldply through this function.

# Finally, we cast the data into what you need which is the sum of the columns
dcast(melted_data, ID~variable, sum)
  ID V1 V2 V3
1  A  6  6 13
2  B  7 11 12
3  C 10 13  9
4  D 11 10 10
5  E 15 11 14



# Everything above, combined more efficiently:

standardize_df <- function(df) {
  names(df) <- c("ID", paste("V", 1:(dim(df)[2]-1), sep=""))
  return(melt(df, id.vars = 1))
}

all_my_data <- list(D1, D2, D3)
melted_data <- ldply(all_my_data, standardize_df)
summarized_data <- dcast(melted_data, ID~variable, sum)
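
If installing reshape2 and plyr is not an option, the same stack-then-sum idea can probably be expressed in base R. This is a sketch of a swapped-in technique (rbind plus aggregate), not what the answer above uses; the helper name standardize is made up for illustration. Because it groups by ID, it should not depend on the rows being in the same order in every data frame.

```r
# Data from the question
D1 <- data.frame(ID = c("A", "B", "C", "D", "E"),
                 V1.x = c(1, 2, 3, 4, 5), V2.x = c(2, 2, 7, 3, 1), V3.x = c(2, 4, 3, 2, 9))
D2 <- data.frame(ID = c("A", "B", "C", "D", "E"),
                 V1.y = c(2, 3, 3, 3, 5), V2.y = c(1, 2, 3, 3, 5), V3.y = c(6, 4, 3, 2, 2))
D3 <- data.frame(ID = c("A", "B", "C", "D", "E"),
                 V1 = c(3, 2, 4, 4, 5), V2 = c(3, 7, 3, 4, 5), V3 = c(5, 4, 3, 6, 3))

# Hypothetical helper: rename columns to ID, V1, V2, ... so the frames can be stacked
standardize <- function(df) {
  names(df) <- c("ID", paste0("V", seq_len(ncol(df) - 1)))
  df
}

# Stack all frames on top of each other, then sum every value column within each ID
stacked <- do.call(rbind, lapply(list(D1, D2, D3), standardize))
summed  <- aggregate(. ~ ID, data = stacked, FUN = sum)
```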
answered 2012-06-08T00:50:20.790
2

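Here is an approach that may be a bit overkill, but it should generalize well to any number of columns and any number of "index" columns. It does assume that all of your data.frames have the same number of columns and that the columns are in the right order. First, create a list object out of all your data.frames. I referenced this question to do that programmatically.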

ClassFilter <- function(x, class) inherits(get(x), "data.frame")
Objs <- Filter( ClassFilter, ls() )
Objs <- lapply(Objs, "get")
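
As an aside, the same collection step can likely be written more compactly with mget() and is.data.frame(). The sketch below is an alternative, not the code used in this answer; it works in a separate environment (env, an assumption made here so the example is self-contained) rather than the global workspace, and it returns a named list.

```r
# A scratch environment standing in for the workspace (assumption for this sketch)
env <- new.env()
assign("D1", data.frame(ID = c("A", "B"), V1.x = c(1, 2)), envir = env)
assign("D2", data.frame(ID = c("A", "B"), V1.y = c(2, 3)), envir = env)
assign("note", "not a data frame", envir = env)

# mget() fetches every object by name; Filter() keeps only the data frames
Objs <- Filter(is.data.frame, mget(ls(env), envir = env))
names(Objs)
```

To use it on the global workspace instead, one would call mget(ls(), envir = globalenv()), keeping in mind that this picks up every data frame currently defined there.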

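Next, I wrote a function that adds all the numeric columns together using Reduce, and then stitches the result back together with the non-numeric columns at the end: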

FUN <- function(x){
  colsToProcess <- lapply(x, function(y) y[, unlist(sapply(y, is.numeric))])
  result <- Reduce("+", colsToProcess)
  #Get the non numeric columns
  nonNumericCols <- x[[1]]  
  nonNumericCols <- nonNumericCols[, !(unlist(sapply(nonNumericCols, is.numeric)))]
  return(data.frame(Index = nonNumericCols, result))
}

Finally, in action:

> FUN(Objs)
  Index V1.x V2.x V3.x
1     A    6    6   13
2     B    7   11   12
3     C   10   13    9
4     D   11   10   10
5     E   15   11   14
answered 2012-06-08T01:08:50.773
0

How about this, just adding up the whole blocks?

D1[,2:4] + D3[,2:4] + D2[,2:4]

... which results in ...

  V1.x V2.x V3.x
1    6    6   13
2    7   11   12
3   10   13    9
4   11   10   10
5   15   11   14

It assumes that all the variables are in the same order, but otherwise it should work fine.
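
If the ordering assumption cannot be guaranteed, one possible workaround is to align each data frame by ID and by column name before adding. The following is only a sketch: the helper align is hypothetical, the toy data is a shortened and deliberately shuffled variant of the question's, and it assumes the only naming difference is a trailing ".x" or ".y" suffix.

```r
D1 <- data.frame(ID = c("A", "B", "C"), V1.x = c(1, 2, 3), V2.x = c(2, 2, 7))
# D2 deliberately has its rows and value columns in a different order
D2 <- data.frame(ID = c("C", "A", "B"), V2.y = c(3, 1, 2), V1.y = c(3, 2, 3))
D3 <- data.frame(ID = c("A", "B", "C"), V1 = c(3, 2, 4), V2 = c(3, 7, 3))

# Hypothetical helper: strip the .x/.y suffix, then reorder rows by ID
# and pick the value columns in a fixed order
align <- function(df, ids, cols) {
  names(df) <- sub("\\.[xy]$", "", names(df))
  df[match(ids, df$ID), cols, drop = FALSE]
}

ids  <- D3$ID                      # reference row order
cols <- setdiff(names(D3), "ID")   # reference column order
blocks <- lapply(list(D1, D2, D3), align, ids = ids, cols = cols)

# Add the aligned blocks element-wise and reattach the ID column
total <- data.frame(ID = ids, Reduce(`+`, blocks))
rownames(total) <- NULL
```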

answered 2012-06-08T07:38:14.607