r - Row and column sums in R

Question

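Here is an example of what my dataset (MergedData) looks like in R: each of my participants (5 rows) got a score on each test (7 columns). I would like to know the total score across all tests (all columns), but for each participant (row).

Also, my full dataset has many more variables than these, so if possible I would like to do this with a formula and a loop, without having to type it out row by row or column by column.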

Participant TestScores     
ParticipantA    2   4   2   3   2   3   4
ParticipantB    1   3   2   2   3   3   3
ParticipantC    1   4   4   2   3   4   2
ParticipantD    2   4   2   3   2   4   4
ParticipantE    1   3   2   2   2   2   2

I have tried this, but it does not work:

Test_Scores <- rowSums(MergedData[Test1, Test2, Test3], na.rm=TRUE)

I get the following error message:

Error in `[.data.frame`(MergedData, Test1, Test2, Test3,  : 
  unused arguments

How can I solve this? Thanks!!

score 12 · Accepted Answer

I think you want this:

rowSums(MergedData[,c('Test1', 'Test2', 'Test3')], na.rm=TRUE)

Answered 2014-05-09T15:24:12.963
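To make the difference from the original call concrete, here is a minimal sketch. The data frame is rebuilt from the sample table in the question; the column names Test1 to Test7 are an assumption, since the question does not show the real names.

```r
# Sample data rebuilt from the question; column names Test1..Test7 are assumed.
MergedData <- data.frame(
  Participant = paste0("Participant", LETTERS[1:5]),
  Test1 = c(2, 1, 1, 2, 1),
  Test2 = c(4, 3, 4, 4, 3),
  Test3 = c(2, 2, 4, 2, 2),
  Test4 = c(3, 2, 2, 3, 2),
  Test5 = c(2, 3, 3, 2, 2),
  Test6 = c(3, 3, 4, 4, 2),
  Test7 = c(4, 3, 2, 4, 2)
)

# `[` on a data frame takes a row index and a column index, so the column
# names have to be passed together as a single character vector.
first_three <- rowSums(MergedData[, c("Test1", "Test2", "Test3")], na.rm = TRUE)
print(first_three)
```

Writing MergedData[Test1, Test2, Test3] instead hands `[` three separate arguments, which is why R complains about unused arguments.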

score 2 · Accepted Answer

You can use:

MergedData$Test_Scores_Sum <- rowSums(MergedData[,2:8], na.rm=TRUE)

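where 2:8 are all the columns (tests) that you want to sum. This creates another column in your data.

This way you do not have to type each column name, and your data frame can still contain other columns that will not be summed. Note, however, that all the test columns you want to sum must be next to each other (as in the sample data).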
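If the test columns are not adjacent, one possible workaround is to pick them by name pattern rather than by position. The sketch below assumes the test columns all start with "Test"; the Age column is hypothetical and only there to show a column that is skipped.

```r
# Toy data: Age is a hypothetical unrelated column sitting between the tests.
MergedData <- data.frame(
  Participant = c("A", "B", "C"),
  Test1 = c(2, 1, 1),
  Age   = c(30, 41, 25),
  Test2 = c(4, 3, NA)
)

# Names of every column that starts with "Test" (an assumed naming scheme).
test_cols <- grep("^Test", names(MergedData), value = TRUE)

# drop = FALSE keeps a data frame even if only one column matches.
MergedData$Test_Scores_Sum <- rowSums(MergedData[, test_cols, drop = FALSE],
                                      na.rm = TRUE)
print(MergedData)
```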

score 1 · Accepted Answer

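Please consult the documentation for ?rowSums and ?colSums.

It is not clear from your post what exactly MergedData is. Assuming it is a data.frame, the problem is your indexing, MergedData[Test1, Test2, Test3]. If it is a data.frame whose columns are all numeric, you want to run something like: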

Test_Scores <- rowSums(MergedData, na.rm = TRUE)

or

Test_Scores <- rowSums(MergedData[, c("Test1", "Test2", "Test3")], na.rm = TRUE)

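if you only want to use the columns named "Test1", "Test2", and "Test3" (if they are indeed named that way).

If that does not work, please show us str(MergedData).

You need to provide a minimal reproducible example of the error to get any really useful answers.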
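As a small illustration of why the structure matters: rowSums() needs numeric input, so a data frame that still contains a character column such as Participant has to be narrowed down first. The following is only a sketch on a made-up two-row data frame, not the asker's real data.

```r
# Made-up stand-in for MergedData with one non-numeric column.
MergedData <- data.frame(
  Participant = c("ParticipantA", "ParticipantB"),
  Test1 = c(2, 1),
  Test2 = c(4, 3)
)

str(MergedData)  # shows the type of every column

# rowSums(MergedData) would stop with an error here because Participant
# is not numeric, so keep only the numeric columns before summing.
is_num <- vapply(MergedData, is.numeric, logical(1))
Test_Scores <- rowSums(MergedData[, is_num, drop = FALSE], na.rm = TRUE)
print(Test_Scores)
```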

score 1 · Accepted Answer

For small data, it can be interesting to convert the data.frame to a table and then use addmargins().

With this sample data

MergedData<-data.frame(Participant=letters[1:5],
    Test1 = c(2,1,1,2,1),
    Test2 = c(4,3,4,4,3),
    Test3 = c(2,2,4,2,2),
    Test4 = c(3,2,2,3,2),
    Test5 = c(2,3,3,2,2)
)

and this helper function

as.table.data.frame<-function(x, rownames=0) {
    numerics <- sapply(x,is.numeric)
    chars <- which(sapply(x,function(x) is.character(x) || is.factor(x)))
    names <- if(!is.null(rownames)) {
        if (length(rownames)==1) {
            if (rownames ==0) {
                 rownames(x)
            } else {
                as.character(x[,rownames])
            }
        } else {
            rownames
        }
    } else {
          if(length(chars)==1) {
            as.character(x[,chars])
        } else {
            rownames(x)
        }
    }
    x<-as.matrix(x[,numerics])
    rownames(x)<-names
    structure(x, class="table")
}

you can do

addmargins(as.table(MergedData, rownames = "Participant"))

to get

    Test1 Test2 Test3 Test4 Test5 Sum
a       2     4     2     3     2  13
b       1     3     2     2     3  11
c       1     4     4     2     3  14
d       2     4     2     3     2  13
e       1     3     2     2     2  10
Sum     7    18    12    12    12  61

Probably not super useful in this case, but addmargins is still interesting.
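For comparison, a shorter route to the same margins that skips the custom method might look like the sketch below. It assumes the first column holds the row labels and every other column is numeric, and it relies only on base R's as.matrix(), as.table() and addmargins().

```r
MergedData <- data.frame(
  Participant = letters[1:5],
  Test1 = c(2, 1, 1, 2, 1),
  Test2 = c(4, 3, 4, 4, 3),
  Test3 = c(2, 2, 4, 2, 2),
  Test4 = c(3, 2, 2, 3, 2),
  Test5 = c(2, 3, 3, 2, 2)
)

# Numeric part as a matrix, labelled by participant (first column assumed
# to be the label column).
m <- as.matrix(MergedData[-1])
rownames(m) <- MergedData$Participant

# addmargins() appends a "Sum" row and a "Sum" column.
with_margins <- addmargins(as.table(m))
print(with_margins)
```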

score 0 · Accepted Answer

Four previous answers, and only one shows results? What's up with that? Here's one.

> dat <- read.table(header=T, text = 
  'Participant Test1 Test2 Test3 Test4 Test5 Test6 Test7     
  ParticipantA    2   4   2   3   2   3   4
  ParticipantB    1   3   2   2   3   3   3
  ParticipantC    1   4   4   2   3   4   2
  ParticipantD    2   4   2   3   2   4   4
  ParticipantE    1   3   2   2   2   2   2')

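You wrote

"...if possible I would like to do this with a formula and a loop, without having to type it out row by row or column by column"

You don't have to write any loops at all. The row and column functions operate on all rows and all columns without an explicit loop.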

> rowSums(dat[-1], na.rm = TRUE)
## [1] 20 17 20 21 14
> colSums(dat[-1], na.rm = TRUE)
##  Test1  Test2  Test3  Test4  Test5  Test6  Test7 
##      7     18     12     12     12     16     15
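To illustrate the point about loops, the following sketch compares a hand-written loop with rowSums() on a small made-up data frame (the name scores is hypothetical). Both give the same totals; the vectorised call is simply shorter.

```r
# Small made-up score table; `scores` is a hypothetical name.
scores <- data.frame(
  Test1 = c(2, 1, 1),
  Test2 = c(4, 3, 4),
  Test3 = c(2, 2, 4)
)

# Explicit loop: sum each row by hand.
loop_totals <- numeric(nrow(scores))
for (i in seq_len(nrow(scores))) {
  loop_totals[i] <- sum(unlist(scores[i, ]), na.rm = TRUE)
}

# Vectorised equivalent: one call, no loop.
vectorised_totals <- rowSums(scores, na.rm = TRUE)

print(loop_totals)
print(unname(vectorised_totals))
```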

score 0 · Accepted Answer

Here is an approach using dplyr and reshape2:

dat <- read.table(header=T, text = 
                    'Participant Test1 Test2 Test3 Test4 Test5 Test6 Test7     
  ParticipantA    2   4   2   3   2   3   4
  ParticipantB    1   3   2   2   3   3   3
  ParticipantC    1   4   4   2   3   4   2
  ParticipantD    2   4   2   3   2   4   4
  ParticipantE    1   3   2   2   2   2   2')

library(dplyr) 
library(reshape2)    

# Melt data into long format
dat.l <- melt(dat, id.vars="Participant", variable.name="Test")
> dat.l
    Participant  Test value
1  ParticipantA Test1     2
2  ParticipantB Test1     1
3  ParticipantC Test1     1
4  ParticipantD Test1     2
...
32 ParticipantB Test7     3
33 ParticipantC Test7     2
34 ParticipantD Test7     4
35 ParticipantE Test7     2

# Sum by Participant
dat.l %>%
  group_by(Participant) %>%
  summarise(Sum=sum(value))

   Participant Sum
1 ParticipantA  20
2 ParticipantB  17
3 ParticipantC  20
4 ParticipantD  21
5 ParticipantE  14

# Sum by Test
dat.l %>%
  group_by(Test) %>%
  summarise(Sum=sum(value))

   Test Sum
1 Test1   7
2 Test2  18
3 Test3  12
4 Test4  12
5 Test5  12
6 Test6  16
7 Test7  15
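The same long-format idea can also be sketched in base R, without loading any packages. This is not the method used above, just an alternative under the assumption that the first column is the participant label: stack() plays the role of melt(), and tapply() does the grouped sums.

```r
dat <- data.frame(
  Participant = paste0("Participant", LETTERS[1:5]),
  Test1 = c(2, 1, 1, 2, 1),
  Test2 = c(4, 3, 4, 4, 3),
  Test3 = c(2, 2, 4, 2, 2),
  Test4 = c(3, 2, 2, 3, 2),
  Test5 = c(2, 3, 3, 2, 2),
  Test6 = c(3, 3, 4, 4, 2),
  Test7 = c(4, 3, 2, 4, 2)
)

# Long format: stack() returns the columns `values` and `ind` (the test name).
long <- stack(dat[-1])
# stack() concatenates the columns in order, so the labels repeat once per test.
long$Participant <- rep(dat$Participant, times = ncol(dat) - 1)

by_participant <- tapply(long$values, long$Participant, sum)  # sum per row
by_test        <- tapply(long$values, long$ind, sum)          # sum per column

print(by_participant)
print(by_test)
```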
