
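I have a data frame called Paper that holds yearly citation counts for papers, along with their publication year and some metadata (journal, authors). It looks like this: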

Paper = read.table(textConnection("Meta Publication.Year X1999 X2000 X2001 X2002 X2003
A 1999 0 1 1 1 2
B 2000 0 0 3 1 0
C 2000 0 0 1 0 1
C 2001 0 0 0 1 5
D 1999 0 1 0 2 2"), header = TRUE)

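I want to compute the sum of citations over the two years following publication and append it to Paper as a new column. However, I am not interested in every publication year, only in those listed in Years. My steps (code below) are: order Paper by Publication.Year; for each year in Years, select the rows with that Publication.Year together with the citation columns for the following two years (for 1999, that is X2000 and X2001); compute the row sums; concatenate the sums; and cbind the result to Paper.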

Is there a more elegant way to do this?

Years = as.numeric(c(1999, 2000))
Paper <- Paper[with(Paper, order(Paper[["Publication.Year"]])), ]
Two.Year = as.numeric()
for (i in Years){
Mat <- subset(Paper, Paper[["Publication.Year"]]==i, select=c("Publication.Year", paste("X", i+1, sep=""), paste("X", i+2, sep="")))
temp <- rowSums(Mat[,-1])
Two.Year <- c(Two.Year, temp)
rm(temp)
}
Paper <- cbind(Paper, Two.Year)
rm(Two.Year)
rm(Years)
Paper <- subset(Paper, select=c("Meta","Publication.Year","Two.Year")) # Because in the end I only need the citation number

1 Answer


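Because the years you are interested in change from row to row, you will have to create new variables that indicate those years. Then you can use mapply to sum the correct numbers.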

Paper$pubYear1 <- paste0("X", as.character(Paper$Publication.Year + 1))
Paper$pubYear2 <- paste0("X", as.character(Paper$Publication.Year + 2))
Paper$pubCount <- mapply(function(r, y1, y2) Paper[r, y1] + Paper[r, y2], 
  row.names(Paper), Paper$pubYear1, Paper$pubYear2)
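
If the helper columns are not wanted, the same per-row lookup can probably be done with matrix indexing instead of mapply. The following is only a sketch under assumptions: it assumes the citation columns are named "X" followed by a four-digit year, and the names cites, rows, col1, col2 and Two.Year.alt are made up for illustration.

```r
Paper <- read.table(text = "Meta Publication.Year X1999 X2000 X2001 X2002 X2003
A 1999 0 1 1 1 2
B 2000 0 0 3 1 0
C 2000 0 0 1 0 1
C 2001 0 0 0 1 5
D 1999 0 1 0 2 2", header = TRUE)

# Citation counts as a plain matrix (assumes columns are named X<year>)
cites <- as.matrix(Paper[, grep("^X[0-9]{4}$", names(Paper))])

# For every row, find the column positions of publication year + 1 and + 2
rows <- seq_len(nrow(Paper))
col1 <- match(paste0("X", Paper$Publication.Year + 1), colnames(cites))
col2 <- match(paste0("X", Paper$Publication.Year + 2), colnames(cites))

# A two-column (row, column) index matrix picks exactly one cell per row
Paper$Two.Year.alt <- cites[cbind(rows, col1)] + cites[cbind(rows, col2)]
```

If a paper's following years have no matching column, match returns NA and the sum for that row becomes NA rather than raising an error.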

Here is the resulting data frame:

> Paper
  Meta Publication.Year X1999 X2000 X2001 X2002 X2003 pubYear1 pubYear2 pubCount
1    A             1999     0     1     1     1     2    X2000    X2001        2
2    B             2000     0     0     3     1     0    X2001    X2002        4
3    C             2000     0     0     1     0     1    X2001    X2002        1
4    C             2001     0     0     0     1     5    X2002    X2003        6
5    D             1999     0     1     0     2     2    X2000    X2001        1
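
The question also asks to keep only the publication years listed in Years and only the Meta, Publication.Year and citation-sum columns. One possible way to finish, sketched here on top of the mapply approach (the name Result and the renaming of pubCount to Two.Year are assumptions, not part of the code above):

```r
Paper <- read.table(text = "Meta Publication.Year X1999 X2000 X2001 X2002 X2003
A 1999 0 1 1 1 2
B 2000 0 0 3 1 0
C 2000 0 0 1 0 1
C 2001 0 0 0 1 5
D 1999 0 1 0 2 2", header = TRUE)
Years <- c(1999, 2000)

# Same helper columns and per-row sum as in the answer
Paper$pubYear1 <- paste0("X", Paper$Publication.Year + 1)
Paper$pubYear2 <- paste0("X", Paper$Publication.Year + 2)
Paper$pubCount <- mapply(function(r, y1, y2) Paper[r, y1] + Paper[r, y2],
  row.names(Paper), Paper$pubYear1, Paper$pubYear2)

# Keep only the publication years of interest and the columns that are needed
Result <- Paper[Paper$Publication.Year %in% Years,
                c("Meta", "Publication.Year", "pubCount")]
names(Result)[names(Result) == "pubCount"] <- "Two.Year"
```

Filtering after the sum is computed keeps the rows in their original order, so no sorting by Publication.Year is needed.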
Answered 2014-08-02T23:17:06.323