I have a dataset laid out like this:

data <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L), .Label = c("I", "II", "III"), class = "factor"), 
    time = c(1L, 1L, 1L, 2L, 2L, 3L, 1L, 1L, 1L, 2L, 2L, 1L, 
    2L, 2L, 2L), species = structure(c(1L, 2L, 3L, 2L, 4L, 1L, 
    3L, 2L, 1L, 3L, 4L, 1L, 1L, 3L, 4L), .Label = c("a", "b", 
    "c", "d"), class = "factor")), .Names = c("group", "time", 
"species"), class = "data.frame", row.names = c(NA, -15L))
head(data)


##     group time species
## 1     I    1       a
## 2     I    1       b
## 3     I    1       c
## 4     I    2       b
## 5     I    2       d
## 6     I    3       a

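I am building co-occurrence tables for species that occur together in the same time block. The example code below creates a co-occurrence table for the species in group I: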

data2=subset(data,data$group=="I")    
X =table(data2$species,data2$time)    
X <- as.matrix(X)    
out <- X %*% t(X)

write.table(out,"coocurrence_groupI.txt",sep="\t")

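My real dataset has many groups; subsetting each one by hand and then writing a .txt file for it seems far too repetitive. My question is: how can I write a loop or function that automatically builds the co-occurrence table for each group (I, II and III in the example) and then writes a separate .txt file for each group?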

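I searched online and found nothing close, apart from sapply (and I am not sure that is the right tool). Maybe I was not looking in the right place. Any help would be greatly appreciated.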

克克罗斯

1 Answer

I'm partial to the split/apply approach for this kind of problem, although, as mnel points out, there are other options.

I would turn your matrix code into a function, then split the data by group and apply that function to each group, like this:

#your data renamed dat (data is an R function, so avoid using that as a name)
dat <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L), .Label = c("I", "II", "III"), class = "factor"), 
    time = c(1L, 1L, 1L, 2L, 2L, 3L, 1L, 1L, 1L, 2L, 2L, 1L, 
    2L, 2L, 2L), species = structure(c(1L, 2L, 3L, 2L, 4L, 1L, 
    3L, 2L, 1L, 3L, 4L, 1L, 1L, 3L, 4L), .Label = c("a", "b", 
    "c", "d"), class = "factor")), .Names = c("group", "time", 
"species"), class = "data.frame", row.names = c("1", "2", "3", 
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15"
))

#your processing turned into a function
FUN <- function(DATA) {
    X <- table(DATA[, 2],DATA[, 1])    
    X <- as.matrix(X)    
    X %*% t(X)
}

#the split lapply method
X <- split(dat[, 2:3], dat[, 1])    
lapply(X, FUN)

This yields:

$I

    a b c d
  a 2 1 1 0
  b 1 2 1 1
  c 1 1 1 0
  d 0 1 0 1

$II

    a b c d
  a 1 1 1 0
  b 1 1 1 0
  c 1 1 2 1
  d 0 0 1 1

$III

    a b c d
  a 2 0 1 1
  b 0 0 0 0
  c 1 0 1 1
  d 1 0 1 1
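
The same split/lapply idea can also be written with column names instead of column positions, which may be easier to read. This is only a sketch under a few assumptions: the data frame is rebuilt here so the block runs on its own, `cooccur` is a hypothetical helper name, and base R's `tcrossprod(X)` is used as a stand-in for `X %*% t(X)`.

```r
# Rebuild the example data so the sketch is self-contained
dat <- data.frame(
  group   = factor(rep(c("I", "II", "III"), times = c(6, 5, 4))),
  time    = c(1L, 1L, 1L, 2L, 2L, 3L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L),
  species = factor(c("a", "b", "c", "b", "d", "a", "c", "b", "a", "c", "d",
                     "a", "a", "c", "d"))
)

# Hypothetical helper: species-by-species co-occurrence counts for one group
cooccur <- function(d) {
  X <- table(d$species, d$time)  # species x time incidence counts
  tcrossprod(X)                  # equivalent to X %*% t(X)
}

# One co-occurrence matrix per group, returned as a named list
res <- lapply(split(dat, dat$group), cooccur)
names(res)
```

Because `species` is a factor, each piece produced by `split` keeps all four levels, so every matrix has the same 4 x 4 layout even when a species is absent from a group (as `b` is in group III).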

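Edit: Sorry, I missed that you wanted to write each table to a file. The code below does that, but you may want to consider calling save or saveRDS on the output of the function above rather than writing multiple txt files: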

v <- split(dat[, 2:3], dat[, 1])    
Output <- lapply(seq_along(v), function(i) {
        X <- table(v[[i]][, 2], v[[i]][, 1])    
        X <- as.matrix(X)    
        z <- X %*% t(X)
        write.table(z, paste0("coocurrence_group", names(v)[i], ".txt"),sep="\t")
        return(z)
    }
)

names(Output) <- names(v)
Output
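
For the saveRDS route mentioned above, a minimal sketch of storing the whole list in a single file and reading it back later. The small `Output` list here is a stand-in holding only the group I matrix, and the `tempfile()` path is purely illustrative; substitute your own list and file name.

```r
# Stand-in for the list of co-occurrence matrices (group I values from above)
Output <- list(I = matrix(c(2, 1, 1, 0,
                            1, 2, 1, 1,
                            1, 1, 1, 0,
                            0, 1, 0, 1),
                          nrow = 4, byrow = TRUE,
                          dimnames = list(letters[1:4], letters[1:4])))

path <- tempfile(fileext = ".rds")  # illustrative location; use your own path
saveRDS(Output, path)               # one file holds every group's matrix

restored <- readRDS(path)           # later: get the list back with names intact
identical(restored, Output)
```

A single .rds file keeps the list structure and dimnames, so nothing has to be re-parsed from text when the results are needed again.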
answered 2012-08-13T23:28:03.220