r - fisher.test with many input files (tabular data already available)

Question

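I want to run fisher.test in R.

I already have the contingency-table data (in separate .txt files).

I want to:

read in the files and match them by name;
feed the data from the matched files into the test.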

- All files look like this:

 56
 989

All files have only two lines (#1 occurred and #2 did not occur);

- The file names are:

Anna_50.txt
Anna_100.txt
Anna_200.txt
Ben_50.txt
Ben_100.txt
Ben_200.txt

- I want to run the Fisher test for Anna_50 and Ben_50, Anna_100 and Ben_100, and so on.

- Problems:

files <- list.files()

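How do I match Anna_50 with Ben_50 among the files?

How do I build the matrix used as input? Getting the order right is tricky.

table <- matrix(c(Anna_50_Occ, Ben_50_Occ, Anna_50_NonOcc, Ben_50_NonOcc), 2, 2)

How do I run this over all the files?

Looking forward to your answers. I have tried to make this as clear as possible. I really need this, but if anything is still unclear, please don't hesitate to ask.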

score 6 · Accepted Answer

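I have some code that should solve the problem. However, since I don't have your files, the last part may fail.

The idea is as follows. First, you extract the numbers from the file names in files. Then you create two vectors of file names, one for all Anna files and one for the Ben files. Next, you write a function that runs the Fisher test on two of these files. Finally, the magic happens in mapply, which iterates over both file-name vectors simultaneously: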

files <- c("Anna_50.txt", "Anna_100.txt", "Anna_200.txt", "Ben_50.txt", 
    "Ben_100.txt", "Ben_200.txt")

# get the numbers from the filenames
numbers <- vapply(strsplit(vapply(strsplit(files, "\\."), "[", i = 1, ""), "_"), "[", i = 2, "")

# only use those numbers that appear two times:
t.num <- table(numbers)
valid.num <- dimnames(t.num)[[1]][t.num == 2]

# make vector for Anna and Ben (that now have the same ordering)
f.anna <- paste("Anna_", valid.num, ".txt", sep = "")
f.ben <- paste("Ben_", valid.num, ".txt", sep = "")

#Now you can use mapply with a suitable function
# Did not check it as I don't have the files, but the logic should become clear:
run.fisher <- function(file1, file2) {
    d1 <- scan(file1)
    d2 <- scan(file2)
    # 2x2 table: one row per file, columns are occurred / did not occur
    d.matrix <- matrix(c(d1, d2), nrow = 2, byrow = TRUE)
    fisher.test(d.matrix)
}

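# now use mapply to obtain a list with all results
# (SIMPLIFY = FALSE keeps each htest object intact instead of collapsing them into a matrix):

mapply(run.fisher, f.anna, f.ben, SIMPLIFY = FALSE)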
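
Because the original files are not available, here is a minimal end-to-end sketch that writes hypothetical count files to a temporary directory and runs the same pipeline on them. Only the counts 56 and 989 come from the question; all other counts, the temporary directory, and the row and column labels are assumptions made for illustration.

```r
# Hypothetical data: only 56/989 appear in the question, the rest is invented.
counts <- list(
  Anna_50  = c(56, 989),
  Anna_100 = c(70, 975),
  Anna_200 = c(90, 955),
  Ben_50   = c(30, 1015),
  Ben_100  = c(65, 980),
  Ben_200  = c(120, 925)
)

# Write each pair of counts to its own two-line file in a temporary directory
dir <- tempfile("fisher_demo")
dir.create(dir)
for (nm in names(counts)) {
  writeLines(as.character(counts[[nm]]), file.path(dir, paste0(nm, ".txt")))
}

files <- list.files(dir, pattern = "\\.txt$")

# Take the piece between "_" and ".txt" from every file name
numbers <- vapply(strsplit(files, "[._]"), function(p) p[2], character(1))

# Keep only the numbers that occur exactly twice (once per person)
t.num <- table(numbers)
valid.num <- names(t.num)[t.num == 2]

f.anna <- file.path(dir, paste0("Anna_", valid.num, ".txt"))
f.ben  <- file.path(dir, paste0("Ben_", valid.num, ".txt"))

run.fisher <- function(file1, file2) {
  d1 <- scan(file1, quiet = TRUE)
  d2 <- scan(file2, quiet = TRUE)
  # One row per file, columns are occurred / did not occur
  m <- matrix(c(d1, d2), nrow = 2, byrow = TRUE,
              dimnames = list(c("Anna", "Ben"), c("occurred", "not_occurred")))
  fisher.test(m)
}

# SIMPLIFY = FALSE returns a plain list of htest objects
results <- mapply(run.fisher, f.anna, f.ben, SIMPLIFY = FALSE)
names(results) <- valid.num

# Collect one p-value per matched pair
p.values <- vapply(results, function(r) r$p.value, numeric(1))
print(p.values)
```

Naming the list by number makes it easy to look up a single comparison later, for example results[["50"]].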

Update: actually, you can shorten the line that gets the numbers from the file names:

numbers <- vapply(strsplit(files, "[\\._]"), "[", i = 2, "")
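
As a quick check, the following sketch compares the two-step and one-step extraction on the example file names, and adds a sub()-based variant as a further option. The sub() pattern is an assumption: it presumes every name has the form name_number.txt.

```r
files <- c("Anna_50.txt", "Anna_100.txt", "Anna_200.txt",
           "Ben_50.txt", "Ben_100.txt", "Ben_200.txt")

# Two-step version: strip the extension, then take the part after "_"
two.step <- vapply(strsplit(vapply(strsplit(files, "\\."), "[", i = 1, ""), "_"),
                   "[", i = 2, "")

# One-step version: split on "." or "_" at once and take the second piece
one.step <- vapply(strsplit(files, "[\\._]"), "[", i = 2, "")

# Alternative (assumed name pattern): capture the digits between "_" and ".txt"
with.sub <- sub("^.*_([0-9]+)\\.txt$", "\\1", files)

print(identical(two.step, one.step))
print(identical(one.step, with.sub))
```

All three approaches return character vectors, so they can be passed to table() and paste() in the same way as in the code above.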
