r - In R, how do I create a data frame from multiple files, where each file contains measurements for a separate category?

Question

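Suppose I have a file A containing measurements from 10 subjects who received one treatment, and a file B containing measurements from another 10 subjects who received a different treatment. I want to run a one-way analysis of variance, so I am using R's anova/aov functions. However, aov expects the data in a data frame, where the first column contains the category (here, A or B) and the second column contains the corresponding samples. How can I read the two files and build the data frame automatically?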

score 1 · Accepted Answer

Here is some code I wrote recently to solve the same problem. In my case, the data are in CSV files named blahblah_series_trials.csv, where blahblah identifies the experiment type.

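library(plyr)  # provides adply()

# List the matching CSV files; the pattern is a regular expression, not a glob.
filenames <- dir(".", pattern = "_series_trials\\.csv$")
# Extract the experiment type: the letters just before "_series_trials".
types <- sub('.*?([a-zA-Z]*)_series_trials.*', '\\1', filenames)
data <- adply(data.frame(f=I(filenames), t=types), 1,
              with, cbind(read.csv(f), exp_type=t))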

This reads each file, adds an exp_type column according to which file the rows came from, and binds everything into a single data frame.
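For readers who would rather avoid the plyr dependency, the same read, tag, and bind steps can be sketched in base R. This is only an illustration under assumptions: the temporary directory, the file names control_series_trials.csv and treated_series_trials.csv, the columns subject and value, and the helper read_one are all made up for the demo and are not part of the original answer.

```r
# Hypothetical sample files, created in a temporary directory for the demo.
tmp <- tempfile("trials")
dir.create(tmp)
write.csv(data.frame(subject = 1:2, value = c(1.5, 2.5)),
          file.path(tmp, "control_series_trials.csv"), row.names = FALSE)
write.csv(data.frame(subject = 1:3, value = c(3.5, 4.5, 5.5)),
          file.path(tmp, "treated_series_trials.csv"), row.names = FALSE)

# Find the files; full.names = TRUE keeps the directory in each path.
filenames <- dir(tmp, pattern = "_series_trials\\.csv$", full.names = TRUE)

# Read one file and tag every row with the type taken from its file name.
# (read_one is a hypothetical helper name.)
read_one <- function(f) {
  type <- sub("_series_trials\\.csv$", "", basename(f))
  cbind(read.csv(f), exp_type = type)
}

# Stack the per-file data frames on top of each other.
data <- do.call(rbind, lapply(filenames, read_one))
data$exp_type <- factor(data$exp_type)
```

Here do.call(rbind, ...) requires every file to have the same column names, which is the same assumption the adply version makes.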

score 1 · Accepted Answer

I had to do this myself, so here is the solution I came up with.

# Define a new function: files is a vector of file names.
# The return value is a data frame where the x column contains the category
# (the file name) and the y column contains the corresponding samples.
read.files <- function(files) {
    l <- lapply(files, function (x) read.table(x)$V1)
    return(data.frame(
        x = factor(rep(files, lengths(l))),  # each file name repeated once per sample
        y = unlist(l)
    ))
}

f <- read.files(c("A", "B"))

anova(aov(y ~ x, f))

The resulting data frame f looks like this:

   x    y
1  A 10.0
2  A 10.1
3  A 11.1
4  A 12.9
5  A 10.7
6  A  9.6
7  A 10.4
8  A 10.8
9  A 10.1
10 A  9.3
11 B 20.5
12 B 21.1
13 B 25.2
14 B 13.2
15 B 13.3
16 B 17.4
17 B 18.9
18 B 20.2
19 B 23.8

This works for any number of files, but each file is limited to a single column. The files may have different numbers of rows.
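As a rough check of the unequal-row-count case, the same category/sample layout can also be built with base R's stack(), which turns a named list of vectors into a two-column data frame. The sketch below is an assumption-laden illustration, not the answer's own method: the temporary files and their values are invented for the demo, and the names samples, f2, and fit are hypothetical.

```r
# Hypothetical sample data written to temporary files so the sketch is self-contained.
dir_tmp <- tempfile("groups")
dir.create(dir_tmp)
writeLines(c("10.0", "10.1", "11.1"), file.path(dir_tmp, "A"))
writeLines(c("20.5", "21.1"), file.path(dir_tmp, "B"))  # fewer rows than A

files <- file.path(dir_tmp, c("A", "B"))

# Read the single column of each file into a list named by category.
samples <- lapply(files, function(f) read.table(f)$V1)
names(samples) <- basename(files)

# stack() returns two columns: values (the samples) and ind (a factor of list names).
f2 <- stack(samples)
names(f2) <- c("y", "x")

# The formula selects columns by name, so their order in the data frame does not matter.
fit <- aov(y ~ x, data = f2)
```

Calling anova(fit) or summary(fit) then prints the one-way ANOVA table, as in the answer above.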
