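I am trying to rearrange a data frame in R (efficiently).
My data are experimental measurements collected in four different experiments from two groups of participants (coded 1 or 0, i.e. a disease group and a control group).
Example data frame: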
Subject type  Experiment 1  Experiment 2  Experiment 3  Experiment 4
0             4.6           2.5           1.4           5.3
0             4.7           2.4           1.8           5.1
1             3.5           1.2           5.6           7.5
1             3.8           1.7           6.2           8.1
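I would like to rearrange the data frame so that it is structured as follows (the reason being that it is easier for me to run functions on the data in R when it has this structure):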
Subject type  Experiment  Measure
0             1           4.6
0             2           2.5
0             3           1.4
0             4           5.3
0             1           4.7
0             2           2.4
0             3           1.8
0             4           5.1
1             1           3.5
1             2           1.2
1             3           5.6
1             4           7.5
1             1           3.8
1             2           1.7
1             3           6.2
1             4           8.1
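As you can see, each subject now occupies four rows; each row now corresponds to a single measurement rather than a single subject. This is (at least for now) more convenient for me to plug into R functions. Perhaps in time I will work out a way to skip this step entirely, but I am new to R and this seems like the best way to do things.
Anyway, the question is: what is the most efficient way to do this data frame transformation? At the moment I am doing it like this: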
# Input dframe1
dframe1 <- structure(list(subject_type = c(0L, 0L, 1L, 1L), experiment_1 = c(4.6,
4.7, 3.5, 3.8), experiment_2 = c(2.5, 2.4, 1.2, 1.7), experiment_3 = c(1.4,
1.8, 5.6, 6.2), experiment_4 = c(5.3, 5.1, 7.5, 8.1)), .Names = c("subject_type",
"experiment_1", "experiment_2", "experiment_3", "experiment_4"
), class = "data.frame", row.names = c(NA, -4L))
# Create a matrix
temporary_matrix <- matrix(ncol=3, nrow=nrow(dframe1) * 4)
colnames(temporary_matrix) <- c("subject_type","experiment","measure")
# Rearrange dframe1 so that each measure ends up in its own row
# (four consecutive rows per subject, one per experiment)
for(i in 1:nrow(dframe1)) {
  temporary_matrix[i*4-3,"subject_type"] <- dframe1$subject_type[i]
  temporary_matrix[i*4-3,"experiment"] <- 1
  temporary_matrix[i*4-3,"measure"] <- dframe1$experiment_1[i]
  temporary_matrix[i*4-2,"subject_type"] <- dframe1$subject_type[i]
  temporary_matrix[i*4-2,"experiment"] <- 2
  temporary_matrix[i*4-2,"measure"] <- dframe1$experiment_2[i]
  temporary_matrix[i*4-1,"subject_type"] <- dframe1$subject_type[i]
  temporary_matrix[i*4-1,"experiment"] <- 3
  temporary_matrix[i*4-1,"measure"] <- dframe1$experiment_3[i]
  temporary_matrix[i*4-0,"subject_type"] <- dframe1$subject_type[i]
  temporary_matrix[i*4-0,"experiment"] <- 4
  temporary_matrix[i*4-0,"measure"] <- dframe1$experiment_4[i]
}
# Convert matrix to a data frame
dframe2 <- data.frame(temporary_matrix)
# NOTE: For some reason, this has to be converted back into a double (at some point above it becomes a factor)
dframe2$measure <- as.double(as.character(dframe2$measure))
Surely there is a better way to do this?!
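For comparison, here is a minimal vectorised sketch in base R that produces the same long layout without a loop. It is only one possible approach, and it assumes that every measure column is named `experiment_` followed by the experiment number. The helper names `exp_cols` and `exp_ids` are introduced here purely for illustration.

```r
# Same example data as above, built directly with data.frame()
dframe1 <- data.frame(
  subject_type = c(0L, 0L, 1L, 1L),
  experiment_1 = c(4.6, 4.7, 3.5, 3.8),
  experiment_2 = c(2.5, 2.4, 1.2, 1.7),
  experiment_3 = c(1.4, 1.8, 5.6, 6.2),
  experiment_4 = c(5.3, 5.1, 7.5, 8.1)
)

# Assumption: measure columns share the prefix "experiment_"
exp_cols <- grep("^experiment_", names(dframe1), value = TRUE)
# Experiment numbers taken from the column names (1, 2, 3, 4 here)
exp_ids <- as.integer(sub("^experiment_", "", exp_cols))

# t() gives an experiments-by-subjects matrix; as.vector() reads it
# column by column, so each subject's measures stay in consecutive rows
dframe2 <- data.frame(
  subject_type = rep(dframe1$subject_type, each = length(exp_cols)),
  experiment   = rep(exp_ids, times = nrow(dframe1)),
  measure      = as.vector(t(as.matrix(dframe1[exp_cols])))
)
```

The base `reshape()` function and `tidyr::pivot_longer()` are also commonly used for this kind of wide-to-long conversion; the version above just makes the row ordering explicit.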