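I am trying to strip two data frames of their data.frame structure, extract the elements inside each one, and combine the extracted data into a single data.frame. The result should be one data.frame whose two columns are plain vectors. See the output below (marked in bold).
Problem: the output contains multiple data.frame elements instead of a single data.frame holding the vectors from the input data frames.
Each input data frame contains one vector.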
[Edited ^v in response to comments.]
So far I have tried various combinations of as() and unlist(), to no avail...
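I am trying to solve this with built-in R functions and vectorization, without plyr and without loops (related questions: Combining multiple data.frames into one data.frame using a loop; Merging many data frames from csv files; Recombining a list of Data.frames into a single data frame).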
Reproducible code: I cannot reproduce the error itself, but this is how I would like my code to behave:
df1 <- c(1, 2, 3)
df2 <- c(2, 4, 6)
output <- cbind(df1, df2)
print(output) # Returns a matrix here, because df1 and df2 are plain vectors
str(output)   # two columns of numbers
# In my case, however, the result contains data.frames instead
This returns:
df1 df2
[1,] 1 2
[2,] 2 4
[3,] 3 6
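For comparison, here is a minimal sketch of how the same toy example behaves when the inputs really are one-column data frames rather than plain vectors. The column names x and y are made up for illustration and do not come from my real code.

```r
# Hypothetical one-column data frames (column names are illustrative only)
df1 <- data.frame(x = c(1, 2, 3))
df2 <- data.frame(y = c(2, 4, 6))

# cbind() uses its data.frame method when any argument is a data.frame,
# so the result is a single data.frame with two numeric columns
output <- cbind(df1, df2)
str(output)

# Extracting the vectors with [[ ]] and rebuilding gives the same structure
rebuilt <- data.frame(df1 = df1[[1]], df2 = df2[[1]])
str(rebuilt)
```

Using [[1]] rather than [1] matters here: [1] keeps the data.frame wrapper, while [[1]] returns the bare vector.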
In reality:
readmultiple <- function(directory = "bigdata") {
....
....
....
output <- cbind.data.frame(filename, readmultiplesum)
# This is probably where things go wrong
return(output)
}
output <- lapply(filenames, complete.cases.sum)
assign("Global.output", output, envir = .GlobalEnv)
# There is probably a better way to do this too
if (firstoutput == 1) {
Global.output <- merge(as(unlist(Global.output[1]), "vector"),
as(unlist(output[1]), "vector"))
# as, unlist... Not sure what's needed here
} else {
firstoutput <- 1
}
str(output)
return(Global.output)
}
The output looks like
[[1]]
filename result
1 142
[[2]]
filename result
1 521
[[3]]
filename result
1 324
But I would like it to be
filename result
[1,] filename[i] 142
[2,] filename[i] 521
[3,] filename[i] 324
...where filename[i] is the i-th file name.
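A minimal sketch of one way that shape might be built directly, without loops or plyr, assuming the per-file result is simply the number of complete rows. The helper count_complete, the temporary directory and the two toy CSV files are all hypothetical stand-ins so that the sketch runs on its own; they are not my real code or data.

```r
# Hypothetical helper: number of rows without any NA in one CSV file
count_complete <- function(path) sum(complete.cases(read.csv(path)))

# Two throwaway CSV files in a temporary directory (illustrative data only)
dir <- tempfile("bigdata")
dir.create(dir)
write.csv(data.frame(a = c(1, NA, 3), b = c(1, 2, 3)),
          file.path(dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(a = c(1, 2), b = c(NA, 2)),
          file.path(dir, "002.csv"), row.names = FALSE)

filenames <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)

# vapply() returns a plain integer vector, so data.frame() gets two vectors
# of equal length and produces one row per file
result <- data.frame(
  filename = basename(filenames),
  sumrows  = vapply(filenames, count_complete, integer(1), USE.NAMES = FALSE)
)
str(result)
```

The idea is to have the per-file function return a single number rather than a one-row data.frame, so that nothing needs to be unlisted or merged afterwards.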
str(output) returns
List of 2400
$ :'data.frame': 1 obs. of 2 variables:
..$ filename : Factor w/ 1 level "bigdata/001.csv": 1
..$ sumrows: num 142
$ :'data.frame': 1 obs. of 2 variables:
..$ filename : Factor w/ 1 level "bigdata/001.csv": 1
..$ sumrows: num 521
$ :'data.frame': 1 obs. of 2 variables:
..$ filename : Factor w/ 1 level "bigdata/001.csv": 1
..$ sumrows: num 324
$ :'data.frame': 1 obs. of 2 variables:
..$ filename : Factor w/ 1 level "bigdata/001.csv": 1
.....
dput(head(output)) returns
list(structure(list(filename = structure(1L, .Label = "bigdata/001.csv", class = "factor"),
sumrows = 142), .Names = c("filename", "sumrows"), row.names = c(NA,
-1L), class = "data.frame"), structure(list(filename = structure(1L, .Label = "bigdata/001.csv", class = "factor"),
sumrows = 521), .Names = c("filename", "sumrows"
), row.names = c(NA, -1L), class = "data.frame"), structure(list(
filename = structure(1L, .Label = "bigdata/001.csv", class = "factor"),
sumrows = 324), .Names = c("filename", "sumrows"), row.names = c(NA,
-1L), class = "data.frame"), structure(list(filename = structure(1L, .Label = "bigdata/001.csv", class = "factor"),
sumrows = 1896), .Names = c("filename", "sumrows"
), row.names = c(NA, -1L), class = "data.frame"), structure(list(
filename = structure(1L, .Label = "bigdata/001.csv", class = "factor"),
sumrows = 1608), .Names = c("filename", "sumrows"
), row.names = c(NA, -1L), class = "data.frame"), structure(list(
filename = structure(1L, .Label = "bigdata/001.csv", class = "factor"),
sumrows = 912), .Names = c("filename", "sumrows"), row.names = c(NA,
-1L), class = "data.frame"))
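Given a list shaped like the dput above, one base R option that might produce the single data.frame is do.call(rbind, ...). The sketch below rebuilds only the first three entries by hand, with values copied from the dput. Depending on the R version, filename may come out as character rather than factor, and I have not checked how this scales to the full list of 2400 elements.

```r
# Small list shaped like the real output: one-row data frames
# (first three entries only; values copied from the dput)
output <- list(
  data.frame(filename = "bigdata/001.csv", sumrows = 142),
  data.frame(filename = "bigdata/001.csv", sumrows = 521),
  data.frame(filename = "bigdata/001.csv", sumrows = 324)
)

# Pass every list element to rbind() as a separate argument, which stacks
# the one-row data frames into a single data frame
combined <- do.call(rbind, output)
str(combined)
```

The result has one row per list element and the two columns filename and sumrows, rather than a list of separate data.frames.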