
I have a list of data frames:

set.seed(23) 
date_list = seq(1:30)
testframe = data.frame(Date = date_list)
testframe$ABC = rnorm(30)
testframe$DEF = rnorm(30)
testframe$GHI = seq(from = 10, to = 25, length.out = 30)
testframe$JKL = seq(from = 5, to = 45, length.out = 30)

testlist = list(testframe, testframe, testframe)
names(testlist) = c("df1464", "df6355", "df94566")

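I now want to take the name of each data frame and append it to that data frame's column names. So the column names of the first data frame in the list should be: Date_df1464, ABC_df1464, DEF_df1464, GHI_df1464 and JKL_df1464.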

I wrote this loop, but it doesn't work:

for (a  in names(testlist)) {
  for(i in 1: length(testlist)){
    allcolnames = colnames(testlist[[i]])
    allcolnames = paste(allcolnames, a , sep = "_")
    testlist[[i]] = colnames(allcolnames)
  }
}

I get this error:

Error in testlist[[i]] : subscript out of bounds

I really don't know why it doesn't work. Any ideas?


3 Answers


You can chain two Map calls: the inner Map builds the new names, and the outer Map assigns them as the names of each data frame in the list.

testlist <- Map(`names<-`, testlist,
                Map(paste, lapply(testlist, names), names(testlist), sep="_"))

Result

lapply(testlist, names)
# $df1464
# [1] "Date_df1464" "ABC_df1464"  "DEF_df1464"  "GHI_df1464"  "JKL_df1464" 
# 
# $df6355
# [1] "Date_df6355" "ABC_df6355"  "DEF_df6355"  "GHI_df6355"  "JKL_df6355" 
# 
# $df94566
# [1] "Date_df94566" "ABC_df94566"  "DEF_df94566"  "GHI_df94566"  "JKL_df94566" 
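
If the one-liner is hard to read, the two Map steps can be pulled apart. This is only a sketch on a small stand-in list; dfs and new_names are made-up names for illustration and not part of the question's data.

```r
# Small stand-in list (hypothetical data, same kind of structure as testlist)
dfs <- list(df1 = data.frame(Date = 1:2, ABC = c(0.1, 0.2)),
            df2 = data.frame(Date = 1:2, ABC = c(0.3, 0.4)))

# Inner step: build one character vector of new column names per data frame
new_names <- Map(paste, lapply(dfs, names), names(dfs), sep = "_")

# Outer step: `names<-`(df, value) returns a copy of df carrying the new names
dfs <- Map(`names<-`, dfs, new_names)

names(dfs$df1)
names(dfs$df2)
```

Because Map takes its result names from its first argument, the list keeps its element names (df1, df2) after both steps.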
answered 2019-06-17T12:38:19.763

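There are two ways to do this. The better, more encapsulated way is to use Map, iterating over the data frames and their corresponding names together: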

new.testlist <- Map(function(df, name) {
  names(df) <- paste(names(df), name, sep = '_')
  return(df)
}, testlist, names(testlist))

> str(new.testlist)
List of 3
 $ df1464 :'data.frame':    30 obs. of  5 variables:
  ..$ Date_df1464: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
  ..$ ABC_df1464 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
  ..$ DEF_df1464 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
  ..$ GHI_df1464 : num [1:30] 10 10.5 11 11.6 12.1 ...
  ..$ JKL_df1464 : num [1:30] 5 6.38 7.76 9.14 10.52 ...
 $ df6355 :'data.frame':    30 obs. of  5 variables:
  ..$ Date_df6355: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
  ..$ ABC_df6355 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
  ..$ DEF_df6355 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
  ..$ GHI_df6355 : num [1:30] 10 10.5 11 11.6 12.1 ...
  ..$ JKL_df6355 : num [1:30] 5 6.38 7.76 9.14 10.52 ...
 $ df94566:'data.frame':    30 obs. of  5 variables:
  ..$ Date_df94566: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
  ..$ ABC_df94566 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
  ..$ DEF_df94566 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
  ..$ GHI_df94566 : num [1:30] 10 10.5 11 11.6 12.1 ...
  ..$ JKL_df94566 : num [1:30] 5 6.38 7.76 9.14 10.52 ...

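The riskier way is to loop over the names and use the superassignment operator, trusting that testlist is still where you expect it in your global environment. Note that this second approach changes the column names of testlist as a side effect, which is generally not considered good practice. Max Teflon's answer is somewhat similar in that it relies on testlist existing in the global environment rather than passing it explicitly to the modifying function.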

sapply(names(testlist), function(x) {
  names(testlist[[x]]) <<- paste(names(testlist[[x]]), x, sep = '_')
})
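
To make the side effect visible, here is a small sketch on a stand-in list (dfs and res are hypothetical names, not from the question). As far as I can tell, what sapply returns here is just the new names, while the actual renaming happens outside the function through `<<-`.

```r
# Small stand-in list (hypothetical data)
dfs <- list(df1 = data.frame(Date = 1:2, ABC = c(0.1, 0.2)),
            df2 = data.frame(Date = 1:2, ABC = c(0.3, 0.4)))

res <- sapply(names(dfs), function(x) {
  # `<<-` modifies dfs in the enclosing environment, not a local copy
  names(dfs[[x]]) <<- paste(names(dfs[[x]]), x, sep = "_")
})

res              # only the new names, simplified to a character matrix
names(dfs$df1)   # dfs itself was changed as a side effect
```

If the return value is not needed, wrapping the call in invisible() keeps the matrix from printing at the console.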
answered 2019-06-17T12:41:24.867

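Your solution is almost correct. You just don't need to loop twice, and your colnames call is the wrong way round: colnames(...) has to be on the left-hand side of the assignment. This should work: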

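for (i in seq_along(testlist)) {
    allcolnames = colnames(testlist[[i]])
    # append the name of the i-th list element to each column name
    allcolnames = paste(allcolnames, names(testlist)[i], sep = "_")
    # assign the new names to the columns of the i-th data frame
    colnames(testlist[[i]]) = allcolnames
}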
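
As for the "subscript out of bounds" error in the original loop, my reading is this: colnames(allcolnames) is called on a plain character vector and so returns NULL, and assigning NULL to testlist[[i]] deletes that element. The list shrinks while i keeps counting up to the original length. A minimal sketch of those two behaviours, using a made-up list x:

```r
# Hypothetical stand-in list with three elements
x <- list(a = 1:2, b = 3:4, c = 5:6)

# colnames() of a plain character vector is NULL, because it has no dimnames
is.null(colnames(c("Date_a", "ABC_a")))

# Assigning NULL with [[<- removes the element, so the list gets shorter
x[[1]] <- NULL
length(x)
names(x)
```

With three data frames, two such deletions leave a list of length 1, so testlist[[3]] no longer exists by the time i reaches 3.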

This also works, without any for loops ;):

set.seed(23) 
date_list = seq(1:30)
testframe = data.frame(Date = date_list)
testframe$ABC = rnorm(30)
testframe$DEF = rnorm(30)
testframe$GHI = seq(from = 10, to = 25, length.out = 30)
testframe$JKL = seq(from = 5, to = 45, length.out = 30)

testlist = list(testframe, testframe, testframe)
names(testlist) = c("df1464", "df6355", "df94566")

out <- lapply(names(testlist),function(name){
  dummy <- testlist[[name]]
  names(dummy) <- paste0(names(testlist[[name]]) ,'_',name)
  dummy
})
str(out)
#> List of 3
#>  $ :'data.frame':    30 obs. of  5 variables:
#>   ..$ Date_df1464: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ ABC_df1464 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
#>   ..$ DEF_df1464 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
#>   ..$ GHI_df1464 : num [1:30] 10 10.5 11 11.6 12.1 ...
#>   ..$ JKL_df1464 : num [1:30] 5 6.38 7.76 9.14 10.52 ...
#>  $ :'data.frame':    30 obs. of  5 variables:
#>   ..$ Date_df6355: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ ABC_df6355 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
#>   ..$ DEF_df6355 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
#>   ..$ GHI_df6355 : num [1:30] 10 10.5 11 11.6 12.1 ...
#>   ..$ JKL_df6355 : num [1:30] 5 6.38 7.76 9.14 10.52 ...
#>  $ :'data.frame':    30 obs. of  5 variables:
#>   ..$ Date_df94566: int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
#>   ..$ ABC_df94566 : num [1:30] 0.193 -0.435 0.913 1.793 0.997 ...
#>   ..$ DEF_df94566 : num [1:30] -0.5532 0.0982 -1.1467 -1.2499 -0.2021 ...
#>   ..$ GHI_df94566 : num [1:30] 10 10.5 11 11.6 12.1 ...
#>   ..$ JKL_df94566 : num [1:30] 5 6.38 7.76 9.14 10.52 ...
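
Note that the list elements in the str() output above are unnamed (`$ :`), since lapply is run over a plain character vector. If the names should be kept, one option is to put them back with setNames. A sketch on a small stand-in list (dfs is a hypothetical name):

```r
# Small stand-in list (hypothetical data)
dfs <- list(df1 = data.frame(Date = 1:2, ABC = c(0.1, 0.2)),
            df2 = data.frame(Date = 1:2, ABC = c(0.3, 0.4)))

out <- lapply(names(dfs), function(name) {
  # setNames() returns the data frame with its column names replaced
  setNames(dfs[[name]], paste0(names(dfs[[name]]), "_", name))
})
names(out)   # NULL, the element names were dropped

# Restore the element names from the original list
out <- setNames(out, names(dfs))
names(out)
```

Calling sapply(names(dfs), ..., simplify = FALSE) instead of lapply should give the same named result, because sapply uses a character vector as names by default.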
answered 2019-06-17T12:34:01.707