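I have a large number of data files in either SPSS or text format. When SPSS files are imported into R with read.spss from the foreign library, value labels are added automatically when use.value.labels = TRUE is set. These are stored as a value.labels attribute on each column of the data frame. I need to keep the structure of the imported objects consistent regardless of their origin (SPSS or text), so I need to assign the value.labels attribute and its values to every non-numeric column (factor or character) in the data frames imported from text files. Here is an excerpt of a data frame imported from a text file: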
> mydf <- data.frame(w = factor(c(1, 2, 3)), x = c("fourth", "fifth", "sixth"),
y = c(9.3, 8.8, 2.6), z = factor(c(7, 8, 9)), stringsAsFactors = FALSE)
I can do this column by column:
> attr(mydf$w, "value.labels") <- c(first = "1", second = "2", third = "3")
> attr(mydf$x, "value.labels") <- c(f4 = "fourth", f5 = "fifth", f6 = "sixth")
> attr(mydf$z, "value.labels") <- c(seventh = "7", eighth = "8", ninth = "9")
and then check the result:
> attributes(mydf$w)
$levels
[1] "1" "2" "3"
$class
[1] "factor"
$value.labels
 first second  third
   "1"    "2"    "3"
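However, with a large number of data frames, each containing many columns, this is not efficient. Is it possible to do this automatically, given a list of value labels such as: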
> lst.attr <- list(w = c(first = "1", second = "2", third = "3"),
x = c(f4 = "fourth", f5 = "fifth", f6 = "sixth"), z = c(seventh = "7",
eighth = "8", ninth = "9"))
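One possible approach, offered only as a minimal sketch: loop over the names shared by the data frame and the label list, and set the attribute on each matching non-numeric column. The helper name set_value_labels is hypothetical (it is not part of foreign or base R), and the sketch assumes the list elements are named after the columns they belong to.

```r
# Hypothetical helper (an illustration, not an existing library function):
# copy each named element of `labels` onto the column of the same name
# as a "value.labels" attribute.
set_value_labels <- function(df, labels) {
  # Only consider names present in both the data frame and the label list.
  for (col in intersect(names(df), names(labels))) {
    # Leave numeric columns alone; label only factor or character columns.
    if (is.factor(df[[col]]) || is.character(df[[col]])) {
      attr(df[[col]], "value.labels") <- labels[[col]]
    }
  }
  df
}

# Example data and label list, repeated from the question.
mydf <- data.frame(w = factor(c(1, 2, 3)), x = c("fourth", "fifth", "sixth"),
                   y = c(9.3, 8.8, 2.6), z = factor(c(7, 8, 9)),
                   stringsAsFactors = FALSE)
lst.attr <- list(w = c(first = "1", second = "2", third = "3"),
                 x = c(f4 = "fourth", f5 = "fifth", f6 = "sixth"),
                 z = c(seventh = "7", eighth = "8", ninth = "9"))

mydf <- set_value_labels(mydf, lst.attr)
attributes(mydf$w)
```

For many data frames, the same helper could be applied with lapply over a list of data frames, assuming each one comes with its own label list.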