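Possible duplicate:
Concatenating factor levels of two columns in R
I am fairly new to R and I am trying to make my recoding scripts more efficient and "correct". I have searched the forum without finding anything, but I may have used the wrong terms and missed it, so please bear with me if this has already been asked.
I have two factor variables that I want to merge into a single factor variable. They come from the same survey and both measure education level. The reason there are two variables in the first place is an unfortunate survey structure, but that is beside the point. What matters is that they are mutually exclusive (a respondent can only have a value in one of them).
My data looks like this: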
education    education2
9th grade    <NA>
9th grade    <NA>
<NA>         9th grade
<NA>         10th grade
10th grade   <NA>
11th grade   <NA>
<NA>         9th grade
<NA>         11th grade
<NA>         <NA>
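For anyone who wants to reproduce this, the sample above could be rebuilt roughly as follows. This is only a sketch: the name `df` matches the script, but the helper `lv` and the order of the factor levels are assumptions based on the rows shown.

```r
# Assumed level order; the real survey data may define the levels differently
lv <- c("9th grade", "10th grade", "11th grade")

# Rebuild the nine sample rows as two factor columns
df <- data.frame(
  education  = factor(c("9th grade", "9th grade", NA, NA, "10th grade",
                        "11th grade", NA, NA, NA), levels = lv),
  education2 = factor(c(NA, NA, "9th grade", "10th grade", NA,
                        NA, "9th grade", "11th grade", NA), levels = lv)
)
```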
My script looks like this:
highest.edu <- vector(length=length(df$education))
a.grade <- which(df$education=="9th grade")
a.grade2 <- which(df$education2=="9th grade")
b.grade <- which(df$education=="10th grade")
b.grade2 <- which(df$education2=="10th grade")
c.grade <- which(df$education=="11th grade")
c.grade2 <- which(df$education2=="11th grade")
highest.edu[a.grade] <- as.character(df$education)[a.grade]
highest.edu[a.grade2] <- as.character(df$education2)[a.grade2]
highest.edu[b.grade] <- as.character(df$education)[b.grade]
highest.edu[b.grade2] <- as.character(df$education2)[b.grade2]
highest.edu[c.grade] <- as.character(df$education)[c.grade]
highest.edu[c.grade2] <- as.character(df$education2)[c.grade2]
highest.edu <- factor(highest.edu)
highest.edu[highest.edu == "FALSE"] <- NA
highest.edu <- factor(highest.edu)
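This works well enough, of course, but when you have to do it several times over for pairs of factor variables with 15 levels each, you start looking for a quicker alternative.
I tried something like this, without any luck: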
a.grade <- which(df$education=="9th grade" | df$education2=="9th grade")
b.grade <- which(df$education=="10th grade" | df$education2=="10th grade")
c.grade <- which(df$education=="11th grade" | df$education2=="11th grade")
highest.edu[a.grade] <- as.character(df$education)[a.grade] | as.character(df$education2)[a.grade]
highest.edu[b.grade] <- as.character(df$education)[b.grade] | as.character(df$education2)[b.grade]
This gives me: Error in as.character(df$education)[9th grade] | as.character(df$education2)[9th grade] : operations are possible only for numeric, logical or complex types
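As far as I can tell, the message comes from `|` being R's element-wise logical OR, which is not defined for character vectors. A minimal illustration, using the made-up vectors `x` and `y` rather than the survey data:

```r
# Two hypothetical character vectors standing in for the converted factors
x <- c("9th grade", NA)
y <- c(NA, "10th grade")

# `|` expects logical (or numeric/complex) operands, so this raises an error;
# tryCatch() captures the message instead of stopping the script
res <- tryCatch(x | y, error = function(e) conditionMessage(e))
print(res)
```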
Is there a way around this?
Thanks in advance for any suggestions.
Edit:
What I am aiming for is this:
highest.education
9th grade
9th grade
9th grade
10th grade
10th grade
11th grade
9th grade
11th grade
<NA>
The post "Concatenating factor levels of two columns in R" seems to be aimed at a different result.
Thanks again.
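One way the target column could be produced, sketched under the assumption that the two variables really are mutually exclusive: take the value from `education` where it is present and fall back to `education2` otherwise. The data frame is rebuilt from the sample rows so the snippet runs on its own; `lv`, `edu1`, `edu2` and the level order are illustrative assumptions, not part of the original data.

```r
# Assumed level order; adjust to the real survey levels
lv <- c("9th grade", "10th grade", "11th grade")

# Sample rows rebuilt from the question
df <- data.frame(
  education  = factor(c("9th grade", "9th grade", NA, NA, "10th grade",
                        "11th grade", NA, NA, NA), levels = lv),
  education2 = factor(c(NA, NA, "9th grade", "10th grade", NA,
                        NA, "9th grade", "11th grade", NA), levels = lv)
)

# Convert to character first: ifelse() drops factor attributes, so passing
# the factors directly would yield integer codes instead of labels
edu1 <- as.character(df$education)
edu2 <- as.character(df$education2)

# Use education where available, otherwise education2; rows that are NA in
# both columns stay NA
highest.education <- factor(ifelse(is.na(edu1), edu2, edu1), levels = lv)
print(highest.education)
```

This handles all levels in one step, so it does not grow with the number of levels. It relies on the mutual exclusivity: if a row had values in both columns, the one from `education` would silently win.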