The goal is to collapse/reassign levels as part of cleaning a dataset.
Here is an example:
df <- data.frame(V1 = c("cat", "lion", "cat", "beast", "cat"),
                 V2 = c("nice and grumpy", "angry", "old,but also nice", "empty", "has friends"),
                 stringsAsFactors = FALSE)
> df
     V1                V2
1   cat   nice and grumpy
2  lion             angry
3   cat old,but also nice
4 beast             empty
5   cat       has friends
The level of interest is cat; these are the entries:
parse1 <- df$V1[grepl("cat", df$V1)]
#[1] "cat" "cat" "cat"
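From there, the idea is to search V2 for an attribute, nice. Where that attribute is found, the level cat would be renamed to nice cat. This search finds 2 entries of interest in V2: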
# keep only the cat rows, then pick the V2 entries that mention nice/Nice
df.sub <- subset(df, V1 == "cat", select = V1:V2)
parse2 <- df.sub$V2[grep("[Nn]ice", df.sub$V2)]
#[1] "nice and grumpy" "old,but also nice"
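For what it's worth, the two steps above can probably be folded into a single logical mask over the original df, so the row positions are not lost in a subset. This is only a sketch in base R; the name is_nice_cat is made up for illustration, and the data frame is rebuilt so the snippet runs on its own.

```r
# Rebuild the example data so the snippet is self-contained.
df <- data.frame(
  V1 = c("cat", "lion", "cat", "beast", "cat"),
  V2 = c("nice and grumpy", "angry", "old,but also nice", "empty", "has friends"),
  stringsAsFactors = FALSE
)

# is_nice_cat is a hypothetical helper name:
# TRUE only where V1 is exactly "cat" AND V2 contains nice/Nice.
is_nice_cat <- df$V1 == "cat" & grepl("[Nn]ice", df$V2)

which(is_nice_cat)   # row positions in the original df
df$V2[is_nice_cat]   # the same entries that parse2 holds
```

Because the mask has one element per row of df, it can be used directly for indexing or assignment on df.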
The ideal end result would transform df into:
        V1                V2
1 nice cat   nice and grumpy
2     lion             angry
3 nice cat old,but also nice
4    beast             empty
5      cat       has friends
Any ideas on how to achieve this? Many thanks.
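One possible approach, offered as a minimal sketch rather than a definitive answer: index V1 with a logical condition and assign the new label in place. It assumes base R only and that V1 is a character column, as in the example. The names hit and f are placeholders. The last part shows the extra step that would likely be needed if V1 were a factor, since a new level has to exist before it can be assigned.

```r
# Rebuild the example data so the snippet is self-contained.
df <- data.frame(
  V1 = c("cat", "lion", "cat", "beast", "cat"),
  V2 = c("nice and grumpy", "angry", "old,but also nice", "empty", "has friends"),
  stringsAsFactors = FALSE
)

# Rows where V1 is "cat" and V2 mentions nice/Nice (hit is a placeholder name).
hit <- df$V1 == "cat" & grepl("[Nn]ice", df$V2)

# Character column: plain assignment is enough.
df$V1[hit] <- "nice cat"
df

# Factor variant (assumption: V1 stored as a factor instead).
# The level "nice cat" must be added before values can be set to it.
f <- factor(c("cat", "lion", "cat", "beast", "cat"))
levels(f) <- c(levels(f), "nice cat")
f[hit] <- "nice cat"
f
```

Rows that are cat but do not mention nice (row 5 here) keep the plain cat label, which matches the desired output above.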