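Given a simple string:
library(tm)
t <- "hello world ww ff a wr gj dkjffdkn kuku"
VCorpus(VectorSource(t))
I want to filter out all substrings of length 2 or less. How can I do this with the qdap or tm package? I know I could use a regex, but is there a function that does it?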
With the qdapRegex package, you can do the following:
library(qdapRegex)
rm_nchar_words(t, "1,2")
[1] "hello world dkjffdkn kuku"
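If installing qdapRegex is not an option, the same filtering can be sketched in base R by splitting on whitespace and keeping only the words that are long enough. This is just an illustration, not how rm_nchar_words works internally: drop_short_words and its min_chars argument are hypothetical names, and the sketch assumes words are separated by whitespace only, so punctuation attached to a word counts toward its length.

```r
# Hypothetical helper (not part of qdap, qdapRegex, or tm):
# drop every whitespace-separated word shorter than `min_chars` characters.
drop_short_words <- function(x, min_chars = 3) {
  vapply(
    strsplit(x, "\\s+"),  # one character vector of words per input string
    function(words) {
      paste(words[nchar(words) >= min_chars], collapse = " ")
    },
    character(1)
  )
}

t <- "hello world ww ff a wr gj dkjffdkn kuku"
drop_short_words(t)
# [1] "hello world dkjffdkn kuku"
```

Because the helper is vectorized over x, it can also be applied to a character vector of documents before building a corpus with VectorSource.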