r - Recognizing accented characters when building a custom stop word dictionary in R

Translated from: https://stackoverflow.com/questions/58531643 2019-10-23T22:16:32.403


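I'm building a custom stop word dictionary in R to remove accented characters. I thought a Unicode reference would do it, but it isn't working, and I'm having trouble coming up with a different solution, especially since some of these characters can't be covered by running another language's dictionary.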

Current code:

library(dplyr)     # bind_rows(), data_frame()
library(tidytext)  # assumed source of stop_words

en_custom_stopwords <- bind_rows(
  data_frame(word = c("8217", "8216", "le", "de", "en", "el", "8221", "8220", "los", "039", "se",
                      "aei", "\\\\U+00E4"),
             lexicon = c("custom")),
  stop_words)

A word is found when it is written with regular characters.
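One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: in an R string literal, the last entry in the question is plain text (backslashes followed by `U+00E4`), not the letter ä. R's own Unicode escape is written `"\u00e4"`. The example below uses base R only; the token vector and the variable names are hypothetical and stand in for whatever the tokenizer actually produces.

```r
# The entry as written in the question: literal text, not an accented letter.
literal <- "\\\\U+00E4"

# R's Unicode escape for the same code point (U+00E4, "a" with diaeresis).
escaped <- "\u00e4"

nchar(literal)      # several characters: backslashes, "U", "+", hex digits
nchar(escaped)      # a single character
utf8ToInt(escaped)  # the code point as an integer

# Hypothetical tokens, standing in for the real tokenizer output.
tokens <- c("le", "aei", "\u00e4", "casa")

# Custom stop words, with the accented character written as an escape.
custom_words <- c("le", "aei", "\u00e4")

# Keep only the tokens that are not in the custom list.
kept <- tokens[!tokens %in% custom_words]
print(kept)
```

If this is indeed the cause, replacing the last entry in the `word` vector with `"\u00e4"` (or typing the character directly in a UTF-8 encoded script) should let the join against the stop word table match it like any other entry.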
