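I have looked at various posts about removing special characters in R (for example, Remove all special characters from a string in R?), but none of the strategies worked for my problem.
I have a transcript that I am reading in with qdap's read.transcript(). When I read in the document, lines with special characters come out looking like this: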
If anyone knows how to simply change these special characters (i.e <e1><b8><9d> to e), again please feel free to update!
I have tried:
ATL1$X2 <- gsub("[^0-9A-Za-z///,.?()' ]", "", ATL1$X2)
If anyone knows how to simply change these special characters (i.e e1b89d to e), again please feel free to update
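But this does not remove the special characters (it only strips the angle brackets, leaving e1b89d behind), and it also removes the !
I also tried: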
str_replace_all(ATL1$X2, "[^[:alnum:]]", " ")
If anyone knows how to simply change these special characters i e e1 b8 9d to e again please feel free to update
But this is even worse: it removes all punctuation and still does not solve my problem.
Finally, I also tried:
iconv(ATL1$X2, from = 'UTF-8', to = 'ASCII//TRANSLIT')
If anyone knows how to simply change these special characters (i.e <e1><b8><9d> to e), again please feel free to update!
But nothing changes here either.
In an ideal world, the output would look like this:
If anyone knows how to simply change these special characters (i.e ḝ to e), again please feel free to update!
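So the special characters would be read in as they "should" look. If that is not possible, I would honestly be fine with it just removing the special characters (but not other characters, such as the exclamation mark) so that the line looks like this: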
If anyone knows how to simply change these special characters (i.e to e), again please feel free to update!
Thanks!