When I parse R code containing non-native characters on Windows, the characters seem to be turned into their Unicode escape representations, e.g.
Encoding('ğ')
# [1] "UTF-8"
parse(text="'ğ'")
# expression('<U+011F>')
parse(text="'ğ'", encoding='UTF-8')
# expression('<U+011F>')
deparse(parse(text="'ğ'")[1])
# [1] "expression(\"<U+011F>\")"
eval(parse(text="'ğ'"))
# [1] "<U+011F>"
Since my locale is Simplified Chinese, I can parse code containing Chinese characters without this problem, e.g.
parse(text="'你好'")
# expression('你好')
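My question is: how can I preserve characters such as the letter ğ in this example? Or, at least, how can I "reconstruct" the original characters after the expression has been passed through deparse()?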
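To make the "reconstruct" idea concrete, here is a minimal sketch of what such a post-processing step could look like. It assumes the escapes always have the form <U+XXXX> with 4 to 8 hex digits, as in the output above; the helper name unescape_unicode is hypothetical and not part of base R. It would also rewrite any literal text that merely happens to look like such an escape, so it is a workaround rather than a real fix.

```r
# Hypothetical helper (not part of base R): replace "<U+XXXX>" escapes
# with the UTF-8 characters they stand for.
unescape_unicode <- function(x) {
  # Assumed escape format: "<U+" followed by 4 to 8 hex digits and ">"
  pattern <- "<U\\+([0-9A-Fa-f]{4,8})>"
  vapply(x, function(s) {
    m <- gregexpr(pattern, s)
    esc <- regmatches(s, m)[[1]]
    if (length(esc) == 0L) return(s)  # nothing to replace in this string
    # Extract the hex digits and convert them to code points
    codes <- strtoi(sub(pattern, "\\1", esc), base = 16L)
    # One UTF-8 character per code point, substituted back in place
    regmatches(s, m) <- list(intToUtf8(codes, multiple = TRUE))
    s
  }, character(1), USE.NAMES = FALSE)
}

# Applied to the deparsed text from the example above
fixed <- unescape_unicode("expression(\"<U+011F>\")")
Encoding(fixed)
```

Even if the string is rebuilt correctly, printing it may still show the escape in a console whose locale cannot represent the character, so checking with Encoding() or utf8ToInt() is more reliable than looking at the printed output.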
My session info:
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: i386-w64-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=Chinese (Simplified)_People's Republic of China.936
[2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
[3] LC_MONETARY=Chinese (Simplified)_People's Republic of China.936
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_People's Republic of China.936
attached base packages:
[1] stats graphics grDevices utils datasets methods base