
2 Answers

1

I tried to encode just the JSON part, data$fullname, since that seemed to be the problem. I first tried Encoding(data$fullname) = "UTF-8", which didn't resolve the situation. But then I switched to latin1 and the spreadsheet was written. Thanks for your pushy ideas! :)
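For anyone wondering why the declared encoding can matter, here is a minimal sketch. It assumes the strings actually held latin1 bytes, which is only one possible explanation for what happened with my data. The variable fullname below is a made-up stand-in for data$fullname.

```r
# Hypothetical stand-in for data$fullname: "René" stored as latin1 bytes
# (0xE9 is "é" in latin1; the same letter is two bytes, C3 A9, in UTF-8).
fullname <- rawToChar(as.raw(c(0x52, 0x65, 0x6e, 0xe9)))

# Encoding<- does not change the bytes, it only declares how to read them.
Encoding(fullname) <- "latin1"

# With the correct declaration, R can convert the string to real UTF-8.
utf8_name <- enc2utf8(fullname)
charToRaw(utf8_name)
# [1] 52 65 6e c3 a9
```

Declaring "UTF-8" on bytes that are really latin1 leaves R with an invalid string, which may be why my first attempt did not help.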

answered 2014-08-16T01:08:54.403
0

As discussed above, it would probably be better to use

content(GET(url), as="parsed", encoding="UTF-8")

This takes advantage of the httr package's ability to decode the content for you.

Note that when you see <U+2800> in output, that does not mean that those exact characters appear in the string. That's R's way of escaping Unicode characters, just like it adds extra backslashes to escape other special characters like \r. You are seeing those escapes because of your locale settings. You didn't mention what OS you are on. The Mac will use UTF-8 by default and should try to display those characters. I don't have access to a Windows machine to test what the default is there. They seem to appear as "<U+2800>" when the locale "LC_ALL" is set to "C". For example, with that setting:

Sys.getlocale()
# [1] "C/C/C/C/C/en_US.UTF-8"

x <- "\u2800\u2800\u2800Jenny"
print(x)
# [1] "<U+2800><U+2800><U+2800>Jenny"

So there aren't actually less-than/greater-than symbols or capital U's in the string. That's just how the C locale will display them. If you want to remove non-ASCII characters, you could do

iconv(x, from="UTF-8", to="ASCII", sub="")
# [1] "Jenny"
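If you want to convince yourself that the <U+2800> escapes are only a display artifact, a quick sketch like the following inspects the string directly instead of printing it. It uses only base R and the same example string as above. The name ascii_only is just for illustration.

```r
x <- "\u2800\u2800\u2800Jenny"

# Three U+2800 characters plus "Jenny" is 8 characters, not 29.
nchar(x)
# [1] 8

# The first three code points are 0x2800 (10240 in decimal).
utf8ToInt(x)[1:3]
# [1] 10240 10240 10240

# There is no literal "<" anywhere in the string.
grepl("<", x, fixed = TRUE)
# [1] FALSE

# Dropping everything that cannot be represented in ASCII.
ascii_only <- iconv(x, from = "UTF-8", to = "ASCII", sub = "")
```

Note that sub="" silently discards every non-ASCII character, including accented letters in real names, so it is probably only appropriate when those characters are padding like the ones here.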

Excel may very well be able to handle other types of encoding, but I personally don't know how that's managed with XLConnect.

answered 2014-08-16T06:10:49.073