I have a csv file containing Chinese characters, saved in UTF-8.
电视项目价格 5000
The first row is the header and the second row is the data. In other words, it is a one-by-two vector.
I read the file as follows:
amatrix<-read.table("test.csv",encoding="UTF-8",sep=",",header=T,row.names=NULL,stringsAsFactors=FALSE)
However, the output includes an unknown token in the header, namely XUFEFF
That is the byte order mark (BOM, the character U+FEFF) sometimes found at the start of Unicode text files. I'm guessing you're on Windows, since that's where files most commonly end up with one; some Windows editors write it when saving as UTF-8.
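If you want to confirm that the file really starts with a BOM before changing how you read it, you can look at its first three bytes. In UTF-8 the BOM is encoded as EF BB BF. The sketch below is only an illustration: `has_utf8_bom` is a made-up helper name, and the demo file is a throwaway one with ASCII contents rather than your actual test.csv.

```r
# Hypothetical helper: TRUE if the file begins with the UTF-8 BOM bytes EF BB BF.
has_utf8_bom <- function(path) {
  first <- readBin(path, what = "raw", n = 3)
  identical(first, as.raw(c(0xEF, 0xBB, 0xBF)))
}

# Throwaway demo file (assumed contents): BOM followed by a tiny CSV.
demo <- tempfile(fileext = ".csv")
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)), charToRaw("item,price\nTV,5000\n")), demo)

has_utf8_bom(demo)
```

Calling `has_utf8_bom("test.csv")` on your own file should tell you whether the BOM is what is leaking into the header.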
What you can do is read the file using readLines
and remove the first two characters of the first line.
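# Read the raw lines, marking them as UTF-8
txt <- readLines("test.csv", encoding="UTF-8")
# Drop the first two characters of the header line (the BOM)
txt[1] <- substr(txt[1], 3, nchar(txt[1]))
# Parse the cleaned text as CSV
amatrix <- read.csv(text=txt, ...)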
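Another option, assuming R 3.0.0 or later, is to let the file connection skip the BOM for you by passing `fileEncoding="UTF-8-BOM"`. Note that this is the `fileEncoding` argument, not `encoding`. Here is a minimal sketch on a throwaway file; the file name and its ASCII contents are made up for the illustration and stand in for your test.csv.

```r
# Build a small sample file that starts with a UTF-8 byte order mark (EF BB BF).
# The path and contents are assumptions for this demo, not the original data.
path <- tempfile(fileext = ".csv")
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)), charToRaw("item,price\nTV,5000\n")), path)

# "UTF-8-BOM" decodes the file as UTF-8 and removes a leading BOM if present.
amatrix <- read.csv(path, fileEncoding = "UTF-8-BOM", stringsAsFactors = FALSE)

# The first column name should now be clean, with no BOM residue in front of it.
names(amatrix)
```

For your file that would be something like `read.csv("test.csv", fileEncoding="UTF-8-BOM", ...)`, which avoids having to count characters by hand.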