
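I received a tab-delimited file that opens with the default character set "Unicode". As I understand it, "Unicode" here probably means UTF-16.

When I try to read the file with this command: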

CSV.foreach(file, :col_sep => "\t", :headers => true) do |column|
    puts column[0]
end

I get the following error:

invalid byte sequence in UTF-8

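I know that if I open the file and save it as "UTF-8" it works fine, but I can't open the file by hand and do that every time. How can I get around this error?

Edit:

When passing in encoding: 'UTF-16BE', as stefan requested below, I get: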

invalid byte sequence in UTF-16BE

Maybe I'm passing the wrong encoding option?

Edit 2:

When passing in :encoding => 'ISO-8859-1', I get this error:

Illegal quoting in line 1. (CSV::MalformedCSVError)

Line 1 of my file looks like this:

"Status"    "Internal ID"   "Language"  "Created At"    "Updated At"    "IP Address"    "Location"  "Username"  "GET Variables" "Referrer"  "Number of Saves"   "Weighted Score"    "Completion Time"   "Invite Code"   "Invite Email"  "Invite Name"   "Invite: branchid"  "Invite: lastname"  "Invite: clientname"    "Invite: membershipid"  "Invite: clientid"  "Invite: dateofbirth"   "Invite: membershiptype"    "Invite: branch"    "Invite: unitid"    "Invite: shortname" "Invite: changedatetime"    "Invite: homephone" "Collector" 

I tried passing in a quote_char, but I get the same error. My code now looks like this:

CSV.foreach(file, :col_sep => "\t", :encoding => 'ISO-8859-1', :quote_char => '"', :headers => true) do |column|
    puts column[0]
end
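
For what it's worth, here is a minimal sketch of the kind of thing I am after: read the raw bytes, decide the byte order from the BOM, transcode to UTF-8, and only then hand the text to CSV. It assumes the file really is UTF-16 and starts with a byte order mark, which I have not confirmed. The method name read_utf16_tsv and the UTF-16LE fallback are my own guesses, not anything from the file's producer.

```ruby
require "csv"

# Hypothetical helper: reads a tab-delimited file that is assumed to be UTF-16
# and returns a CSV::Table whose strings are UTF-8.
def read_utf16_tsv(path)
  raw = File.binread(path) # raw bytes, no decoding yet

  # Detect the byte order from the BOM, if one is present.
  encoding =
    if raw.start_with?("\xFF\xFE".b)
      "UTF-16LE"
    elsif raw.start_with?("\xFE\xFF".b)
      "UTF-16BE"
    end

  # Drop the 2-byte BOM when found; otherwise guess UTF-16LE (an assumption).
  body = encoding ? raw.byteslice(2..-1) : raw
  encoding ||= "UTF-16LE"

  # Reinterpret the bytes as UTF-16, then transcode to UTF-8 for CSV.
  text = body.force_encoding(encoding).encode("UTF-8")

  CSV.parse(text, col_sep: "\t", headers: true)
end
```

With something like this, read_utf16_tsv(file).each { |row| puts row[0] } would stand in for the CSV.foreach loop above. It loads the whole file into memory, so it is only a stopgap for files of modest size.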