This is what I was doing:
csv = CSV.open(file_name, "r")
I used this for testing:
line = csv.shift
while not line.nil?
  puts line
  line = csv.shift
end
And I ran into this:
ArgumentError: invalid byte sequence in UTF-8
I read the answer here and tried this:
csv = CSV.open(file_name, "r", encoding: "windows-1251:utf-8")
I ran into the following error:
Encoding::UndefinedConversionError: "\x98" to UTF-8 in conversion from Windows-1251 to UTF-8
Then I came across the charlock_holmes Ruby gem and figured I'd use it to detect the source encoding.
CharlockHolmes::EncodingDetector.detect(File.read(file_name))
=> {:type=>:text, :encoding=>"windows-1252", :confidence=>37, :language=>"fr"}
So I did this:
csv = CSV.open(file_name, "r", encoding: "windows-1252:utf-8")
And still got this:
Encoding::UndefinedConversionError: "\x8F" to UTF-8 in conversion from Windows-1252 to UTF-8
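For reference, here is a minimal sketch that reproduces the last error without the file and shows one possible lossy workaround. The sample bytes and the helper name `to_utf8_lossy` are made up for illustration and are not from my real data. It assumes that replacing unmappable bytes with "?" is acceptable, which may not be true if those bytes carry meaning.

```ruby
# Hypothetical helper (illustration only): transcode raw bytes to UTF-8,
# substituting "?" for any byte the source encoding cannot map.
def to_utf8_lossy(bytes, source_encoding)
  bytes.dup
       .force_encoding(source_encoding)
       .encode("UTF-8", invalid: :replace, undef: :replace, replace: "?")
end

# Made-up sample row: 0xE9 is "é" in Windows-1252, while 0x8F has no
# mapping there (the same byte reported in the error above).
raw = "caf\xE9,\x8F\n".b

# A strict conversion fails just like CSV.open did.
strict_error = nil
begin
  raw.dup.force_encoding("Windows-1252").encode("UTF-8")
rescue Encoding::UndefinedConversionError => e
  strict_error = e.class
end

# The lossy conversion succeeds and yields valid UTF-8.
lossy = to_utf8_lossy(raw, "Windows-1252")
puts lossy
```

If this approach fits, the resulting string could then be handed to `CSV.parse` instead of opening the file with `CSV.open`. It only hides the problem bytes, though, so it does not tell me what the real source encoding is.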