For some reason, my CSV file contains some rows that trigger an "Illegal quoting" error, for example:

1336481227,178.108.171.183,3.2.0,9700132ccc02e12a,c083b5d2-ec92-486f-a5b3-512dba1ce4ae,invoke_action,"{""timestamp"":""2012-05-08 13:47:26""}"

1336481227,178.108.171.183,3.2.0,9700132ccc02e12a,c083b5d2-ec92-486f-a5b3-512dba1ce4ae,invoke_action,{""timestamp"":""2012-05-08 13:47:27""}

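The first line is correct. The last field of the second line, {""timestamp"":""2012-05-08 13:47:27""}, is missing the double quotes that should surround the braces, so when I try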

CSV.foreach(csv_file_path) do |row|
    puts "======================="
    puts row
    puts "======================="
end

I get an error:

=======================
1336481227
178.108.171.183
3.2.0
9700132ccc02e12a
c083b5d2-ec92-486f-a5b3-512dba1ce4ae
invoke_action
{"timestamp":"2012-05-08 13:47:26","a":"b"}
=======================
#<CSV::MalformedCSVError: Illegal quoting in line 2.>

Is there any way to repair a line that has this problem, or to skip it when the error occurs?

Edit: if I try

CSV.foreach(csv_file_path, :quote_char => "\'") do |row|
    puts "======================="
    puts row
    puts "======================="
end

the JSON-formatted value in the first row gets broken, because it is split at the comma:

=======================
1336481227
178.108.171.183
3.2.0
9700132ccc02e12a
c083b5d2-ec92-486f-a5b3-512dba1ce4ae
invoke_action
"{""timestamp"":""2012-05-08 13:47:26""
""a"":""b""}"
=======================
=======================
1336481227
178.108.171.183
3.2.0
9700132ccc02e12a
c083b5d2-ec92-486f-a5b3-512dba1ce4ae
invoke_action
{""timestamp"":""2012-05-08 13:47:27""}
=======================

2 Answers

Try

CSV.foreach(csv_file_path, :quote_char => "\'")
Answered 2012-05-15T05:47:27.150

I think the simplest approach is a double gsub:

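require 'csv'

line = "1336481227,178.108.171.183,3.2.0,9700132ccc02e12a,c083b5d2-ec92-486f-a5b3-512dba1ce4ae,invoke_action,\"{\"\"timestamp\"\":\"\"2012-05-08 13:47:26\"\",\"\"a\"\":\"\"b\"\"}\""
# Mask every doubled quote with a placeholder (assumed not to occur in the
# data) so the parser never sees a quote inside an unquoted field.
masked = line.gsub('""', '%tmp%')
# Parse the masked text, then restore the doubled quotes in each field.
csv = CSV.new(masked).each.map do |row|
  row.map do |value|
    # Empty fields come back as nil, so guard before restoring.
    value.nil? ? value : value.gsub('%tmp%', '""')
  end
end

puts csv.inspect
# => [["1336481227", "178.108.171.183", "3.2.0", "9700132ccc02e12a", "c083b5d2-ec92-486f-a5b3-512dba1ce4ae", "invoke_action", "{\"\"timestamp\"\":\"\"2012-05-08 13:47:26\"\",\"\"a\"\":\"\"b\"\"}"]]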
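
The fields restored this way still carry doubled quotes, so they are not valid JSON yet. If the goal is the JSON payload itself, a possible variation is to restore the placeholder as a single quote and hand the last field to JSON.parse. This is only a sketch: the helper name parse_with_placeholder and the PLACEHOLDER constant are made up for illustration, the placeholder is assumed never to occur in real data, an empty quoted field ("") would be mangled by the masking step, and an unquoted JSON field that contains a comma would still be split into several fields.

```ruby
require 'csv'
require 'json'

# Hypothetical placeholder; assumed never to appear in the real data.
PLACEHOLDER = '%tmp%'

# Mask doubled quotes, parse the line, then restore each one as a single
# quote so that the field becomes ordinary JSON text.
def parse_with_placeholder(line)
  masked = line.gsub('""', PLACEHOLDER)
  CSV.parse_line(masked).map do |value|
    # Empty fields are returned as nil and are passed through unchanged.
    value.nil? ? value : value.gsub(PLACEHOLDER, '"')
  end
end

# The malformed line from the question (outer quotes missing).
bad_line = '1336481227,178.108.171.183,3.2.0,9700132ccc02e12a,' \
           'c083b5d2-ec92-486f-a5b3-512dba1ce4ae,invoke_action,' \
           '{""timestamp"":""2012-05-08 13:47:27""}'

fields  = parse_with_placeholder(bad_line)
payload = JSON.parse(fields.last)
puts payload['timestamp'] # prints 2012-05-08 13:47:27
```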
Answered 2012-05-15T08:14:55.053
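
Neither answer covers the "skip it when the error occurs" part of the question. One way to approach it, sketched here under stated assumptions, is to parse each physical line on its own with CSV.parse_line and rescue CSV::MalformedCSVError, so that one bad line cannot abort the whole run. The helper names quote_trailing_json and parse_tolerantly are hypothetical. The sketch assumes that no field contains an embedded newline and that the JSON payload is always the last field and starts with "{". A line that fails is first retried with its trailing field wrapped in quotes, and is skipped only if that also fails.

```ruby
require 'csv'

# Hypothetical helper: wrap a trailing, unquoted {...} field in double quotes.
# Lines whose last field is already quoted end in }" and are left unchanged.
def quote_trailing_json(line)
  line.chomp.sub(/,(\{.*\})\z/) { %(,"#{$1}") }
end

# Returns [rows, skipped], where skipped holds the 1-based numbers of the
# lines that could be neither parsed nor repaired.
def parse_tolerantly(text)
  rows = []
  skipped = []
  text.each_line.with_index(1) do |line, number|
    next if line.strip.empty?
    begin
      rows << CSV.parse_line(line)
    rescue CSV::MalformedCSVError
      begin
        # Second attempt: quote the trailing JSON field and parse again.
        rows << CSV.parse_line(quote_trailing_json(line))
      rescue CSV::MalformedCSVError
        skipped << number # still unparseable: record it and move on
      end
    end
  end
  [rows, skipped]
end

sample = <<~DATA
  1,invoke_action,"{""timestamp"":""2012-05-08 13:47:26""}"
  1,invoke_action,{""timestamp"":""2012-05-08 13:47:27""}
  1,invoke_action,"unterminated
DATA

rows, skipped = parse_tolerantly(sample)
puts rows.length    # prints 2
puts skipped.inspect # prints [3]
```

To apply this to a file, File.read(csv_file_path) can be passed as the text. Reading line by line gives up the parser's support for quoted fields that span several lines, which is the price of isolating each failure.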