ruby - What is the correct way to parse this line with FasterCSV?

Question

I have the following line in a CSV file, and it is giving me trouble when parsing:

312,'997639',' 2','John, Doe. "J.D." ',' ','2000 ',' ','Street ','City ','NY','99999','','2010-02-17 19:12:04','2010-02-17 19:12:04';

I am parsing with the following parameters:

FasterCSV.foreach(file, {:headers => true, :quote_char => '"', :col_sep => "','"} ) do |row|

However, it blows up on rows like the one above because of the "J.D." in the name column. How can I parse this line correctly with FasterCSV?

Thanks!

score 3 · Accepted Answer

It looks to me like your :quote_char should be ' (a single quote) and your :col_sep should be , (a comma). In that case:

FasterCSV.foreach(file, {:headers => true, :quote_char => "'", :col_sep => ','} ) ...
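As a rough check of those options, here is a minimal sketch that uses Ruby's standard CSV library (FasterCSV became the standard CSV library in Ruby 1.9) on the sample line from the question. Stripping the trailing ";" before parsing is an assumption of this sketch, not something the answer mentions; the variable names are illustrative only.

```ruby
require "csv"

# The sample row from the question, copied verbatim.
line = %q{312,'997639',' 2','John, Doe. "J.D." ',' ','2000 ',' ','Street ','City ','NY','99999','','2010-02-17 19:12:04','2010-02-17 19:12:04';}

# Assumption: the trailing ";" is a record terminator and not part of the
# last field, so it is removed before the row reaches the parser.
cleaned = line.strip.chomp(";")

# Single quote as the quote character, plain comma as the separator.
# The double quotes around J.D. are then ordinary data inside a quoted field.
fields = CSV.parse_line(cleaned, quote_char: "'", col_sep: ",")

puts fields.length
puts fields[3]
```

With these settings the comma inside the name field is protected by the surrounding single quotes, which the original :col_sep of "','" could not do for the unquoted first column.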

score 1 · Accepted Answer

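You can't do that. FasterCSV only lets you choose one quote character, and your application needs two. There is no way to do something cute like passing in a regular expression instead of a character, because FasterCSV precompiles its matchers with the quote character escaped, like so: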

# prebuild Regexps for faster parsing
esc_col_sep = Regexp.escape(@col_sep)
esc_row_sep = Regexp.escape(@row_sep)
esc_quote   = Regexp.escape(@quote_char)
@parsers = {
  :any_field      => Regexp.new( "[^#{esc_col_sep}]+",
                                 Regexp::MULTILINE,
                                 @encoding ),
  :quoted_field   => Regexp.new( "^#{esc_quote}(.*)#{esc_quote}$",
                                 Regexp::MULTILINE,
                                 @encoding ),
  ...
}
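To illustrate the escaping point, the following sketch rebuilds only the :quoted_field pattern from the excerpt in a standalone helper. The helper name quoted_field_matcher is hypothetical and is not part of FasterCSV; the encoding argument from the excerpt is omitted here.

```ruby
# Hypothetical stand-in for the :quoted_field matcher shown in the excerpt.
def quoted_field_matcher(quote_char)
  esc_quote = Regexp.escape(quote_char)
  Regexp.new("^#{esc_quote}(.*)#{esc_quote}$", Regexp::MULTILINE)
end

# With a single quote as the quote character, the double quotes inside
# the field are just part of the captured body.
single = quoted_field_matcher("'")
body = single.match("'John, Doe. \"J.D.\" '")[1]
puts body

# Trying to sneak in a character class as the "quote character" fails:
# Regexp.escape turns the brackets into literal text, so the pattern no
# longer matches a field wrapped in either kind of quote.
klass = quoted_field_matcher("['\"]")
puts klass.source
puts klass.match("'abc'").inspect
```

Because the brackets are escaped, the second matcher only recognises a field literally wrapped in the text `['"]`, which is why a pattern cannot stand in for the quote character.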

score 0 · Accepted Answer

I couldn't get FasterCSV to handle this data the way I needed, so in the end I simply asked for the data to be re-dumped as proper CSV output. Thanks for trying!
