ruby - Modify a series of rows after applying a transform

Question

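Modify a series of rows after applying a transform

I want to write a kiba transform that lets me insert the same information for a specific number of rows. In this case, I have an xls file that contains sub-headers, and each sub-header also contains data, like this: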

Client: John Doe, Code: 1234
qty, date, price
1, 12/12/2017, 300.00
6, 12/12/2017, 300.00
total: 2100
Client: Nick Leitgeb, Code: 2345
qty, date, price
1, 12/12/2017, 100.00
2, 12/12/2017, 200.00
2, 12/12/2017, 50.00
total: 600
Client: …..

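To extract the relevant data, I use the following transform, which returns the rows that match at least one of the two provided regexes (a date or the word "Client"):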

transform SelectByRegexes, regex: [/\d+\/\d+\/\d+/, /Client:/], matches: 1

This gives me the following result:

Client: John Doe, Code: 1234
1, 12/12/2017, 300.00
6, 12/12/2017, 300.00
Client: Nick Leitgeb, Code: 2345
1, 12/12/2017, 100.00
2, 12/12/2017, 200.00
2, 12/12/2017, 50.00
…..

Now that I have the information I want, I need to replicate the client and code on each sub-row, and remove the sub-header:

John Doe, 1234, 1, 12/12/2017, 300.00
John Doe, 1234, 6, 12/12/2017, 300.00
Nick Leitgeb, 2345, 1, 12/12/2017, 100.00
Nick Leitgeb, 2345, 2, 12/12/2017, 200.00
Nick Leitgeb, 2345, 2, 12/12/2017, 50.00

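The only way I can think of is to do this directly in the source or in a pre_process block, but that would require the transform used earlier to produce the necessary data. Is it possible to use a transform class inside the source/pre_process block? Or to manipulate multiple rows in a transform?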

score 3 · Accepted Answer

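Kiba author here! Thanks for using Kiba. You are right that you could achieve this with a specialized source, but I personally prefer to use the following pattern: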

require 'logger'

last_seen_client_row = nil
logger = Logger.new(STDOUT)

transform do |row|
  # detect "Client/Code" rows - pseudo code, adjust as needed
  # (anchored at the start only, so "Client: John Doe" matches)
  if row[0] =~ /\AClient:/
    # this is a top-level header, memorize it
    last_seen_client_row = row
    logger.info "Client boundaries detected for client XXX"
    next # remove the row from pipeline
  else
    # assuming you are working with arrays (I usually prefer Hashes though!);
    # make sure to dupe the data to avoid mutating the memorized row
    last_seen_client_row.dup + row
  end
end

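You could of course turn that block into a more testable class, and I recommend being very strict in your row detection, so that you detect any change in format and fail fast.

Hope this helps!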
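As a rough illustration of the class-based variant, here is a minimal sketch. The class name PropagateClientHeader and the assumed row shape (arrays of strings, with a header row like ["Client: John Doe", "Code: 1234"]) are hypothetical and not taken from the question's actual xls source, so adjust the detection to your data. The only thing Kiba itself needs from a transform class is a process(row) method; returning nil drops the row.

```ruby
# Hypothetical transform class: memorizes the latest "Client/Code" header
# and prepends its values to every following data row.
# Assumption: rows are arrays of strings, and a header row looks like
# ["Client: John Doe", "Code: 1234"].
class PropagateClientHeader
  CLIENT_CELL = /\AClient:\s*(?<name>\S.*)\z/
  CODE_CELL   = /\ACode:\s*(?<code>\d+)\z/

  def initialize
    @client = nil # [name, code] taken from the most recent header row
  end

  def process(row)
    cells = row.map { |cell| cell.to_s.strip }

    if cells[0].to_s.start_with?('Client:')
      name = CLIENT_CELL.match(cells[0])
      code = CODE_CELL.match(cells[1].to_s)
      # Strict detection: fail fast if the header format has changed.
      unless name && code && cells.size == 2
        raise ArgumentError, "Unexpected header format: #{row.inspect}"
      end
      @client = [name[:name], code[:code]]
      nil # returning nil removes the header row from the pipeline
    else
      # A data row with no preceding header means the file layout changed.
      raise ArgumentError, "Data row before any client header: #{row.inspect}" if @client.nil?
      @client + cells # Array#+ builds a new array, so @client is not mutated
    end
  end
end
```

In a Kiba job this would be declared with "transform PropagateClientHeader", placed after the regex-based selection step. Because the class keeps its state in an instance variable, it can be unit-tested by calling process directly, without running a full pipeline.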
