linux - Extract modified and added rows from two csv files using shell or the diff command

Question

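I have two csv files, F1 and F2, whose rows are in the same order. I want to extract the changed/added rows in F2 by comparing F1 and F2.

I tried the diff command, and with it I can see the changes. How can I read that pattern and extract the lines from F2?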

F1 (file 1):

1234,Joe,pieter,joe@gmail.com,male,22
1235,Shally,Jonse,shally@yahoo.com,female,24
1235,Harry,poter,harry@gmail.com,male,21
1235,Helen,Jairag,helen@gmail.com,female,21
2585,Dinesh,Jairag,helen@gmail.com,female,21

F2 (file 2):

1234,Joe,pieter,joe@gmail.com,male,22
1235,Shally,Jonse,shally@yahoo.com,female,24
1235,Harry,Potter,harry@gmail.com,male,21
1235,Helen,Jairag,helen@gmail.com,female,21

Command executed:

diff F2 F1

Output:

3c3
< 1235,Harry,Potter,harry@gmail.com,male,21
---
> 1235,Harry,poter,harry@gmail.com,male,21
4a5
> 2585,Dinesh,Jairag,helen@gmail.com,female,21

Expected output in file F3:

1235,Harry,poter,harry@gmail.com,male,21
2585,Dinesh,Jairag,helen@gmail.com,female,21

score 3 · Accepted Answer

diff --changed-group-format='%<' --unchanged-group-format='' file1 file2
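
As a rough illustration of how the group formats behave, the sketch below rebuilds the question's sample data in a scratch directory and writes the result to F3. It assumes GNU diff (the group-format options are GNU extensions); the workdir variable is only illustrative. The old and new group formats are spelled out here for clarity rather than left at their defaults.

```shell
#!/bin/sh
# Scratch directory holding the sample files from the question (illustrative name).
workdir=$(mktemp -d)

cat > "$workdir/F1" <<'EOF'
1234,Joe,pieter,joe@gmail.com,male,22
1235,Shally,Jonse,shally@yahoo.com,female,24
1235,Harry,poter,harry@gmail.com,male,21
1235,Helen,Jairag,helen@gmail.com,female,21
2585,Dinesh,Jairag,helen@gmail.com,female,21
EOF

cat > "$workdir/F2" <<'EOF'
1234,Joe,pieter,joe@gmail.com,male,22
1235,Shally,Jonse,shally@yahoo.com,female,24
1235,Harry,Potter,harry@gmail.com,male,21
1235,Helen,Jairag,helen@gmail.com,female,21
EOF

# %< prints the first file's lines of a group; an empty format prints nothing.
# changed group: lines differ in both files  -> print the first file's version
# old group:     lines only in the first file -> print them
# new group:     lines only in the second file -> print nothing
# diff exits with status 1 when the files differ, hence "|| true".
diff --changed-group-format='%<' \
     --old-group-format='%<' \
     --new-group-format='' \
     --unchanged-group-format='' \
     "$workdir/F1" "$workdir/F2" > "$workdir/F3" || true

cat "$workdir/F3"
```

With this argument order the output contains the F1 versions of the differing rows, which matches the expected F3 from the question. Swapping the two file arguments would print the F2 versions instead.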

answered 2012-08-22T09:29:41.840

score 1 · Accepted Answer

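As I understand it, you want to extract the changed/added lines from File2.
In your example, File2 contains only one changed line and no added lines.
The basic calling pattern of diff is "diff old new", and its output tells you what you need to do to old to bring it up to date. So, to see what is different in File2, pass it as the second argument. I suggest using the -u option, which marks every line of File2 that would need to be changed in or added to File1 with a + in the first position: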

diff -u File1 File2

gives

--- File1   2012-08-22 11:30:07.000000000 +0200
+++ File2   2012-08-22 11:30:25.000000000 +0200
@@ -1,5 +1,4 @@
 1234,Joe,pieter,joe@gmail.com,male,22
 1235,Shally,Jonse,shally@yahoo.com,female,24
-1235,Harry,poter,harry@gmail.com,male,21
+1235,Harry,Potter,harry@gmail.com,male,21
 1235,Helen,Jairag,helen@gmail.com,female,21
-2585,Dinesh,Jairag,helen@gmail.com,female,21

Now keep only the lines that start with +, skipping the first two (header) lines:

diff -u File1 File2 | \
  awk 'NR > 2 && $0 ~ /^\+/ {print substr($0,2)}'

1235,Harry,Potter,harry@gmail.com,male,21

Or the other way around:

diff -u File2 File1 | \
  awk 'NR > 2 && $0 ~ /^\+/ {print substr($0,2)}'

1235,Harry,poter,harry@gmail.com,male,21
2585,Dinesh,Jairag,helen@gmail.com,female,21
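
The same rows can also be pulled straight out of the normal (non-unified) diff output shown in the question, where lines taken from the second file are prefixed with "> ". The following is a minimal sketch of that idea, not part of either answer; it recreates the question's sample data in a scratch directory, and the workdir variable is only illustrative.

```shell
#!/bin/sh
# Scratch directory holding the sample files from the question (illustrative name).
workdir=$(mktemp -d)

cat > "$workdir/F1" <<'EOF'
1234,Joe,pieter,joe@gmail.com,male,22
1235,Shally,Jonse,shally@yahoo.com,female,24
1235,Harry,poter,harry@gmail.com,male,21
1235,Helen,Jairag,helen@gmail.com,female,21
2585,Dinesh,Jairag,helen@gmail.com,female,21
EOF

cat > "$workdir/F2" <<'EOF'
1234,Joe,pieter,joe@gmail.com,male,22
1235,Shally,Jonse,shally@yahoo.com,female,24
1235,Harry,Potter,harry@gmail.com,male,21
1235,Helen,Jairag,helen@gmail.com,female,21
EOF

# Normal diff output prefixes lines of the first file with "< " and lines
# of the second file with "> ". Hunk headers such as "3c3" and the "---"
# separator carry neither prefix, so sed skips them.
# -n suppresses default printing; s/^> //p strips the prefix and prints
# only the lines where the substitution matched.
diff "$workdir/F2" "$workdir/F1" | sed -n 's/^> //p' > "$workdir/F3"

cat "$workdir/F3"
```

Because F1 is the second argument here, F3 receives the F1 versions of the differing rows, as in the question's expected output. Using 's/^< //p' instead would select the rows from the first argument.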
