shell - Delete lines before and after a match in bash (with sed or awk)?

Question

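I'm trying to delete the two lines on either side of a pattern match from a file full of transactions, i.e. find the match, delete the two lines before it, delete the two lines after it, and then delete the match itself. The result should be written back to the original file.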

So the input data is

D28/10/2011
T-3.48
PINITIAL BALANCE
M
^

and my command is

sed -i '/PINITIAL BALANCE/,+2d' test.txt

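However, this only deletes the matching line and the two lines after it. I can't work out any logical way to delete all 5 lines of data from the original file using sed.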

score 8 · Accepted Answer

An awk one-liner can do the job:

awk '/PINITIAL BALANCE/{for(x=NR-2;x<=NR+2;x++)d[x];}{a[NR]=$0}END{for(i=1;i<=NR;i++)if(!(i in d))print a[i]}' file

Test:

kent$  cat file
######
foo
D28/10/2011
T-3.48
PINITIAL BALANCE
M
x
bar
######
this line will be kept
here
comes
PINITIAL BALANCE
again
blah
this line will be kept too
########

kent$  awk '/PINITIAL BALANCE/{for(x=NR-2;x<=NR+2;x++)d[x];}{a[NR]=$0}END{for(i=1;i<=NR;i++)if(!(i in d))print a[i]}' file
######
foo
bar
######
this line will be kept
this line will be kept too
########

Adding some explanation:

  awk '/PINITIAL BALANCE/{for(x=NR-2;x<=NR+2;x++)d[x];}   # on a match, record line numbers NR-2..NR+2 as keys of array "d"
      {a[NR]=$0}                                          # save every line in array "a", indexed by line number
      END{for(i=1;i<=NR;i++)if(!(i in d))print a[i]}      # finally, print only the lines whose number is not in "d"
  ' file  # your input file
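
The question also asks for the result to be written back to the original file, which the one-liner does not do by itself. Here is a minimal sketch, assuming a temporary file is acceptable; the file name transactions.txt and the sample contents are hypothetical and exist only for this demo:

```shell
# Build a small demo file (hypothetical name and contents).
workdir=$(mktemp -d)
f="$workdir/transactions.txt"
printf '%s\n' 'keep 1' 'D28/10/2011' 'T-3.48' 'PINITIAL BALANCE' 'M' '^' 'keep 2' > "$f"

# Filter into a temporary file, then replace the original only if awk succeeded.
tmp=$(mktemp "$workdir/out.XXXXXX")
awk '/PINITIAL BALANCE/{for(x=NR-2;x<=NR+2;x++)d[x]}
     {a[NR]=$0}
     END{for(i=1;i<=NR;i++)if(!(i in d))print a[i]}' "$f" > "$tmp" &&
  mv "$tmp" "$f"

cat "$f"
```

Redirecting straight to the input file (awk ... file > file) would truncate it before awk reads it, hence the temporary file. Note that mv replaces the file, so it ends up with the temporary file's permissions; cat "$tmp" > "$f" is an alternative if the original file must stay in place.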

score 5 · Accepted Answer

sed can do it:

sed '/\n/!N;/\n.*\n/!N;/\n.*\n.*PINITIAL BALANCE/{$d;N;N;d};P;D'

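It works like this:

If sed has only one line in the pattern space, it appends another one.
If there are only two, it appends a third.
If the pattern space matches LINE + LINE + LINE with BALANCE, it appends the two following lines, deletes everything, and starts over.
If not, it prints the first line of the pattern space, deletes that line, and starts over without clearing the rest of the pattern space.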
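
The steps above can be traced on a small input. This is only a sketch: the 7-line sample is hypothetical, and the end-of-input behavior noted in the comments assumes GNU sed.

```shell
# Hypothetical sample with the match on line 4.
sample=$(printf '%s\n' 'keep 1' 'D28/10/2011' 'T-3.48' 'PINITIAL BALANCE' 'M' '^' 'keep 2')

# Pattern space over time (\n marks an embedded newline):
#   keep 1\nD28/10/2011\nT-3.48            no match: P prints "keep 1", D removes it
#   D28/10/2011\nT-3.48\nPINITIAL BALANCE  match: N;N appends "M" and "^", d deletes all five lines
#   keep 2                                 N finds no more input; GNU sed prints the pattern space and exits
result=$(printf '%s\n' "$sample" |
  sed '/\n/!N;/\n.*\n/!N;/\n.*\n.*PINITIAL BALANCE/{$d;N;N;d};P;D')

printf '%s\n' "$result"
```

Only the two "keep" lines should remain. A sed that follows POSIX strictly may quit at the final N without printing, so the last lines of a file could be lost there.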

To handle the pattern appearing on the very first line, modify the script:

sed '1{/PINITIAL BALANCE/{N;N;d}};/\n/!N;/\n.*\n/!N;/\n.*\n.*PINITIAL BALANCE/{$d;N;N;d};P;D'

However, it fails if another PINITIAL BALANCE occurs within the lines that are about to be deleted. Then again, the other solutions fail there too =)

score 2 · Accepted Answer

For a task like this, I would probably reach for a more advanced tool like Perl:

perl -ne 'push @x, $_;
          if (@x > 4) {
              if ($x[2] =~ /PINITIAL BALANCE/) { undef @x }
                  else { print shift @x }
          }
          END { print @x }' input-file > output-file

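This removes 5 lines from the input file: the 2 lines before the match, the matched line, and the 2 lines after it. You can change the total number of lines removed by modifying @x > 4 (which removes 5 lines), and change which line is tested by modifying $x[2] (which tests the third line in the buffer, so the two lines before the match are removed as well).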

score 2 · Accepted Answer

A simpler, easier-to-understand solution might be:

awk '/PINITIAL BALANCE/ {print NR-2 "," NR+2 "d"}' input_filename \
    | sed -f - input_filename > output_filename

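awk is used to generate a sed script that deletes the offending lines, and sed writes the result to output_filename.

This uses two processes, so it may be less efficient than the other answers.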
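
It can help to look at the intermediate sed script. The sketch below uses a hypothetical 11-line sample and makes two changes that are assumptions of mine, not part of the answer: the start address is clamped to 1 (a start below 1 is not a valid sed line address, which matters when a match falls on line 1 or 2), and the script is passed with -e rather than read from standard input.

```shell
# Hypothetical sample: matches on line 2 and line 8.
workdir=$(mktemp -d)
f="$workdir/input.txt"
printf '%s\n' 'drop a' 'PINITIAL BALANCE' 'drop b' 'drop c' 'keep 1' \
              'drop d' 'drop e' 'PINITIAL BALANCE' 'drop f' 'drop g' 'keep 2' > "$f"

# Step 1: awk emits one "start,end d" command per match, with start clamped to 1.
script=$(awk '/PINITIAL BALANCE/ {s = NR-2; if (s < 1) s = 1; print s "," NR+2 "d"}' "$f")
printf 'generated sed script:\n%s\n' "$script"

# Step 2: sed applies the generated commands to the same file.
result=$(sed -e "$script" "$f")
printf 'filtered output:\n%s\n' "$result"
```

For this sample the generated script is the two commands 1,4d and 6,10d, which leaves only the two "keep" lines.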

score 1 · Accepted Answer

This might work for you (GNU sed):

sed ':a;$q;N;s/\n/&/2;Ta;/\nPINITIAL BALANCE$/!{P;D};$q;N;$q;N;d' file

score 0 · Accepted Answer

Save this code to a file named grep.sed

H
s:.*::
x
s:^\n::
:r
/PINITIAL BALANCE/ {
    N
    N
    d    
}

/.*\n.*\n/ {
    P
    D
}
x
d

and run a command like this:

sed -i -f grep.sed FILE

Alternatively, you can use it as a one-liner:

sed -i 'H;s:.*::;x;s:^\n::;:r;/PINITIAL BALANCE/{N;N;d;};/.*\n.*\n/{P;D;};x;d' FILE
