regex - Use grep to match and erase a pattern and its previous line in a large chunk of text

Question

I have a very large text file which contains data similar to the following:

he/PRP have/VBD obtain/VBN the/DT ##archbishopric/NN## against/IN the/DT monk/NNS of/IN the/DT

craft/NN ,/Fc he/PRP obtain/VBD the/DT ##archbishopric/NN## of/IN besancon/NP ;/Fx and/CC have/VBD it/PRP in/IN
======>match found: \#\#\sof\/IN

succeed/VBN to/TO the/DT ##archbishopric/NN## ./Fp

klutzy/NN little/JJ ##scene/NN## where/WRB 1/Z brave/JJ french/JJ man/NN refuse/VBZ to/TO sit/VB down/RP for/IN fear/NN of/IN be/VBG discover/VBN ./Fp
======>match found: \#\#\swhere\/WRB\s

I would like to use grep to match and erase every line of text that is immediately followed, on the next line, by a line starting with ======>match found: (and to erase that marker line as well), as in:

craft/NN ,/Fc he/PRP obtain/VBD the/DT ##archbishopric/NN## of/IN besancon/NP ;/Fx and/CC have/VBD it/PRP in/IN
======>match found: \#\#\sof\/IN

and end with a newline character.

Thus, for the example above, I'd like to run grep and obtain the following output:

he/PRP have/VBD obtain/VBN the/DT ##archbishopric/NN## against/IN the/DT monk/NNS of/IN the/DT

succeed/VBN to/TO the/DT ##archbishopric/NN## ./Fp

I have already tried: grep -E -v '^.+\n======>match found:.+$' file.txt

as suggested here, by adding the regex .+*\n to the pattern to include the previous line, but it is not working. Any suggestions?

score 1 · Accepted Answer

This sed command comes close to what you want:

$ sed -n 'N;/\n======>match found:/d; P;D' textfile 
he/PRP have/VBD obtain/VBN the/DT ##archbishopric/NN## against/IN the/DT monk/NNS of/IN the/DT


succeed/VBN to/TO the/DT ##archbishopric/NN## ./Fp
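
For readers unfamiliar with N, P and D, here is a hedged sketch of how the same two-line sliding window can be wrapped in a shell function. The function name drop_flagged_sed is made up for illustration. It also swaps -n 'N;...' for '$!N;...', which is a variation and not the command above: with -n, sed quits without printing when N runs on the last line, so a final line that is not followed by a marker would be lost. The sample output above is unaffected because that input ends with a marker line.

```shell
# drop_flagged_sed [FILE...]: hypothetical helper, not part of the original answer.
# Removes every "======>match found:" line together with the line just before it.
drop_flagged_sed() {
  # $!N  append the next input line to the pattern space, except on the last line
  # /\n======>match found:/d  the second line is a marker: drop both lines
  # P    otherwise print only the first line of the pattern space
  # D    delete that first line and restart with the remaining line kept
  sed '$!N; /\n======>match found:/d; P; D' "$@"
}

# Example call (file name taken from the question):
#   drop_flagged_sed file.txt > cleaned.txt
```

As with the original command, the blank lines that surrounded a removed pair stay in the output.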

score 0 · Accepted Answer

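Multi-line grepping is complicated by the fact that traditional grep implementations only consider one line at a time, so adding \n to your pattern makes no sense.

If you have pcregrep available, multi-line matching can be done with the -M flag: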

pcregrep -Mv '^.+\n======>match found:.+$'

Output:

he/PRP have/VBD obtain/VBN the/DT ##archbishopric/NN## against/IN the/DT monk/NNS of/IN the/DT


succeed/VBN to/TO the/DT ##archbishopric/NN## ./Fp
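
If pcregrep is not installed, the same line-plus-marker removal can probably be done with awk by holding back one line at a time. This is only a sketch under that assumption, not something from the answer above; the function name drop_flagged_awk and the variable names prev and have are invented for the example.

```shell
# drop_flagged_awk [FILE...]: hypothetical helper, not part of the original answer.
# Buffers one line; a marker line discards the buffered line and itself.
drop_flagged_awk() {
  awk '
    /^======>match found:/ { have = 0; next }  # marker: forget the buffered line, skip the marker
    have { print prev }                        # previous line was not followed by a marker
    { prev = $0; have = 1 }                    # buffer the current line
    END { if (have) print prev }               # flush the last buffered line
  ' "$@"
}

# Example call (file name taken from the question):
#   drop_flagged_awk file.txt > cleaned.txt
```

Because only one line is ever held in memory, this should behave the same on a very large file as on the small sample. Blank lines around a removed pair are kept, as in the output above.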
