regex - Using sed to replace a regex match and the line before it with "-=+ REMOVED +=-"

Question

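I have a large log, and by large I mean over 1.9 million lines. I need a regex that replaces every line that does not contain the word "Never", together with the line before it, with -=+ REMOVED +=-. Below is a sample from the log.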

2013-09-17-01:02:43 User: ID_123456@some.tld  
2013-09-17-01:02:43 Last login time: Never  
2013-09-17-01:02:43 User: ID_123458@some.tld  
2013-09-17-01:02:43 Last login time: 2013-09-16  
2013-09-17-01:02:43 User: ID_123423@some.tld  
2013-09-17-01:02:43 Last login time: 2013-09-15

So if a user has a login time, remove that line and the line before it with the email address. The final output should look like

2013-09-17-01:02:43 User: ID_123456@some.tld  
2013-09-17-01:02:43 Last login time: Never  
-=+ REMOVED +=-  
-=+ REMOVED +=-  
-=+ REMOVED +=-  
-=+ REMOVED +=-

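This should be easy, but I have been racking my brain over it for the past hour.

I would prefer to use sed, since I want to learn more about it, but I am open to anything...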

score 5 · Accepted Answer


This might work for you (GNU sed):

 sed '$!N;/\n.*Never/!s/.*/-=+ REMOVED +=-/mg'  file
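
The answer gives no breakdown, so here is one possible reading of the one-liner, written as a commented multi-line script. It is a sketch that assumes GNU sed (the M flag on s is a GNU extension); the temporary file and the variable names log and sed_out are made up for the demonstration.

```shell
# Build a small sample from the data in the question (temporary file, name is arbitrary).
log=$(mktemp)
printf '%s\n' \
  '2013-09-17-01:02:43 User: ID_123456@some.tld' \
  '2013-09-17-01:02:43 Last login time: Never' \
  '2013-09-17-01:02:43 User: ID_123458@some.tld' \
  '2013-09-17-01:02:43 Last login time: 2013-09-16' > "$log"

# Same commands as the one-liner, one per line.
sed_out=$(sed '
  # Unless this is the last line, append the next input line to the pattern space.
  $!N
  # If the second line of the pair lacks "Never", replace each line of the pair.
  # With the M flag, .* stops at the embedded newline, so g rewrites both lines.
  /\n.*Never/!s/.*/-=+ REMOVED +=-/mg
' "$log")

printf '%s\n' "$sed_out"
rm -f "$log"
```

Because N consumes the lines two at a time, this relies on the log strictly alternating User and Last login lines, as in the sample.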

answered 2013-09-17T14:47:53.007

score 1 · Accepted Answer

This can do it:

$ rm="-=+ REMOVED +=-"
$ awk -v rm="$rm" 'BEGIN{OFS="\n"}NR%2{a=$0; next} $0~/Never/ {print a,$0; next}{print rm,rm}' a
2013-09-17-01:02:43 User: ID_123456@some.tld  
2013-09-17-01:02:43 Last login time: Never  
-=+ REMOVED +=-
-=+ REMOVED +=-
-=+ REMOVED +=-
-=+ REMOVED +=-

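Explanation

-v rm="$rm" passes the "removed" text into awk as the variable rm.
BEGIN{OFS="\n"} sets the output field separator to a newline, so that print a,$0 emits two lines.
NR%2{a=$0; next} on an odd-numbered line, stores the line in the variable a and moves on to the next line.
$0~/Never/ {print a,$0; next} in case the line contains "Never", prints the previous line (stored in a) and the current one.
{print rm,rm} otherwise, prints the "removed" text twice.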

score 0 · Accepted Answer

Another awk

awk '/User:/ {u=$0} /Last/ {if (/Never/) {print u"\n"$0} else {print v"\n"v}}' v="-=+ REMOVED +=-" file
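
For readers who want to see how this one behaves, here is a commented sketch of the same logic run against the sample from the question. It passes v with -v instead of as a trailing assignment, and the temporary file and the names log and awk_out are made up for the demonstration.

```shell
# Build the six-line sample from the question (temporary file, name is arbitrary).
log=$(mktemp)
printf '%s\n' \
  '2013-09-17-01:02:43 User: ID_123456@some.tld' \
  '2013-09-17-01:02:43 Last login time: Never' \
  '2013-09-17-01:02:43 User: ID_123458@some.tld' \
  '2013-09-17-01:02:43 Last login time: 2013-09-16' \
  '2013-09-17-01:02:43 User: ID_123423@some.tld' \
  '2013-09-17-01:02:43 Last login time: 2013-09-15' > "$log"

awk_out=$(awk -v v='-=+ REMOVED +=-' '
  # Remember the most recent "User:" line.
  /User:/ { u = $0 }
  # On each "Last login" line, decide what to print for the pair.
  /Last/ {
    if (/Never/) { print u "\n" $0 }   # keep the user line and the login line
    else         { print v "\n" v }    # mask both lines of the pair
  }
' "$log")

printf '%s\n' "$awk_out"
rm -f "$log"
```

This version keys on the line content rather than on odd and even line numbers, so it does not depend on strict alternation. Note that a line matching neither User: nor Last is silently dropped, since nothing prints it.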
