linux - Extract multiple lines only if all patterns match in the same order

Question

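I am facing a difficulty similar to the one asked here.

My Linux log file (sample log file) contains the entries below. I want to grep the lines 'Total Action Failed :' and 'Total Action Processed:' only if these two lines come after a line containing the string ' > Processing file: R'.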

INF----BusinessLog:08/06/19 20:44:33 > Processing file:  R1111111.R222222222.TEST0107, and creates the reports.
Line2
Line3
Line4
INF----BusinessLog:08/06/19 20:44:33 > Data
    =========
    Overview:
        Total Action          : 100
        Total Action Failed   : 0
        Total Action Processed: 100

INF----BusinessLog:08/06/19 20:44:35 > Processing file:  R333333333.R222222222.TEST0107, and creates the reports.
Line2
Line3
Line4
INF----BusinessLog:08/06/19 20:44:35 > Data
    =========
    Overview:
        Total Action          : 50
        Total Action Failed   : 0
        Total Action Processed: 50

I tried the pcregrep solution given in the earlier question, as follows:

/opt/pdag/bin/pcregrep -M  '> Processing file:  R.*(\n|.)*Total Action Failed   :.*(\n|.)*Total Action Processed:' "$log_path/LogFile.log"

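I have the following two problems:

(1) The command above returns every line that lies between the pattern lines, which is not what I need.

(2) If the log file contains an entry with ( > Processing file: Z) instead of ( > Processing file: R), as in the sample below, the pcregrep command above does not give accurate results.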

INF----BusinessLog:08/06/19 20:44:33 > Processing file:  R1111111.R222222222.TEST0107, and creates the reports.
Line2
Line3
Line4
INF----BusinessLog:08/06/19 20:44:33 > Data
    =========
    Overview:
        Total Action          : 100
        Total Action Failed   : 0
        Total Action Processed: 100

INF----BusinessLog:08/06/19 20:44:35 > Processing file:  Z333333333.R222222222.TEST0107, and creates the reports.
Line2
Line3
Line4
INF----BusinessLog:08/06/19 20:44:35 > Data
    =========
    Overview:
        Total Action          : 50
        Total Action Failed   : 0
        Total Action Processed: 50
INF----BusinessLog:08/06/19 20:44:45 > Processing file:  R555555555.R222222222.TEST0107, and creates the reports.
Line2
Line3
Line4
INF----BusinessLog:08/06/19 20:44:54 > Data
    =========
    Overview:
        Total Action          : 300
        Total Action Failed   : 45
        Total Action Processed: 300

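Can someone help me find a solution to this problem?

When all the patterns match in the same order, I need only the three lines shown below. Also, the number of lines between the first pattern '> Processing file: R' and the second pattern 'Total Action Failed :' varies; it is not always 3 lines.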

INF----BusinessLog:08/06/19 20:44:33 > Processing file:  R1111111.R222222222.TEST0107, and creates the reports.
        Total Action Failed   : 0
        Total Action Processed: 50
INF----BusinessLog:08/06/19 20:44:45 > Processing file:  R555555555.R222222222.TEST0107
            Total Action Failed   : 45
            Total Action Processed: 300

score 1 · Accepted Answer

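I think you are getting hung up on trying to create a regular expression that satisfies your requirements, when all you really want to do is print the first line and the last two lines of each block that includes '> Processing file: R'. Given that, using any awk in any shell on every UNIX box: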

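$ awk -v OFS='\n' '
    # New R header: flush the previous R block, then remember this header
    /> Processing file:[[:space:]]*R/ { if (h) print h, y, z; h=$0 }
    # Keep a rolling window of the last two non-empty lines
    NF { y=z; z=$0 }
    # Flush the final R block
    END { print h, y, z }
' file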
INF----BusinessLog:08/06/19 20:44:33 > Processing file:  R1111111.R222222222.TEST0107, and creates the reports.
        Total Action Failed   : 0
        Total Action Processed: 50
INF----BusinessLog:08/06/19 20:44:45 > Processing file:  R555555555.R222222222.TEST0107, and creates the reports.
        Total Action Failed   : 45
        Total Action Processed: 300

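If that is not what you want, then update your question to clarify your requirements and provide an example for which the above does not work, and we can post a simple, portable awk solution for that instead.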
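
One such case is worth spelling out. The script above prints the last two non-empty lines seen before the next R header, so in the second sample the totals shown under R1111111 (0 and 50) actually come from the intervening Z block. If the totals should come from the R block itself, a minimal sketch could look like the following. It assumes that every block starts with a '> Processing file:' header and contains its 'Total Action Failed' line before its 'Total Action Processed:' line. The function name extract_r_totals is hypothetical, chosen only for illustration.

```shell
# extract_r_totals is a hypothetical helper name, not part of any tool.
# Usage: extract_r_totals /path/to/LogFile.log
extract_r_totals() {
    awk '
        # Every "Processing file:" header starts a new block; keep it only for R files.
        /> Processing file:/ {
            h = ($0 ~ /> Processing file:[[:space:]]*R/) ? $0 : ""
            f = ""
            next
        }
        # Inside an R block, remember the "Failed" line.
        h != "" && /Total Action Failed[[:space:]]*:/ { f = $0; next }
        # Print the three lines once "Processed" follows "Failed" in the same block.
        h != "" && f != "" && /Total Action Processed:/ {
            print h; print f; print $0
            h = ""
        }
    ' "$1"
}
```

Because a Z header clears the remembered header, lines from Z blocks are never printed, and an R block that lacks either total line prints nothing. Run against the second sample, this variant should report 'Total Action Processed: 100' for the R1111111 block rather than 50.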
