
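How can I use a regular expression to extract the program name from a syslog message? I have a Java stream-processing module that accepts regular expressions for processing syslog messages.

The log lines may look like: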

2013-10-14T22:05:29+00:00 hostname sshd[6359]: Connection closed by 192.168.1.10
2013-10-14T22:05:29+00:00 hostname sshd:3322 Connection closed by 192.168.1.10
2013-10-14T22:05:29+00:00 hostname sshd/6359 Connection closed by 192.168.1.10
2013-10-14T22:05:29+00:00 hostname sshd Connection closed by 192.168.1.10
2013-10-14T22:05:29+00:00 hostname SSHD[1133] Connection closed by 192.168.1.10
2013-10-14T22:05:29+00:00 hostname SSH.D[6359]: Connection closed by 192.168.1.10

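The extraction should work as follows: take the third space-delimited substring, then extract the leading part that ends at [, :, / or a space.

So for the first four log samples the extracted string would be sshd, for the fifth SSHD, and for the sixth SSH.D. Can this be done with a regular expression?

Edit:

What I tried is ((?:[A-Za-z][A-Za-z0-9_.-]+)) and it seems to work, but honestly I adapted a sample regex and tweaked it with an online tool until it fit my use case, and I'm not sure how it works.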

4 Answers

A double split should do the job:

String token = data.split(" +")[2].split("[\\[:/]")[0];
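
To make the one-liner easier to try out, here is a minimal sketch that wraps it in a helper and runs it against two of the sample lines from the question. The class name DoubleSplitDemo and the method name programName are made up for illustration; only String.split is used.

```java
public class DoubleSplitDemo {
    // Hypothetical helper: take the third space-separated field,
    // then keep everything before the first '[', ':' or '/'.
    static String programName(String line) {
        return line.split(" +")[2].split("[\\[:/]")[0];
    }

    public static void main(String[] args) {
        String[] lines = {
            "2013-10-14T22:05:29+00:00 hostname sshd[6359]: Connection closed by 192.168.1.10",
            "2013-10-14T22:05:29+00:00 hostname SSH.D[6359]: Connection closed by 192.168.1.10"
        };
        for (String line : lines) {
            System.out.println(programName(line));
        }
    }
}
```

Note that this assumes every line has at least three fields; a shorter line would throw ArrayIndexOutOfBoundsException.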
answered 2013-10-14T22:24:20.113

Try something like this:

String str = line.split(" ")[2].replaceAll("^(.+?)[\\[:/].*$", "$1");

Untested.
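
With a replaceAll approach, it matters whether the first group is greedy or reluctant. The sketch below compares the two on a single field. The class and method names are made up for illustration, and the input "sshd:33/22" is a hypothetical field that does not appear in the question; it is there only because it contains two delimiters.

```java
public class ReplaceAllDemo {
    // Greedy first group: backtracks from the end, so it cuts at the LAST
    // delimiter that still has at least one character after it.
    static String greedy(String field) {
        return field.replaceAll("(.+)(\\[|\\:|\\/).+", "$1");
    }

    // Reluctant first group: cuts at the FIRST delimiter, and ".*" also
    // allows the delimiter to be the last character of the field.
    static String reluctant(String field) {
        return field.replaceAll("^(.+?)[\\[:/].*$", "$1");
    }

    public static void main(String[] args) {
        String[] fields = { "sshd[6359]:", "sshd:33/22", "sshd" };
        for (String field : fields) {
            System.out.println(field + " -> greedy: " + greedy(field)
                    + ", reluctant: " + reluctant(field));
        }
    }
}
```

Both variants leave a field without any delimiter unchanged, because replaceAll returns the input as-is when nothing matches.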

answered 2013-10-14T22:19:27.907

If your sample data is exactly as you have provided:

(?:.+?\s){2}([\w\.]+).+$

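Explanation:

(?:.+?\s){2}...matches everything up to and including the second whitespace character

([\w\.]+)...captures a run of word characters and dots, so it stops at ' ', '[', ':' or '/'

.+$...matches the rest of the line up to EOL

What you want will be in capture group 1 (\1)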
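
Because this pattern runs against the whole line rather than a pre-split field, it may be a better fit for a module that only accepts a regex. Here is a minimal sketch of how it could be used from Java; the class name WholeLineRegexDemo and the helper programName are made up for illustration, and backslashes are doubled for the Java string literal.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WholeLineRegexDemo {
    // Same expression as above, written as a Java string literal.
    private static final Pattern PROGRAM =
            Pattern.compile("(?:.+?\\s){2}([\\w\\.]+).+$");

    // Hypothetical helper: returns capture group 1, or null if the line does not match.
    static String programName(String line) {
        Matcher m = PROGRAM.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String line = "2013-10-14T22:05:29+00:00 hostname SSH.D[6359]: Connection closed by 192.168.1.10";
        System.out.println(programName(line));
    }
}
```

Since \w does not include '-', a hyphenated program name would be cut at the hyphen with this character class.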

answered 2013-10-15T01:14:32.510

I think the regex you are looking for is:

String regex = "([^\\[:/]+).*";

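.* means match zero or more of any character. The pair of parentheses () in front of the .* creates a group that can be retrieved from the Matcher. Because it is the first set of parentheses, it is referenced as group number 1. Inside the parentheses is an expression that matches one or more characters ([^...]+) from a negated character class containing the delimiters specified in the OP, namely '[', ':' and '/'.

Here is a sample application that tests the result: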

package com.stackexchange.stackoverflow;

import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Question19370191 {
    public static void main(String[] args) {
        String regex = "([^\\[:/]+).*";
        Pattern pattern = Pattern.compile(regex);

        List<String> lines = new ArrayList<>();
        lines.add("2013-10-14T22:05:29+00:00 hostname sshd[6359]: Connection closed by 192.168.1.10");
        lines.add("2013-10-14T22:05:29+00:00 hostname sshd:3322 Connection closed by 192.168.1.10");
        lines.add("2013-10-14T22:05:29+00:00 hostname sshd/6359 Connection closed by 192.168.1.10");
        lines.add("2013-10-14T22:05:29+00:00 hostname sshd Connection closed by 192.168.1.10");
        lines.add("2013-10-14T22:05:29+00:00 hostname SSHD[1133] Connection closed by 192.168.1.10");
        lines.add("2013-10-14T22:05:29+00:00 hostname SSH.D[6359]: Connection closed by 192.168.1.10");

        for(String line : lines) {
            String field = line.split("\\s")[2];
            String extraction = "";
            Matcher matcher = pattern.matcher(field);
            if(matcher.matches()) {
                extraction = matcher.group(1);
            }

            System.out.println(String.format("Field \"%-12s\" Extraction \"%s\"", field, extraction));
        }
    }
}

It outputs the following:

Field "sshd[6359]: " Extraction "sshd"
Field "sshd:3322   " Extraction "sshd"
Field "sshd/6359   " Extraction "sshd"
Field "sshd        " Extraction "sshd"
Field "SSHD[1133]  " Extraction "SSHD"
Field "SSH.D[6359]:" Extraction "SSH.D"
answered 2013-10-14T23:06:35.210