java - Log analysis: finding lines by time difference

Question

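I have a very long log file written with log4j by 10 threads. I am looking for a log analyzer tool that can find the lines where a user waited a long time (that is, where the difference between two log entries of the same thread is more than a minute).

P.S. I am trying OtrosLogViewer, but it only filters by certain values (for example, by thread ID) and does not compare lines with each other.

P.P.S. Newer versions of OtrosLogViewer have a "Delta" column that shows the difference (in ms) between adjacent log lines.

Thank you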

score 3 · Accepted Answer

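This simple Python script may be enough. To test it, I analyzed my local Apache log, which happens to use the Common Log Format, so you may even be able to reuse it as is. It computes the difference between two consecutive requests and prints the request line whenever the delta exceeds a threshold (1 second in my test). You may want to wrap the code in a function that also takes a thread ID as a parameter, so you can filter further.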

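#!/usr/bin/env python3
import re
from datetime import datetime

THRESHOLD = 1  # seconds

last = None
with open("/var/log/apache2/access.log") as f:
    for line in f:
        # You may insert here something like
        # if not re.match(THREAD_ID, line):
        #     continue
        match = re.search(r"\[([^]]+)]", line)
        if not match:
            continue  # no bracketed timestamp on this line
        # [:-6] drops the trailing UTC offset (e.g. " +0100");
        # Python 3 could parse it with %z instead
        current = datetime.strptime(match.group(1)[:-6], "%d/%b/%Y:%H:%M:%S")
        # total_seconds() covers the whole delta; .seconds would ignore full days
        if last is not None and (current - last).total_seconds() > THRESHOLD:
            request = re.search('"([^"]+)"', line)
            print(request.group(1) if request else line.rstrip())
        last = current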
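
For the log4j case in the question, the gap has to be measured per thread rather than between adjacent lines. Below is a minimal sketch of that idea, assuming a layout like `2022-01-18 16:08:51,713 [worker-1] INFO ...` (timestamp first, thread name in square brackets). The pattern, the function name `find_thread_gaps`, and the sample lines are hypothetical and would need adapting to the actual log4j conversion pattern.

```python
import re
from datetime import datetime

# Assumed layout: "YYYY-MM-DD HH:MM:SS,mmm [thread-name] LEVEL message"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3}) \[([^\]]+)\]")

def find_thread_gaps(lines, threshold_seconds=60.0):
    """Return (thread, gap_seconds, previous_line, line) for every pair of
    consecutive entries of the same thread that lie further apart than the threshold."""
    last_seen = {}  # thread name -> (timestamp, line)
    gaps = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # stack traces, wrapped messages, etc.
        stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        stamp = stamp.replace(microsecond=int(m.group(2)) * 1000)
        thread = m.group(3)
        if thread in last_seen:
            prev_stamp, prev_line = last_seen[thread]
            gap = (stamp - prev_stamp).total_seconds()
            if gap > threshold_seconds:
                gaps.append((thread, gap, prev_line.rstrip(), line.rstrip()))
        last_seen[thread] = (stamp, line)
    return gaps

# Made-up sample: only worker-1 waits longer than a minute
sample = [
    "2022-01-18 16:00:00,000 [worker-1] INFO start request\n",
    "2022-01-18 16:00:01,000 [worker-2] INFO start request\n",
    "2022-01-18 16:00:30,000 [worker-2] INFO done\n",
    "2022-01-18 16:01:30,500 [worker-1] INFO done\n",
]
for thread, gap, prev_line, line in find_thread_gaps(sample):
    print(f"{thread}: {gap:.1f}s\n  {prev_line}\n  {line}")
```

Keeping the last timestamp in a dictionary keyed by thread name means the file is still read in a single pass, and interleaved entries from other threads do not reset the comparison.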

score 2 · Accepted Answer

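Based on @Raffaele's answer, I made some fixes so that it works with any log file (it skips lines that do not start with a date, as in Jenkins console logs). I also added a Max / Min threshold to filter lines by duration limits.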

#!/usr/bin/env python
import re
from datetime import datetime

MIN_THRESHOLD = 80
MAX_THRESHOLD = 100

regCompile = r"\w+\s+(\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d).*"
filePath = "C:/Users/user/Desktop/temp/jenkins.log"

lastTime = None
lastLine = ""

with open(filePath, 'r') as f:
    for line in f:
        regexp = re.search(regCompile, line)
        if regexp:  # skip lines that do not contain a timestamp
            currentTime = datetime.strptime(regexp.group(1), "%Y-%m-%d %H:%M:%S")

            if lastTime is not None:
                # total_seconds() counts whole days too, unlike .seconds
                duration = (currentTime - lastTime).total_seconds()
                if MIN_THRESHOLD <= duration <= MAX_THRESHOLD:
                    print ("#######################################################################################################################################")
                    print (lastLine)
                    print (line)
            lastTime = currentTime
            lastLine = line
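
One detail to watch when computing durations: `timedelta.seconds` holds only the seconds component (0 to 86399) and ignores whole days, so a gap of one day and 90 seconds reports 90, while `total_seconds()` returns the full duration. The sketch below shows that difference and wraps the same min/max window filter in a reusable function. The name `lines_in_window` and the sample lines are made up for illustration; the timestamp pattern is the one used in the script above.

```python
import re
from datetime import datetime, timedelta

# Same pattern as above: a word, whitespace, then "YYYY-MM-DD HH:MM:SS"
TIMESTAMP_RE = re.compile(r"\w+\s+(\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d).*")

def lines_in_window(lines, min_seconds, max_seconds):
    """Return (previous_line, line, duration) for adjacent timestamped lines
    whose time difference lies within [min_seconds, max_seconds]."""
    results = []
    last_time = None
    last_line = ""
    for line in lines:
        m = TIMESTAMP_RE.search(line)
        if not m:
            continue  # no timestamp on this line
        current = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        if last_time is not None:
            duration = (current - last_time).total_seconds()
            if min_seconds <= duration <= max_seconds:
                results.append((last_line.rstrip(), line.rstrip(), duration))
        last_time, last_line = current, line
    return results

# .seconds drops whole days; total_seconds() does not
gap = timedelta(days=1, seconds=90)
print(gap.seconds, gap.total_seconds())

# Hypothetical sample lines
sample = [
    "INFO 2022-01-18 10:00:00 build started\n",
    "    at some.stack.Frame (no timestamp, skipped)\n",
    "INFO 2022-01-18 10:01:30 checkout finished\n",
    "INFO 2022-01-18 10:01:35 compile started\n",
]
for prev_line, line, duration in lines_in_window(sample, 80, 100):
    print(duration, prev_line, "->", line)
```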

score 0 · Accepted Answer

Apache Chainsaw has a time delta column.

Answered 2022-01-18T16:08:51.713
