
I have some catch-all log files in the following format:

timestamp event summary
foo details
account name: userA
bar more details
timestamp event summary
baz details
account name: userB
qux more details
timestamp etc.

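I would like to search the log file for userB and, if it is found, echo everything from the preceding timestamp down to (but not including) the following timestamp. There will likely be several events matching my search. It would be nice to echo some sort of --- start --- and --- end --- marker around each match.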
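To make the requirement concrete, here is a minimal sketch in Python (not part of the original post) of the behavior described above. It assumes a record starts at any line beginning with a `YYYY-MM-DD hh:mm:ss` timestamp; the names `split_records`, `matching_records` and `format_matches` are hypothetical, chosen only for illustration.

```python
import re
from typing import Iterable, Iterator, List

# Assumption: a record starts at a line beginning with "YYYY-MM-DD hh:mm:ss".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def split_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield lists of lines, each running from one timestamp line up to,
    but not including, the next timestamp line."""
    record: List[str] = []
    for line in lines:
        if TIMESTAMP.match(line) and record:
            yield record
            record = []
        record.append(line)
    if record:
        yield record


def matching_records(lines: Iterable[str], user: str) -> Iterator[List[str]]:
    """Yield only the records that mention `user`, ignoring case."""
    needle = re.compile(r"\b" + re.escape(user) + r"\b", re.IGNORECASE)
    for record in split_records(lines):
        if any(needle.search(line) for line in record):
            yield record


def format_matches(lines: Iterable[str], user: str) -> str:
    """Wrap every matching record in start/end marker lines."""
    out: List[str] = []
    for number, record in enumerate(matching_records(lines, user), start=1):
        out.append(f"--- start of record {number} ---")
        out.extend(record)
        out.append(f"--- end of record {number} ---")
    return "\n".join(out)


# Hypothetical sample shaped like the format shown above.
sample = [
    "2013-03-25 08:02:32 event summary",
    "foo details",
    "account name: userA",
    "2013-03-25 08:02:33 event summary",
    "baz details",
    "account name: userB",
    "qux more details",
    "2013-03-25 08:02:34 etc.",
]
print(format_matches(sample, "userb"))
```

Because only one record is held in memory at a time, the same loop should cope with a 100 MB file if it is fed an open file object with the line endings stripped, for example `(line.rstrip("\r\n") for line in fh)`.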

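This would be perfect for pcregrep -M, right? The problem is that GnuWin32's pcregrep crashes when running multiline regexps against large files, and these catch-all logs can be 100 MB or more.

What I've tried

My hackish workaround so far uses grep -B15 -A30 to find matching lines and print the surrounding content, then pipes the now more manageable chunks into pcregrep for polishing. The problem is that some events are fewer than 10 lines long while others are 30 or more, so I get unexpected results whenever a shorter event is encountered.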

:parselog <username> <logfile>

set silent=1
set count=0
set deez=20\d\d-\d\d-\d\d \d\d:\d\d:\d\d
echo Searching %~2 for records containing %~1...

for /f "delims=" %%I in (
    'grep -P -i -B15 -A30 ":\s+\b%~1\b(@mydomain\.ext)?$" "%~2" ^| pcregrep -M -i "^%deez%(.|\n)+?\b%~1\b(@mydomain\.ext|\r?\n)(.|\n)+?\n%deez%" 2^>NUL'
) do (
    echo(%%I| findstr "^20[0-9][0-9]-[0-9][0-9]-[0-9][0-9].[0-9][0-9]:[0-9][0-9]:[0-9][0-9]" >NUL && (
        if defined silent (
            set silent=
            set found=1
            set /a "count+=1"
            echo;
            echo ---------------start of record !count!-------------
        ) else (
            set silent=1
            echo ----------------end of record !count!--------------
            echo;
        )
    )
    if not defined silent echo(%%I
)

goto :EOF

Is there a better way? I came across an awk command that looked interesting, something like:

awk "/start pattern/,/end pattern/" logfile

...but it would need to match a middle pattern as well. Unfortunately, I'm not familiar with awk syntax. Any suggestions?


Ed Morton suggested that I supply some example log entries and the expected output.

Example catch-all log

2013-03-25 08:02:32 Auth.Critical   169.254.8.110   Mar 25 08:02:32 dc3 MSWinEventLog   2   Security    11730158    Mon Mar 25 08:02:28 2013    529 Security    NT AUTHORITY\SYSTEM N/A Audit Failure   dc3 2   Logon Failure:

    Reason:     Unknown user name or bad password

    User Name:  user5f

    Domain:     MYDOMAIN

    Logon Type: 3

    Logon Process:  Advapi  

    Authentication Package: Negotiate

    Workstation Name:   dc3

    Caller User Name:   dc3$

    Caller Domain:  MYDOMAIN

    Caller Logon ID:    (0x0,0x3E7)

    Caller Process ID:  400

    Transited Services: -

    Source Network Address: 169.254.7.86

    Source Port:    40838
2013-03-25 08:02:32 Auth.Critical   169.254.8.110   Mar 25 08:02:32 dc3 MSWinEventLog   2   Security    11730159    Mon Mar 25 08:02:29 2013    680 Security    NT AUTHORITY\SYSTEM N/A Audit Failure   dc3 9   Logon attempt by:   MICROSOFT_AUTHENTICATION_PACKAGE_V1_0

Logon account:  USER6Q

Source Workstation: dc3

Error Code: 0xC0000234
2013-03-25 08:02:32 Auth.Critical   169.254.8.110   Mar 25 08:02:32 dc3 MSWinEventLog   2   Security    11730160    Mon Mar 25 08:02:29 2013    539 Security    NT AUTHORITY\SYSTEM N/A Audit Failure   dc3 2   Logon Failure:

    Reason:     Account locked out

    User Name:  USER6Q@MYDOMAIN.TLD

    Domain: MYDOMAIN

    Logon Type: 3

    Logon Process:  Advapi  

    Authentication Package: Negotiate

    Workstation Name:   dc3

    Caller User Name:   dc3$

    Caller Domain:  MYDOMAIN

    Caller Logon ID:    (0x0,0x3E7)

    Caller Process ID: 400

    Transited Services: -

    Source Network Address: 169.254.7.89

    Source Port:    55314
2013-03-25 08:02:32 Auth.Notice 169.254.5.62    Mar 25 08:36:38 DC4.mydomain.tld MSWinEventLog  5   Security    201326798   Mon Mar 25 08:36:37 2013    4624    Microsoft-Windows-Security-Auditing     N/A Audit Success   DC4.mydomain.tld    12544   An account was successfully logged on.

Subject:
    Security ID:        S-1-0-0
    Account Name:       -
    Account Domain:     -
    Logon ID:       0x0

Logon Type:         3

New Logon:
    Security ID:        S-1-5-21-606747145-1409082233-725345543-160838
    Account Name:       DEPTACCT16$
    Account Domain:     MYDOMAIN
    Logon ID:       0x1158e6012c
    Logon GUID:     {BCC72986-82A0-4EE9-3729-847BA6FA3A98}

Process Information:
    Process ID:     0x0
    Process Name:       -

Network Information:
    Workstation Name:   
    Source Network Address: 169.254.114.62
    Source Port:        42183

Detailed Authentication Information:
    Logon Process:      Kerberos
    Authentication Package: Kerberos
    Transited Services: -
    Package Name (NTLM only):   -
    Key Length:     0

This event is generated when a logon session is created. It is generated on the computer that was accessed.

The subject fields indicate...
2013-03-25 08:02:32 Auth.Critical   169.254.8.110   Mar 25 08:02:32 dc3 MSWinEventLog   2   Security    11730162    Mon Mar 25 08:02:30 2013    675 Security    NT AUTHORITY\SYSTEM N/A Audit Failure   dc3 9   Pre-authentication failed:

    User Name:  USER8Y

    User ID:        %{S-1-5-21-606747145-1409082233-725345543-3904}

    Service Name:   krbtgt/MYDOMAIN

    Pre-Authentication Type:    0x0

    Failure Code:   0x19

    Client Address: 169.254.87.158
2013-03-25 08:02:32 Auth.Critical   etc.

Example command

call :parselog user6q \\path\to\catch-all.log

Expected result

---------------start of record 1-------------
2013-03-25 08:02:32 Auth.Critical   169.254.8.110   Mar 25 08:02:32 dc3 MSWinEventLog   2   Security    11730159    Mon Mar 25 08:02:29 2013    680 Security    NT AUTHORITY\SYSTEM N/A Audit Failure   dc3 9   Logon attempt by:   MICROSOFT_AUTHENTICATION_PACKAGE_V1_0

Logon account:  USER6Q

Source Workstation: dc3

Error Code: 0xC0000234
---------------end of record 1-------------


---------------start of record 2-------------
2013-03-25 08:02:32 Auth.Critical   169.254.8.110   Mar 25 08:02:32 dc3 MSWinEventLog   2   Security    11730160    Mon Mar 25 08:02:29 2013    539 Security    NT AUTHORITY\SYSTEM N/A Audit Failure   dc3 2   Logon Failure:

    Reason:     Account locked out

    User Name:  USER6Q@MYDOMAIN.TLD

    Domain: MYDOMAIN

    Logon Type: 3

    Logon Process:  Advapi  

    Authentication Package: Negotiate

    Workstation Name:   dc3

    Caller User Name:   dc3$

    Caller Domain:  MYDOMAIN

    Caller Logon ID:    (0x0,0x3E7)

    Caller Process ID: 400

    Transited Services: -

    Source Network Address: 169.254.7.89

    Source Port:    55314
---------------end of record 2-------------

4 Answers


This is all you need with GNU awk (required for IGNORECASE):

$ cat tst.awk
function prtRecord() {
    if (record ~ regexp) {
        printf "-------- start of record %d --------%s", ++numRecords, ORS
        printf "%s", record
        printf "--------- end of record %d ---------%s%s", numRecords, ORS, ORS
    }
    record = ""
}
BEGIN{ IGNORECASE=1 }
/^[[:digit:]]+-[[:digit:]]+-[[:digit:]]+/ { prtRecord() }
{ record = record $0 ORS }
END { prtRecord() }

or with any awk:

$ cat tst.awk
function prtRecord() {
    if (tolower(record) ~ tolower(regexp)) {
        printf "-------- start of record %d --------%s", ++numRecords, ORS
        printf "%s", record
        printf "--------- end of record %d ---------%s%s", numRecords, ORS, ORS
    }
    record = ""
}
/^[[:digit:]]+-[[:digit:]]+-[[:digit:]]+/ { prtRecord() }
{ record = record $0 ORS }
END { prtRecord() }

Either way, you'd run it on UNIX as:

$ awk -v regexp=user6q -f tst.awk file

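I don't know the Windows syntax, but I expect it's very similar, if not identical.

Note the use of tolower() in the second script to make both sides of the comparison lower case, so that the match is case-insensitive. If you can instead pass in a search regexp that already has the correct case, you don't need to call tolower() on either side of the comparison. It's no big deal; dropping the calls might just speed the script up slightly.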
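The two case-insensitive strategies above can be compared in a small Python sketch (an illustration only, not the answer's awk code). `matches_lowered` mirrors `tolower(record) ~ tolower(regexp)` and `matches_flag` mirrors `IGNORECASE=1`; both function names are hypothetical. One caveat the sketch makes visible: lowercasing the pattern itself also lowercases escapes such as `\S`, which changes their meaning.

```python
import re


def matches_lowered(record: str, pattern: str) -> bool:
    """Lowercase both sides, like tolower(record) ~ tolower(regexp)."""
    return re.search(pattern.lower(), record.lower()) is not None


def matches_flag(record: str, pattern: str) -> bool:
    """Leave both sides alone and ask the regex engine to ignore case."""
    return re.search(pattern, record, re.IGNORECASE) is not None


record = "Logon account:  USER6Q"

# A plain literal behaves the same with either strategy.
print(matches_lowered(record, "user6q"), matches_flag(record, "user6q"))

# A pattern containing an upper-case escape does not: lowercasing turns
# \S (non-whitespace) into \s (whitespace), so the two results differ.
print(matches_lowered(record, r"\S+6Q"), matches_flag(record, r"\S+6Q"))
```

For simple search terms such as a user name the two approaches agree; the difference only matters when the search expression contains case-sensitive escapes.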

$ awk -v regexp=user6q -f tst.awk file
-------- start of record 1 --------
2013-03-25 08:02:32 Auth.Critical   169.254.8.110   Mar 25 08:02:32 dc3 MSWinEventLog   2   Security
    11730159    Mon Mar 25 08:02:29 2013    680 Security    NT AUTHORITY\SYSTEM N/A Audit Failure
dc3 9   Logon attempt by:   MICROSOFT_AUTHENTICATION_PACKAGE_V1_0

Logon account:  USER6Q

Source Workstation: dc3

Error Code: 0xC0000234
--------- end of record 1 ---------

-------- start of record 2 --------
2013-03-25 08:02:32 Auth.Critical   169.254.8.110   Mar 25 08:02:32 dc3 MSWinEventLog   2   Security
    11730160    Mon Mar 25 08:02:29 2013    539 Security    NT AUTHORITY\SYSTEM N/A Audit Failure
dc3 2   Logon Failure:

    Reason:     Account locked out

    User Name:  USER6Q@MYDOMAIN.TLD

    Domain: MYDOMAIN

    Logon Type: 3

    Logon Process:  Advapi

    Authentication Package: Negotiate

    Workstation Name:   dc3

    Caller User Name:   dc3$

    Caller Domain:  MYDOMAIN

    Caller Logon ID:    (0x0,0x3E7)

    Caller Process ID: 400

    Transited Services: -

    Source Network Address: 169.254.7.89

    Source Port:    55314
--------- end of record 2 ---------
answered 2013-03-26T02:29:04.073

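Below is a pure Batch solution that does not use grep. It locates the timestamp lines on the assumption that the word "summary" cannot appear in any other line; this word can be changed to another one if needed.

EDIT: I changed the word that identifies the timestamp lines to "Auth." and I also changed the FINDSTR search to ignore case. This is the new version: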

@echo off
setlocal EnableDelayedExpansion

:parselog <username> <logfile>
echo Searching %~2 for records containing %~1...

set n=0
set previousMatch=Auth.
for /F "tokens=1* delims=:" %%a in ('findstr /I /N "Auth\. %~1" %2') do (
   set currentMatch=%%b
   if "!previousMatch:Auth.=!" neq "!previousMatch!" (
      if "!currentMatch:Auth.=!" equ "!currentMatch!" (
         set /A n+=1
         set /A skip[!n!]=!previousLine!-1
      )
   ) else (
      set /A end[!n!]=%%a-1
   )
   set previousLine=%%a
   set previousMatch=%%b
)
if %n% equ 0 (
   echo No records found
   goto :EOF
)

if not defined end[%n%] set end[%n%]=-1
set i=1
:nextRecord
   echo/
   echo ---------------start of record %i%-------------
   if !skip[%i%]! equ 0 (
      set skip=
   ) else (
      set skip=skip=!skip[%i%]!
   )
   set end=!end[%i%]!
   for /F "%skip% tokens=1* delims=:" %%a in ('findstr /N "^" %2') do (
      echo(%%b
      if %%a equ %end% goto endOfRecord
   )
   :endOfRecord
   echo ---------------end of record %i%-------------
   set /A i+=1
if %i% leq %n% goto nextRecord

Example command:

C:>test user6q catch-all.log

Result:

Searching catch-all.log for records containing user6q...

---------------start of record 1-------------
2013-03-25 08:02:32 Auth.Critical   169.254.8.110   Mar 25 08:02:32 dc3 MSWinEventLog   2   Security    11730159    Mon Mar 25 08:02:29 2013    680 Security    NT AUTHORITY\SYSTEM N/A Audit Failure   dc3 9   Logon attempt by:   MICROSOFT_AUTHENTICATION_PACKAGE_V1_0

Logon account:  USER6Q

Source Workstation: dc3

Error Code: 0xC0000234
---------------end of record 1-------------

---------------start of record 2-------------
2013-03-25 08:02:32 Auth.Critical   169.254.8.110   Mar 25 08:02:32 dc3 MSWinEventLog   2   Security    11730160    Mon Mar 25 08:02:29 2013    539 Security    NT AUTHORITY\SYSTEM N/A Audit Failure   dc3 2   Logon Failure:

    Reason:     Account locked out

    User Name:  USER6Q@MYDOMAIN.TLD

    Domain: MYDOMAIN

    Logon Type: 3

    Logon Process:  Advapi  

    Authentication Package: Negotiate

    Workstation Name:   dc3

    Caller User Name:   dc3$

    Caller Domain:  MYDOMAIN

    Caller Logon ID:    (0x0,0x3E7)

    Caller Process ID: 400

    Transited Services: -

    Source Network Address: 169.254.7.89

    Source Port:    55314
---------------end of record 2-------------

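This method uses just one execution of the findstr command to locate all matching records, and then one additional findstr command to display each record. Note that the first for /F ... command works over the results of findstr "Auth. user..", and that the second for /F command has a "skip=N" option and a GOTO that breaks the loop as soon as the record has been displayed. This means that the FOR commands do not slow the program down; its speed depends on the speed of the FINDSTR command.

However, the second for /F "%skip% ... in ('findstr /N "^" %2') command may take too long because of the size of the FINDSTR output, which has to be produced in full before FOR starts processing it. If this happens, the second FOR can be replaced by a faster method (for example, an asynchronous pipe that gets broken once the record has been shown). Please report the result.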
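The two-pass idea described above (first locate the line ranges of the matching records, then print each range) can be sketched in Python as follows. This is an illustration under stated assumptions rather than a port of the Batch script: it detects record starts with a timestamp regex instead of the word "Auth.", uses 0-based half-open ranges, and the names `find_ranges` and `show_records` are hypothetical.

```python
import re
from itertools import islice
from typing import Iterable, List, Optional, Sequence, Tuple

# Assumption: a record starts at a line beginning with "YYYY-MM-DD hh:mm:ss".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def find_ranges(lines: Iterable[str], user: str) -> List[Tuple[int, int]]:
    """Pass 1: return (start, end) line ranges of the records mentioning user."""
    needle = user.lower()
    ranges: List[Tuple[int, int]] = []
    start: Optional[int] = None   # index of the current record's first line
    hit = False                   # did the current record mention the user?
    count = 0
    for index, line in enumerate(lines):
        count = index + 1
        if TIMESTAMP.match(line):
            if hit:
                ranges.append((start, index))
            start, hit = index, False
        elif start is not None and needle in line.lower():
            hit = True
    if hit:
        ranges.append((start, count))  # last record runs to end of input
    return ranges


def show_records(lines: Sequence[str], ranges: List[Tuple[int, int]]) -> List[str]:
    """Pass 2: copy each located range, wrapped in start/end markers."""
    out: List[str] = []
    for number, (start, end) in enumerate(ranges, start=1):
        out.append(f"---------------start of record {number}-------------")
        out.extend(islice(lines, start, end))
        out.append(f"---------------end of record {number}-------------")
    return out


# Hypothetical, shortened sample in the shape of the question's log.
sample = [
    "2013-03-25 08:02:32 Auth.Critical first event",
    "    User Name:  user5f",
    "2013-03-25 08:02:32 Auth.Critical second event",
    "Logon account:  USER6Q",
    "Error Code: 0xC0000234",
    "2013-03-25 08:02:32 Auth.Critical third event",
    "    User Name:  USER6Q@MYDOMAIN.TLD",
]
ranges = find_ranges(sample, "user6q")
print(ranges)
print("\n".join(show_records(sample, ranges)))
```

As in the Batch version, the cost of the second pass depends on how far into the input each range lies, since everything before the range still has to be skipped.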

Antonio

answered 2013-03-26T04:43:03.670

Here's my effort:

@ECHO OFF
SETLOCAL
::
:: Target username
::
SET target=%1
CALL :zaplines
SET count=0
FOR /f "delims=" %%I IN (rojoslog.txt) DO (
  ECHO.%%I| findstr /r "^20[0-9][0-9]-[0-9][0-9]-[0-9][0-9].[0-9][0-9]:[0-9][0-9]:[0-9][0-9]" >NUL
  IF NOT ERRORLEVEL 1 (
    IF DEFINED founduser CALL :report
    CALL :zaplines
  )
  (SET stored=)
  FOR /l %%L IN (1000,1,1200) DO IF NOT DEFINED stored IF NOT DEFINED line%%L (
    SET line%%L=%%I
    SET stored=Y
   )
  ECHO.%%I|FINDSTR /b /e /i /c:"account name: %target%" >NUL
  IF NOT ERRORLEVEL 1 (SET founduser=Y)
)
IF DEFINED founduser CALL :report
GOTO :eof

::
:: remove all envvars starting 'line'
:: Set 'not found user' at same time
::
 :zaplines
(SET founduser=)
FOR /f "delims==" %%L IN ('set line 2^>nul') DO (SET %%L=)
GOTO :eof

:report
IF NOT DEFINED line1000 GOTO :EOF 
SET /a count+=1
ECHO.
ECHO.---------- START of record %count% ----------
FOR /l %%L IN (1000,1,1200) DO IF DEFINED line%%L CALL ECHO.%%line%%L%%
ECHO.----------- END of record %count% -----------
GOTO :eof
answered 2013-03-26T03:36:47.853

I think awk is all you need:

awk "/---start of record---/,/---end of record---/ {print}" logfile

if the first-line indicator is:

---start of record---

and the last one is:

---end of record---

Note that there is no middle pattern to match; the "," is just the separator between the two regular expressions.
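For readers unfamiliar with awk, the behavior of a range pattern such as `/start/,/end/` can be emulated in a few lines of Python. This is a sketch of the semantics only; `awk_range` is a hypothetical helper name, not an awk or Python built-in.

```python
import re
from typing import Iterable, Iterator


def awk_range(lines: Iterable[str], start_pattern: str, end_pattern: str) -> Iterator[str]:
    """Yield the lines an awk range pattern /start/,/end/ would select."""
    start_re = re.compile(start_pattern)
    end_re = re.compile(end_pattern)
    inside = False
    for line in lines:
        if not inside:
            if start_re.search(line):
                inside = True
                yield line
                # awk also tests the end pattern on the line that opened
                # the range, so a range can be a single line long.
                if end_re.search(line):
                    inside = False
        else:
            yield line
            if end_re.search(line):
                inside = False


lines = [
    "noise before",
    "---start of record---",
    "account name: userB",
    "---end of record---",
    "noise after",
]
print(list(awk_range(lines, r"---start of record---", r"---end of record---")))
```

The range selects every block between the two markers regardless of its contents, so filtering on a user name still needs a separate check. Also note that if the same timestamp regex were used as both start and end, each range would open and close on the same line and only the timestamp lines would be selected.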

answered 2013-03-26T02:10:25.137