
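Updated question:

I am opening a new question thread. My previous questions were about deleting lines after a match on a word ending with a specific letter, and about using a regex to select a block of text when a specific line matches. This regex is different from those questions: I need to select a line based on what was matched on another line.

I need help selecting a block of text by its SCHEDULE line, capturing the last word of that line, and then finding, inside the same block, a word that ends with the letter E, on a line that always has a # symbol.

A block of text starts at a SCHEDULE line and finishes at an END line (the blocks are copied to another file).

The SCHEDULE line always contains a # symbol: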

SCHEDULE MANAGER_XA#KGDIVAGBLR 
or
SCHEDULE MANAGER_XA#KGICROBLR_2 
or
SCHEDULE MASTERAGENTS#KGICRO741_AABB
or
SCHEDULE MANAGER_XA#/XAAA/KAAA/KGICROZZZ
or
SCHEDULE MANAGER_XA#/XAAA/KAAA/KGFLABUR_4
or
SCHEDULE MASTERAGENTS#/KA0H/KA0HM00_FACT/KA0HM00_FACT 

END

Example block of text (start line SCHEDULE, end line END):

SCHEDULE MANAGER_XA#/XAAA/KAAA/KGICROZZZ
DESCRIPTION "Added by default."
:
S89COLENG2#/KG34/KG34G43CR3/KG34G1086
 NOP
 FOLLOWS KG34G1085

S89COLENG2#/KG34/KG34G43CR3/KGICROZZZE
 FOLLOWS KG34G493
 FOLLOWS KG34G522

S89COLENG2#/KG34/KG34G43CR3/KG34G1086
 NOP
 FOLLOWS KG34G1085

END

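The word KGICROZZZE matches because it starts with the name of the last word on the SCHEDULE line and ends with the letter E.

If the last word on the SCHEDULE line ends like KGFLABUR_4 (an underscore plus another word), only the part before the underscore is used for the match, so the block may contain KGFLABURE: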

SCHEDULE MANAGER_XA#/XAAA/KAAA/KGFLABUR_4

S89COLENG2#/KG34/KG34G43CR3/KGFLABURE

or 

S89COLENG2#/KG34/KG34G43CR3/KGFLABURE_4
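
The naming rule described above (take the last word after the final / or #, then cut it at the first underscore) can be sketched in PowerShell as follows. The function name `Get-ScheduleBaseName` is made up for illustration, and the sketch assumes that job names themselves never contain / or #.

```powershell
# Hypothetical helper: returns the name that the E-suffixed word must start with.
function Get-ScheduleBaseName {
    param([string]$ScheduleLine)

    # Last word = everything after the final '/' or '#', ignoring trailing whitespace.
    $lastWord = ($ScheduleLine.TrimEnd() -split '[/#]')[-1]

    # Keep only the part before the first underscore, if there is one.
    return ($lastWord -split '_')[0]
}

# The SCHEDULE line shapes listed in the question.
$samples = @(
    'SCHEDULE MANAGER_XA#KGDIVAGBLR ',
    'SCHEDULE MANAGER_XA#KGICROBLR_2 ',
    'SCHEDULE MASTERAGENTS#KGICRO741_AABB',
    'SCHEDULE MANAGER_XA#/XAAA/KAAA/KGICROZZZ',
    'SCHEDULE MANAGER_XA#/XAAA/KAAA/KGFLABUR_4',
    'SCHEDULE MASTERAGENTS#/KA0H/KA0HM00_FACT/KA0HM00_FACT '
)

foreach ($line in $samples) {
    $base = Get-ScheduleBaseName $line
    # The word to look for inside the block is the base name followed by E.
    '{0} -> search for {1}E' -f $line.Trim(), $base
}
```

The underscore in MANAGER_XA does not interfere because only the text after the last / or # is considered.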

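I need two regular expressions:

  • One that identifies the blocks containing a line with a word that starts with the name of the last word on the SCHEDULE line and ends with the letter E, together with the associated SCHEDULE block of text.

For example, this block: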

    SCHEDULE MANAGER_XA#/XAAA/KAAA/KGICROZZZ
    DESCRIPTION "Added by default."
    :
    S89COLENG2#/KG34/KG34G43CR3/KG34G1086
     NOP
     FOLLOWS KG34G1085
    
    S89COLENG2#/KG34/KG34G43CR3/KGICROZZZE
     FOLLOWS KG34G493
     FOLLOWS KG34G522
     NOP
    
    S89COLENG2#/KG34/KG34G43CR3/KG34G1086
     NOP
     FOLLOWS KG34G1085
    
    END

Or this case, where the SCHEDULE line ends in KAAABBB_CCC and the match uses the part before the underscore, KAAABBB:

    SCHEDULE MANAGER_XA#/XAAA/KAAA/KAAABBB_CCC
    DESCRIPTION "Added by default."
    :
    S89COLENG2#/KG34/KG34G43CR3/KG34G1086
     NOP
     FOLLOWS KG34G1085
    
    S89COLENG2#/KG34/KG34G43CR3/KAAABBBE_CCC
     FOLLOWS KG34G493
     FOLLOWS KG34G522
    
    S89COLENG2#/KG34/KG34G43CR3/KG34G1086
     NOP
     FOLLOWS KG34G1085
    
    END
    
    
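  • One that identifies the blocks that do NOT HAVE a line with a word that starts with the name of the last word on the SCHEDULE line and ends with the letter E.

For example, this block: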

    SCHEDULE MANAGER_XA#/XAAA/KAAA/KXXXYYYY
    DESCRIPTION "Added by default."
    :
    S89COLENG2#/KG34/KG34G43CR3/KG34G1086
     NOP
     FOLLOWS KG34G1085
    
    S89COLENG2#/KG34/KG34G43CR3/KG34G1020
     NOP
     FOLLOWS KG34G1085
    
    END
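
As a rough illustration of the two requirements, the check can also be done procedurally instead of with one large regex. The sketch below is only one possible reading of the rules: `Test-ScheduleBlockHasEJob` is a hypothetical name, and it assumes the E-suffixed word is the last word of its line, optionally followed by an underscore suffix.

```powershell
# Hypothetical helper, one possible reading of the rules in the question.
function Test-ScheduleBlockHasEJob {
    param([string]$Block)

    # Split on LF or CRLF so a trailing carriage return never ends up in a name.
    $lines = $Block -split '\r?\n'

    # Find the SCHEDULE line of the block (-cmatch is case-sensitive).
    $scheduleLine = $lines | Where-Object { $_ -cmatch '^\s*SCHEDULE\s' } | Select-Object -First 1
    if (-not $scheduleLine) { return $false }

    # Last word after the final '/' or '#', cut at the first underscore.
    $base = (($scheduleLine.TrimEnd() -split '[/#]')[-1] -split '_')[0]

    # A line qualifies when its last word is <base>E, optionally followed by _suffix.
    $jobPattern = '[/#]' + [regex]::Escape($base) + 'E(_\w+)?\s*$'
    return [bool]($lines | Where-Object { $_ -cmatch $jobPattern })
}

$withE = @'
SCHEDULE MANAGER_XA#/XAAA/KAAA/KAAABBB_CCC
DESCRIPTION "Added by default."
:
S89COLENG2#/KG34/KG34G43CR3/KAAABBBE_CCC
 FOLLOWS KG34G493

END
'@

$withoutE = @'
SCHEDULE MANAGER_XA#/XAAA/KAAA/KXXXYYYY
DESCRIPTION "Added by default."
:
S89COLENG2#/KG34/KG34G43CR3/KG34G1020
 NOP

END
'@

Test-ScheduleBlockHasEJob $withE      # block of the first kind
Test-ScheduleBlockHasEJob $withoutE   # block of the second kind
```

A `$true` result corresponds to the first bullet and `$false` to the second, so a single test covers both cases.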
    

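I apologize if the text is too long, but I had to write the examples to explain myself better. I also tried to shorten it. If you need more information, let me know and I will update the question.

Regards,

Italo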


1 Answer


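As I mentioned in the comments, here is an example (in PowerShell) of how I would first get all the individual SCHEDULE <--> END blocks and then use a regex that checks for the match to split them into matched and not-matched groups.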

    # Read the whole file into the $text variable as a single string (-Raw)
    $text = Get-Content -Raw -Path c:\temp\powershell\schedule.log

    # Use the regex class to find all SCHEDULE <--> END blocks in $text
    $scheduleBlockMatches = [regex]::matches($text, '(?sm)SCHEDULE.*?END')

    # Define the matching pattern in a variable called $matchPattern
    $matchPattern = '(?m)SCHEDULE.*[\/#]([^_\n]+)(?=(?:[\n]|.)+\1E)(?:\n|.)+?(^.*?\1E.*)(?:\n|.)+?END'

    # For each SCHEDULE <--> END block in $scheduleBlockMatches, use Where() to see if it matches the pattern.
    # Passing 'split' as the mode argument to Where() returns both the true and the false sets,
    # which are placed in the variables '$matched' and '$notmatched'
    $matched, $notmatched = $scheduleBlockMatches.Value.Where({ $_ -match $matchPattern }, 'split')

    # Create a simple object to display the counts and the first example of each collection
    [PSCustomObject]@{
        TotalLinesInLog     = ($text -split '\n').Count
        TotalScheduleBlocks = $scheduleBlockMatches.Count
        MatchedCount        = $matched.Count
        NotMatchedCount     = $notmatched.Count
        FirstMatched        = $matched[0]
        FirstNotMatched     = $notmatched[0]
    }

The output of the custom object looks like this:

    TotalLinesInLog     : 228696
    TotalScheduleBlocks : 15120
    MatchedCount        : 9450
    NotMatchedCount     : 5670
    FirstMatched        : SCHEDULE MASTERAGENTS#KA96G01
                          DESCRIPTION "Added by composer."
                          :

                          S89COLENG2#/KA96/KA96G01/KA96G065
                           FOLLOWS KA96G030

                          S89COLENG2#/KA96/KA96G01/KA96G01E
                           FOLLOWS KA96G036
                           FOLLOWS KA96G038

                          MASTERAGENTS#SBP_KA96G114_KA96G09_KA96G112
                           FOLLOWS KA96G114

                          END
    FirstNotMatched     : SCHEDULE MASTERAGENTS#KA96GAA_5
                          DESCRIPTION "Added by composer."
                          :

                          S89COLENG2#/KA96/KA96G02/KA96G091
                           FOLLOWS KA96G090


                          S89COLENG2#/KA96/KA96G02/KA96G096
                           FOLLOWS KA96G060

                          END
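
Since the question mentions copying the blocks to another file, here is a minimal sketch of how the two groups might be written out. The stand-in data, the output folder and the file names are assumptions made so that the snippet runs on its own; with the real script, the existing `$matched` and `$notmatched` collections would be used instead.

```powershell
# Stand-in data so the sketch runs on its own. In the real script these
# would be the $matched and $notmatched collections produced by Where().
$matched    = @("SCHEDULE MANAGER_XA#/XAAA/KAAA/KGICROZZZ`n:`nS89COLENG2#/KG34/KG34G43CR3/KGICROZZZE`n`nEND")
$notmatched = @("SCHEDULE MANAGER_XA#/XAAA/KAAA/KXXXYYYY`n:`nS89COLENG2#/KG34/KG34G43CR3/KG34G1020`n`nEND")

# Hypothetical output location (a folder under the user's temp directory).
$outDir = Join-Path ([System.IO.Path]::GetTempPath()) 'schedule-split'
New-Item -ItemType Directory -Path $outDir -Force | Out-Null

$matchedFile    = Join-Path $outDir 'matched.txt'
$notMatchedFile = Join-Path $outDir 'notmatched.txt'

# Separate the blocks with an empty line so the files stay readable.
($matched    -join "`n`n") | Set-Content -Path $matchedFile
($notmatched -join "`n`n") | Set-Content -Path $notMatchedFile
```

Joining the blocks before writing keeps each file a single stream of SCHEDULE ... END blocks, in the same shape as the source log.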
answered 2021-10-26T04:52:14.867