regex - Negating a regular expression in Perl

Question

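I split a text file into blocks so that I can use a regular expression to extract the blocks that do not contain a particular line. The text file looks like this: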

[Term]  
id: id1  
name: name1  
xref: type1:aab  
xref: type2:cdc  

[Term]  
id: id2  
name: name2  
xref: type1:aba  
xref: type3:fee

A few days ago someone helped me and showed me how to extract the blocks that do contain a certain regular expression (for example "xref: type3"):

while (<MYFILE>) {
  BEGIN { $/ = q|| }
    my @lines = split /\n/;
    for my $line ( @lines ) {
        if ( $line =~ m/xref:\s*type3/ ) {
            printf NEWFILE qq|%s|, $_;
            last;
        }
    }
}

Now I want to write all the blocks that do not contain "xref: type3" to a new file. I tried to do this by simply negating the regular expression:

if ( $line !~ m/xref:\s*type3/ )

or by negating the if statement with unless:

unless ( $line =~ m/xref:\s*type3/ )

Unfortunately, it does not work: the output file is identical to the original one. Any ideas what I am doing wrong?

score 3 · Accepted Answer

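You have:

For each line, print this block if this line does not match the pattern.

But you want:

Print the block only if none of its lines matches the pattern.

So you cannot start printing a block until you have examined every line in it (or every line up to the first matching one).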

local $/ = q||;
while (<MYFILE>) {
    my @lines = split /\n/;

    my $skip = 0;
    for my $line ( @lines ) {
        if ( $line =~ m/^xref:\s*type3/ ) {
            $skip = 1; 
            last;
        }
    }

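    if (!$skip) {
        for my $line ( @lines ) {
            # split removed the newlines, so put them back
            print NEWFILE "$line\n";
        }
        # blank line to keep the blocks separated
        print NEWFILE "\n";
    }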
}
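As a rough sketch of the same idea, the flag loop above can also be expressed with grep, which counts the matching lines when evaluated in scalar context. The helper name blocks_without and the in-memory sample data are assumptions made for illustration; they are not part of the original code.

```perl
use strict;
use warnings;

# Sample data taken from the question, held in a string for illustration.
my $data = <<'END';
[Term]
id: id1
name: name1
xref: type1:aab
xref: type2:cdc

[Term]
id: id2
name: name2
xref: type1:aba
xref: type3:fee
END

# Hypothetical helper: returns the blocks in which no line matches $pattern.
sub blocks_without {
    my ($text, $pattern) = @_;
    open my $in, '<', \$text or die "Cannot open in-memory file: $!";
    local $/ = q||;    # paragraph mode: one block per read
    my @kept;
    while ( my $block = <$in> ) {
        my @lines = split /\n/, $block;
        # grep in scalar context counts the matching lines
        my $hits = grep { $_ =~ $pattern } @lines;
        push @kept, $block if !$hits;
    }
    close $in;
    return @kept;
}

my @kept = blocks_without( $data, qr/^xref:\s*type3/ );
print @kept;    # only the id1 block is printed
```

Pushing the unmodified block (rather than the split lines) keeps its original newlines, so nothing has to be re-added when printing.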

But there is no need to split the block into lines. We can check and print the whole block at once.

local $/ = q||;
while (<MYFILE>) {
    print NEWFILE $_ if !/^xref:\s*type3/m;
}

(Note the /m, which makes ^ match at the start of any line.)
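To see why the /m modifier matters here, the following small sketch applies the same pattern to one block with and without it. The variable names are chosen for illustration only.

```perl
use strict;
use warnings;

# The second block from the question, as a single multi-line string.
my $block = "[Term]\nid: id2\nname: name2\nxref: type1:aba\nxref: type3:fee\n";

# Without /m, ^ only matches at the very start of the string ("[Term]"),
# so the xref line further down is never found.
my $without_m = $block =~ /^xref:\s*type3/  ? 1 : 0;

# With /m, ^ also matches right after each embedded newline.
my $with_m    = $block =~ /^xref:\s*type3/m ? 1 : 0;

print "without /m: $without_m\n";   # without /m: 0
print "with /m:    $with_m\n";      # with /m:    1
```

Without /m the anchored pattern would never match a block, and every block would be written to the new file.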

score 1 · Accepted Answer

Don't process the records line by line. Use paragraph mode:

{   local $/ = q();    # empty string selects paragraph mode
    while (<MYFILE>) {
        # print every block that does not contain the pattern
        if (! /xref:\s*type3/ ) {
            printf NEWFILE qq|%s|, $_;
        }
    }
}

score 1 · Accepted Answer

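The problem is that you are using unless with !~, which is interpreted as "do this unless $line does not match", that is, "do this if $line matches" (a double negative).

When using an unless block with the ordinary pattern match operator =~, your code works just fine; that is, I see the first block as output, because it does not contain type3.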

LOOP:
while (<$MYFILE>) {
  BEGIN { $/ = q|| }
    my @lines = split /\n/;
    for my $line ( @lines ) {
        unless ( $line =~ m/xref:\s*type3/ ) {
            printf qq|%s|, $_;
            last LOOP;
        }
  }
}

# prints
# [Term]
# id: id1
# name: name1
# xref: type1:aab
# xref: type2:cdc
