if ( $_ =~ /^(\d+)_[^,]+,"",(.+)"NR"(.+)"0","",""/ )                    
{ }
elsif ( $_ =~ /^[^_]+_[^,]+,"([\d\/]+)","[^"]+","[^"]+","[^"]+","[^"]+","[^"]+",
               "[^"]+","[^"]+","[^"]+","[^"]+","[^"]+","[^"]+","[^"]+",.+/x    )

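In the first one, it is one or more digits, then _, then one or more of any character that is not a comma. But what does the ,"", that follows do? It looks like an empty field, or is the comma some kind of escape character? I'm a little confused and have no way to test it on this machine. Are commas common in regular expressions? Also, is the ^ at the beginning an anchor, or does it negate the whole thing?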

The second statement is even worse.


2 Answers


The CPAN module YAPE::Regex::Explain can be used to parse and explain Perl regular expressions that you don't understand. Here is the output for your first regex:

(?-imsx:^(\d+)_[^,]+,"",(.+)"NR"(.+)"0","","")

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  _                        '_'
----------------------------------------------------------------------
  [^,]+                    any character except: ',' (1 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  ,"",                     ',"",'
----------------------------------------------------------------------
  (                        group and capture to \2:
----------------------------------------------------------------------
    .+                       any character except \n (1 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )                        end of \2
----------------------------------------------------------------------
  "NR"                     '"NR"'
----------------------------------------------------------------------
  (                        group and capture to \3:
----------------------------------------------------------------------
    .+                       any character except \n (1 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )                        end of \3
----------------------------------------------------------------------
  "0","",""                '"0","",""'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

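You can also use the module to parse your second regex (I won't dump it here, because the explanation would be long and very redundant). If you want to give it a shot, try this: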

use strict;
use warnings;
use YAPE::Regex::Explain;

my $re = qr/^[^_]+_[^,]+,"([\d\/]+)","[^"]+","[^"]+","[^"]+","[^"]+","[^"]+",
           "[^"]+","[^"]+","[^"]+","[^"]+","[^"]+","[^"]+","[^"]+",.+/x;

print YAPE::Regex::Explain->new( $re )->explain;
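One thing worth seeing in action for the second regex is the /x modifier: it makes Perl ignore unescaped whitespace in the pattern, which is why the expression can be split across two lines. The sketch below runs the same pattern against a made-up record; the sample line and its field values (f1 to f12, tail) are assumptions for illustration, not data from the question.

```perl
use strict;
use warnings;

# Same pattern as above. Because of /x, the line break and the
# indentation inside the pattern are ignored by the regex engine.
my $re = qr/^[^_]+_[^,]+,"([\d\/]+)","[^"]+","[^"]+","[^"]+","[^"]+","[^"]+",
           "[^"]+","[^"]+","[^"]+","[^"]+","[^"]+","[^"]+","[^"]+",.+/x;

# Hypothetical record: an id, a date, twelve non-empty quoted fields,
# and at least one more character after the last comma.
my $line = join ',', 'abc_def', '"06/25/2013"',
    ( map { sprintf '"f%d"', $_ } 1 .. 12 ), '"tail"';

# The only capture group is the digits-and-slashes field.
my ($date) = $line =~ $re;

print defined $date ? "matched, date = $date\n" : "no match\n";
```

Note that every one of the twelve fields must be non-empty for this to match, since each is written as [^"]+ rather than [^"]*.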
answered 2013-06-25T19:56:34.227
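  • Everything is as you said.
  • ,"", matches a comma, followed by two double quotes, followed by a comma.
  • Commas have no special meaning in regex patterns.
  • ^ is an anchor (start of string). It negates only when it is the first character of a character class ([^...]).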
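If you get access to a machine with Perl, a few one-line checks are enough to confirm the points above. This is only a minimal sketch; the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# ,"", is matched literally: a comma, two double quotes, a comma.
my $literal = 'x,"",y' =~ /,"",/ ? 1 : 0;

# Outside a character class, ^ anchors the match to the start of the string.
my $anchored_hit  = '123_abc'  =~ /^\d+_/ ? 1 : 0;
my $anchored_miss = 'x123_abc' =~ /^\d+_/ ? 1 : 0;

# As the first character inside [...], ^ negates the class,
# so [^,]+ is a run of characters that are not commas.
my ($field) = 'abc,def' =~ /^([^,]+)/;

print "literal=$literal hit=$anchored_hit miss=$anchored_miss field=$field\n";
```

The second check fails only because of the anchor: the digits are still in the string, just not at its start.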

A better approach would be to use Text::CSV_XS to parse the line into fields, then match against the values obtained.

if (   my ($num) = $row->[0] =~ /^(\d+)_[^,]+\z/
   and $row->[1] eq ""
   and ...
) {
   ...
}
elsif (... ) {
   ...
}
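To show where $row comes from, here is a rough sketch of the parsing step, assuming Text::CSV_XS is installed from CPAN. The sample line is hypothetical, and only the first two field checks are spelled out; the remaining conditions elided above are left out here as well.

```perl
use strict;
use warnings;
use Text::CSV_XS;

my $csv  = Text::CSV_XS->new({ binary => 1 });
my $line = '123_abc,"","x","NR","y","0","",""';   # hypothetical sample record

# parse() splits one line into fields; fields() returns them as a list.
$csv->parse($line) or die "Cannot parse line: " . $csv->error_diag;
my $row = [ $csv->fields ];

my $result = 'no match';
if (    ( my ($num) = $row->[0] =~ /^(\d+)_[^,]+\z/ )
    and $row->[1] eq '' )
{
    # $num holds the digits captured from the first field.
    $result = "leading number: $num";
}
print "$result\n";
```

Once the quoting is handled by the parser, each condition only has to describe a single field, which is much easier to read than one regex spanning the whole line.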
answered 2013-06-25T19:55:05.320