I am trying to validate Excel-style formulas with the following regular expression:

=SUM\(((?:\w+\d+)(?::\w+\d+)?)((?:,\w+\d+)(?::\w+\d+)?)*\)

Against this input:

Should pass

=SUM(A1,A11:A212,A12:A56,A342:A12,A3)
=SUM(A11:A12,A12:a12,A34:A3)
=SUM(A1,A2,A3)
=SUM(A1)

Should fail

=SUM(A11:A212:A2,A12:A56,A4,A342:A12)

The validation part is working, but I don't know how to capture each comma-separated value as its own group.

How I would like them grouped:

=SUM(A1,A11:A12,A12:A56,A3)     // Groups: $1 = A1 $2 = A11:A12 $3 = A12:A56 $4 = A3
=SUM(A11:A12,A10:A12,A34:A3)    // Groups: $1 = A11:A12 $2 = A10:A12 $3 = A34:A3
=SUM(A1,A2,A3)                  //Groups: $1 = A1 $2 = A2 $3 = A3
=SUM(A1)                        //Groups: $1 = A1

How they are currently grouped:

=SUM(A1,A11:A12,A12:A56,A3)     // Groups: $1 = A1 $2 = A3
=SUM(A11:A12,A10:A12,A34:A3)    // Groups: $1 = A11:A12 $2 = A34:A3
=SUM(A1,A2,A3)                  //Groups: $1 = A1 $2 = A3
=SUM(A1)                        //Groups: $1 = A1

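Note that it captures only the first and the last value. I am new to regex, so if I am doing something badly wrong here, please point me in the right direction. Thanks!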

2 Answers

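This is not possible in a single pattern: (...)(?:,(...))+ has 2 groups, so it will always yield exactly 2 captures, no matter how many times the + repeats. In most regex engines a repeated capturing group keeps only the text of its last iteration.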
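To make the limitation concrete, here is a small Python sketch that runs the asker's pattern unchanged. Python is used only for illustration (the question does not name a language), and the helper name `captured_groups` is hypothetical. Note that, as the pattern is written, group 2 includes the leading comma.

```python
import re

# The pattern from the question: group 1 is the first value,
# group 2 is the repeated ",value" part.
PATTERN = re.compile(
    r"=SUM\(((?:\w+\d+)(?::\w+\d+)?)((?:,\w+\d+)(?::\w+\d+)?)*\)"
)

def captured_groups(formula):
    """Return the capture groups for a formula, or None if it does not match."""
    m = PATTERN.fullmatch(formula)
    return m.groups() if m else None

# Group 2 matched three times, but only the last iteration is kept.
print(captured_groups("=SUM(A1,A11:A12,A12:A56,A3)"))  # ('A1', ',A3')
```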

You need (at least) two steps to do this:

value       :=  /\w+\d+(?::\w+\d+)?/

value_list  :=  /value(?:,value)*/

expression  :=  /=SUM\((value_list)\)/

Now match expression, take its group 1 (the value_list), and find all occurrences of value within that match.

A quick demo in PHP:

$text = 'should pass

=SUM(A1,A11:A212,A12:A56,A342:A12,A3)
=SUM(A11:A12,A12:a12,A34:A3)
=SUM(A1,A2,A3)
=SUM(A1)

should fail

=SUM(A11:A212:A2,A12:A56,A4,A342:A12)';

$value      = "\w+\d+(?::\w+\d+)?";
$value_list = "$value(?:,$value)*";
$expression = "=SUM\(($value_list)\)";

preg_match_all("/$expression/", $text, $matches);

// iterate over $value_list from $expression (group 1)
foreach($matches[1] as $group1) {
  preg_match_all("/$value/", $group1, $m);
  print_r($m);
}

Prints:

Array
(
    [0] => Array
        (
            [0] => A1
            [1] => A11:A212
            [2] => A12:A56
            [3] => A342:A12
            [4] => A3
        )

)
Array
(
    [0] => Array
        (
            [0] => A11:A12
            [1] => A12:a12
            [2] => A34:A3
        )

)
Array
(
    [0] => Array
        (
            [0] => A1
            [1] => A2
            [2] => A3
        )

)
Array
(
    [0] => Array
        (
            [0] => A1
        )

)
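
For readers not using PHP, the same two-step idea can be sketched in Python. This is only an illustration under the same assumptions as the demo above: the three patterns are copied from this answer, while the function name `sum_arguments` is hypothetical. Unlike the PHP demo, which scans a block of text, this sketch checks one formula at a time with `fullmatch`.

```python
import re

# Building blocks, composed the same way as in the PHP demo.
VALUE = r"\w+\d+(?::\w+\d+)?"
VALUE_LIST = rf"{VALUE}(?:,{VALUE})*"
EXPRESSION = re.compile(rf"=SUM\(({VALUE_LIST})\)")

def sum_arguments(formula):
    """Step 1: validate the whole formula. Step 2: extract every value.

    Returns the list of values, or None if the formula is invalid.
    """
    m = EXPRESSION.fullmatch(formula)
    if m is None:
        return None
    # VALUE contains no capturing groups, so findall returns whole matches.
    return re.findall(VALUE, m.group(1))

print(sum_arguments("=SUM(A1,A11:A212,A12:A56,A342:A12,A3)"))
print(sum_arguments("=SUM(A11:A212:A2,A12:A56,A4,A342:A12)"))
```

The first call yields the five values in order; the second returns None because `A11:A212:A2` is not a valid value.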
answered 2013-02-20T21:48:10.177

I would actually split the string first. Something like:

sub IsFormulaValid
{
    my $str = $_[0];
    (my $match) = $str =~ /^=SUM\(([^)]+)\)$/;
    my @sumArgs = split(/,\s*/, $match);
    my $valid = 1;
    foreach(@sumArgs){
        if($_ !~ /^[a-z]+\d+(?::[a-z]+\d+){0,1}$/i){
            $valid = 0;
            last;
        }
    }
    return $valid;
}

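Note that you could also check that the match itself succeeded, and that @sumArgs > 0, when setting $valid. As written, a string that does not match the outer =SUM(...) pattern leaves $match undefined, so the loop never runs and $valid stays 1. Tested in Perl with your input: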

my @testInput;

push(@testInput,'=SUM(A1,A11:A212,A12:A56,A342:A12,A3)');
push(@testInput,'=SUM(A11:A12,A12:a12,A34:A3)');
push(@testInput,'=SUM(A1,A2,A3)');
push(@testInput,'=SUM(A1)');
push(@testInput,'=SUM(A11:A212:A2,A12:A56,A4,A342:A12)');

foreach(@testInput){
    print "'$_'\n  ";
    print 'NOT ' if !IsFormulaValid($_);
    print "VALID\n\n";
}

Results:

'=SUM(A1,A11:A212,A12:A56,A342:A12,A3)'
  VALID

'=SUM(A11:A12,A12:a12,A34:A3)'
  VALID

'=SUM(A1,A2,A3)'
  VALID

'=SUM(A1)'
  VALID

'=SUM(A11:A212:A2,A12:A56,A4,A342:A12)'
  NOT VALID
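
A rough Python counterpart of the split-first approach, with the extra check on the outer match folded in, might look like the sketch below. It is an illustration rather than a port of the Perl code: the function name `is_formula_valid` and the constant names are hypothetical, and `fullmatch` is used in place of the `^...$` anchors.

```python
import re

# Outer shape: "=SUM(" + at least one non-")" character + ")".
OUTER = re.compile(r"=SUM\(([^)]+)\)")
# One argument: a cell such as A1, optionally followed by ":" and a second cell.
ARG = re.compile(r"[a-z]+\d+(?::[a-z]+\d+)?", re.IGNORECASE)

def is_formula_valid(formula):
    """Split the argument list on commas and validate each piece separately."""
    outer = OUTER.fullmatch(formula)
    if outer is None:
        # The outer pattern failed, so there is nothing to split.
        return False
    args = re.split(r",\s*", outer.group(1))
    return all(ARG.fullmatch(arg) for arg in args)

for formula in ("=SUM(A1,A2,A3)", "=SUM(A11:A212:A2,A12:A56,A4,A342:A12)", "garbage"):
    print(formula, "VALID" if is_formula_valid(formula) else "NOT VALID")
```

Because the outer match is checked before splitting, input that is not a =SUM(...) formula at all is rejected instead of falling through as valid.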
answered 2013-02-20T21:43:34.573