
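I am trying to extract three segments from a string. As I am not particularly good with regular expressions, I suspect that what I have done could be done better.

I want to extract the three ANYTHING_HERE parts of the following string:

SOMETEXT: ANYTHING_HERE (Old=ANYTHING_HERE, New=ANYTHING_HERE)

Some examples could be:

ABC: Some_Field (Old=,New=123)

ABC: Some_Field (Old=ABCde,New=1234)

ABC: Some_Field (Old=Hello World,New=Bye Bye World)

For the first example, this should return the following matches: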

$matches[0] = 'Some_Field';
$matches[1] = '';
$matches[2] = '123';

So far, I have the following code:

preg_match_all('/^([a-z]*\:(\s?)+)(.+)(\s?)+\(old=(.+)\,(\s?)+new=(.+)\)/i',$string,$matches);

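The problem with the above is that it returns a match for every individual segment of the string. I don't know how to use a regular expression to make sure the string is in the correct format without capturing and storing matches for the parts I don't need, if that makes sense.

So my question, in case it isn't clear yet: how can I retrieve only the segments I want from the string above?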


5 Answers


You don't need preg_match_all. You can use this preg_match call:

$s = 'SOMETEXT: ANYTHING_HERE (Old=ANYTHING_HERE1, New=ANYTHING_HERE2)';
if (preg_match('/[^:]*:\s*(\w*)\s*\(Old=(\w*),\s*New=(\w*)/i', $s, $arr))
   print_r($arr);

Output:

Array
(
    [0] => SOMETEXT: ANYTHING_HERE (Old=ANYTHING_HERE1, New=ANYTHING_HERE2
    [1] => ANYTHING_HERE
    [2] => ANYTHING_HERE1
    [3] => ANYTHING_HERE2
)
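
Note that `\w*` matches only letters, digits and underscores, so a value containing spaces, such as `Hello World`, would stop this pattern from matching at all. Below is a minimal sketch of one way around that. It assumes the old value never contains a comma and the new value never contains a closing parenthesis; the helper name `extractSegments` is made up for illustration.

```php
<?php
// Hypothetical helper: returns [field, old, new], or null when the format does not match.
function extractSegments(string $s): ?array
{
    // [^(]+? : field name, matched lazily up to the opening parenthesis
    // [^,]*  : old value, anything except a comma (may be empty)
    // [^)]*  : new value, anything except a closing parenthesis
    $pattern = '/^[^:]*:\s*([^(]+?)\s*\(Old=([^,]*),\s*New=([^)]*)\)$/i';
    if (!preg_match($pattern, $s, $m)) {
        return null;
    }
    return [$m[1], $m[2], $m[3]];
}

print_r(extractSegments('ABC: Some_Field (Old=Hello World,New=Bye Bye World)'));
```

The `^` and `$` anchors make the call double as a format check: a string that does not follow the whole layout yields `null` instead of a partial match.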
answered 2013-10-23T15:58:59.210
if(preg_match_all('/([a-z]*)\:\s*.+\(Old=(.+),\s*New=(.+)\)/i',$string,$matches)) {
    print_r($matches);
}

Example:

$string = 'ABC: Some_Field (Old=Hello World,New=Bye Bye World)';

This will match:

Array
(
    [0] => Array
        (
            [0] => ABC: Some_Field (Old=Hello World,New=Bye Bye World)
        )

    [1] => Array
        (
            [0] => ABC
        )

    [2] => Array
        (
            [0] => Hello World
        )

    [3] => Array
        (
            [0] => Bye Bye World
        )

)
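
The nested layout above is the default `PREG_PATTERN_ORDER` of `preg_match_all`: one sub-array per capture group. Note also that group 1 of this pattern captures the prefix (`ABC`), not the field name. As a rough sketch, assuming several records sit in one multi-line string (the sample input and variable names are invented here), the `PREG_SET_ORDER` flag groups the results per record instead, and the pattern is adjusted so that group 1 is the field name:

```php
<?php
// Invented sample input: one record per line.
$log = "ABC: Some_Field (Old=,New=123)\n"
     . "ABC: Other_Field (Old=Hello World,New=Bye Bye World)";

// m flag: ^ and $ match at line boundaries; i flag: case-insensitive.
$pattern = '/^[a-z]*:\s*(.+?)\s*\(Old=(.*?),\s*New=(.*)\)$/mi';

// PREG_SET_ORDER: $records[n] holds the full match and the groups of the n-th record.
preg_match_all($pattern, $log, $records, PREG_SET_ORDER);

// Skip index 0 (the full match) and unpack the three captured segments.
foreach ($records as [, $field, $old, $new]) {
    echo "$field: '$old' -> '$new'\n";
}
```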
answered 2013-10-23T16:00:59.240

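The problem is that you use more parentheses than you need, so more segments of the input are captured than you want.

For example, each (\s?)+ segment should just be \s*.

The regex you are looking for is: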

[^:]+:\s*(.+)\s*\(old=(.*)\s*,\s*new=(.*)\)

In PHP:

preg_match_all('/[^:]+:\s*(.+)\s*\(old=(.*)\s*,\s*new=(.*)\)/i',$string,$matches);
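
To make the point about parentheses concrete: a group that is needed only for grouping can be written as `(?:...)`, which matches without storing anything. The sketch below is an illustration under a few assumptions rather than part of the regex above. It keeps the prefix check from the question as a non-capturing group, and it makes the first capture lazy so that it does not swallow the space before the parenthesis. `$string` is a sample value chosen here.

```php
<?php
$string = 'ABC: Some_Field (Old=ABCde,New=1234)';

// (?:[a-z]+) validates the prefix but is not stored in $matches.
// (.+?) is lazy, so the optional whitespace before "(" stays out of group 1.
$pattern = '/^(?:[a-z]+):\s*(.+?)\s*\(old=(.*?)\s*,\s*new=(.*)\)$/i';

if (preg_match($pattern, $string, $matches)) {
    array_shift($matches);   // drop the full match, leaving only the three segments
    print_r($matches);
}
```

After `array_shift`, the array is re-indexed from 0, which lines up with the `$matches[0]`, `$matches[1]`, `$matches[2]` layout asked for in the question.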

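A useful tool can be found here: http://www.myregextester.com/index.php

The tool offers an "explain" checkbox (along with a "PHP" checkbox and an "i" flag checkbox, both of which you will want to select) that gives a full explanation of the regular expression. For posterity, I am including that explanation below: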

NODE                     EXPLANATION
----------------------------------------------------------------------
(?i-msx:                 group, but do not capture (case-insensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  [^:]+                    any character except: ':' (1 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  :                        ':'
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .+                       any character except \n (1 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  \(                       '('
----------------------------------------------------------------------
  old=                     'old='
----------------------------------------------------------------------
  (                        group and capture to \2:
----------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )                        end of \2
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  ,                        ','
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  new=                     'new='
----------------------------------------------------------------------
  (                        group and capture to \3:
----------------------------------------------------------------------
    .*                       any character except \n (0 or more times
                             (matching the most amount possible))
----------------------------------------------------------------------
  )                        end of \3
----------------------------------------------------------------------
  \)                       ')'
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
answered 2013-10-23T16:04:15.910

How about something simpler ^_^

[:=]\s*([\w\s]*)

Live demo
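
For reference, here is a sketch of how this shorter pattern might be used. It does not validate the overall format; it simply grabs the run of word characters and spaces after each `:` or `=`, so it relies on `preg_match_all`, and the values end up in `$m[1]`. The `array_map('trim', ...)` step is an addition made here, since the first capture keeps the space before the parenthesis.

```php
<?php
$string = 'ABC: Some_Field (Old=Hello World,New=Bye Bye World)';

// Each match starts at ":" or "=", and group 1 is the text that follows.
preg_match_all('/[:=]\s*([\w\s]*)/', $string, $m);

// $m[1] holds the three captured segments; trim removes surrounding whitespace.
$segments = array_map('trim', $m[1]);
print_r($segments);
```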

answered 2013-10-23T16:14:54.447
:\s*([^(\s]+)\s*\(Old=([^,]*),New=([^)]*)

Live demo

Also, let me know if you need an explanation.
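
One way to spell out what each part of this pattern does is PCRE's `x` (extended) flag, which ignores whitespace in the pattern and allows `#` comments. The following is a sketch of the same pattern laid out that way; the sample string and the added `i` flag are choices made here.

```php
<?php
$pattern = '/
    :\s*            # the colon after the prefix, then optional whitespace
    ([^(\s]+)       # group 1: field name, up to whitespace or an opening parenthesis
    \s*\(Old=       # optional whitespace, then the literal text (Old=
    ([^,]*)         # group 2: old value, everything up to the comma (may be empty)
    ,New=           # the literal text ,New=
    ([^)]*)         # group 3: new value, everything up to the closing parenthesis
/xi';

$string = 'ABC: Some_Field (Old=,New=123)';
if (preg_match($pattern, $string, $m)) {
    // Index 0 is the full match; the three segments follow it.
    print_r(array_slice($m, 1));
}
```

Because the comments live inside the pattern string, they must not contain the delimiter character (`/` here) or a single quote.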

answered 2013-10-23T16:09:50.653