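I have a regular expression, (?<={% start %}).*?(?={% end %}) , that matches everything between two custom tags.

The problem is that if there is whitespace inside the tags (for example, "{% start %}") and I add a \s+? condition, the regex fails. The following does not work: (?<={%\s+?start\s+?%}).*?(?={%\s+?end\s+?%}) In PHP I get this error: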

preg_match_all(): Compilation failed: lookbehind assertion is not fixed length at offset 25

If I remove the lookahead/lookbehind, the same regex works: ({%\s+?(start|end)\s+%})

Please advise.

1 Answer

Description

Try this:

[{]%\s*?\b([^}]*start[^}]*)\b\s*?%[}]\s*?\b(.*?)\b\s*?[{]%\s*\b([^}]*end[^}]*)\b\s*%[}]

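This matches all the text inside your {% %} tags and automatically trims it before placing the values into their groups.

Group 0 gets the entire matched string

  1. gets the start tag text
  2. gets the inner text
  3. gets the end tag text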
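
As a rough sketch of how those groups can be read back, the snippet below runs the same pattern with PREG_SET_ORDER so that each match arrives as one array. The input string and the $inner variable are made up for illustration; only the pattern comes from this answer.

```php
<?php
// Pattern from the answer above, unchanged.
$pattern = '/[{]%\s*?\b([^}]*start[^}]*)\b\s*?%[}]\s*?\b(.*?)\b\s*?[{]%\s*\b([^}]*end[^}]*)\b\s*%[}]/i';

// Hypothetical input with uneven whitespace inside and around the tags.
$source = '{%  start  %}  first block  {% end %} and {% start %}second{%   end %}';

// PREG_SET_ORDER groups the results per match instead of per capture group.
preg_match_all($pattern, $source, $matches, PREG_SET_ORDER);

$inner = [];
foreach ($matches as $m) {
    // $m[0] = whole match, $m[1] = start tag text,
    // $m[2] = inner text (already trimmed), $m[3] = end tag text
    $inner[] = $m[2];
}
print_r($inner);
```

With PREG_SET_ORDER there is no need to index $matches[2][$i] by hand, which tends to be easier to read when all three groups are used together.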

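Disclaimer

There may be some edge cases. If you nest complex data inside the sub-text, the regex will fail, and in that case a regular expression may not be the best tool for this task.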
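
To make the nesting caveat concrete, here is a small sketch with a hypothetical nested input (the string is an assumption, not something from the question). Because the inner text is matched lazily, the first start tag pairs with the first end tag, so the outer block is cut short.

```php
<?php
// Pattern from the answer above, unchanged.
$pattern = '/[{]%\s*?\b([^}]*start[^}]*)\b\s*?%[}]\s*?\b(.*?)\b\s*?[{]%\s*\b([^}]*end[^}]*)\b\s*%[}]/i';

// Hypothetical nested input: one block inside another.
$nested = '{% start %} outer {% start %} inner {% end %} tail {% end %}';

preg_match_all($pattern, $nested, $matches, PREG_SET_ORDER);

// Only one match is found, and it stops at the first end tag,
// so "tail" and the real closing tag are left out.
echo count($matches), "\n";
echo $matches[0][2], "\n";
```

Handling real nesting would likely need a recursive pattern or a small parser rather than this expression.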

Summary

[{]%\s*?\b([^}]*start[^}]*)\b\s*?%[}]\s*?\b(.*?)\b\s*?[{]%\s*\b([^}]*end[^}]*)\b\s*%[}]
Char class [{] matches one of the following chars: {
% Literal `%`
\s 0 to infinite times [lazy] Whitespace [\t \r\n\f] 
\b Word boundary: match in between (^\w|\w$|\W\w|\w\W)
1st Capturing group ([^}]*start[^}]*) 
Negated char class [^}] infinite to 0 times matches any char except: }
start Literal `start`
Negated char class [^}] infinite to 0 times matches any char except: }
\b Word boundary: match in between (^\w|\w$|\W\w|\w\W)
\s 0 to infinite times [lazy] Whitespace [\t \r\n\f] 
% Literal `%`
Char class [}] matches one of the following chars: }
\s 0 to infinite times [lazy] Whitespace [\t \r\n\f] 
\b Word boundary: match in between (^\w|\w$|\W\w|\w\W)
2nd Capturing group (.*?) 
. 0 to infinite times [lazy] Any character (except newline) 
\b Word boundary: match in between (^\w|\w$|\W\w|\w\W)
\s 0 to infinite times [lazy] Whitespace [\t \r\n\f] 
Char class [{] matches one of the following chars: {
% Literal `%`
\s infinite to 0 times Whitespace [\t \r\n\f] 
\b Word boundary: match in between (^\w|\w$|\W\w|\w\W)
3rd Capturing group ([^}]*end[^}]*) 
Negated char class [^}] infinite to 0 times matches any char except: }
end Literal `end`
Negated char class [^}] infinite to 0 times matches any char except: }
\b Word boundary: match in between (^\w|\w$|\W\w|\w\W)
\s infinite to 0 times Whitespace [\t \r\n\f] 
% Literal `%`
Char class [}] matches one of the following chars: }

PHP example

With the sample text {% start %} this is a sample text 1 {% end %}{% start %} this is a sample text 2 {% end %}

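<?php
// Sample text from above: two consecutive start/end blocks.
$sourcestring = "{% start %} this is a sample text 1 {% end %}{% start %} this is a sample text 2 {% end %}";
// Groups: 1 = start tag text, 2 = inner text, 3 = end tag text.
preg_match_all('/[{]%\s*?\b([^}]*start[^}]*)\b\s*?%[}]\s*?\b(.*?)\b\s*?[{]%\s*\b([^}]*end[^}]*)\b\s*%[}]/i', $sourcestring, $matches);
echo "<pre>" . print_r($matches, true) . "</pre>";
?>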

$matches Array:
(
    [0] => Array
        (
            [0] => {% start %} this is a sample text 1 {% end %}
            [1] => {% start %} this is a sample text 2 {% end %}
        )

    [1] => Array
        (
            [0] => start
            [1] => start
        )

    [2] => Array
        (
            [0] => this is a sample text 1
            [1] => this is a sample text 2
        )

    [3] => Array
        (
            [0] => end
            [1] => end
        )

)
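
If only the inner text is needed, a different approach may also work. This is an assumed alternative, not the pattern above: since the error in the question comes from the variable-length lookbehind, the sketch below drops the lookbehind and uses \K, which PCRE supports for resetting the start of the reported match. The input string is hypothetical and the result still has to be trimmed.

```php
<?php
// Assumed alternative, not the pattern from the answer: \K resets the match
// start, so the opening tag is required but excluded from the reported match.
// The /s flag lets the dot cross line breaks.
$pattern = '/\{%\s+start\s+%\}\K.*?(?=\{%\s+end\s+%\})/s';

// Hypothetical input with varying whitespace inside the tags.
$source = '{% start %} this is a sample text 1 {% end %}{%  start  %} this is a sample text 2 {%   end %}';

preg_match_all($pattern, $source, $matches);

// The match keeps the whitespace next to the tags, so trim each result.
$contents = array_map('trim', $matches[0]);
print_r($contents);
```

Unlike the capture-group pattern, this version does not report the tag text itself, so it fits only when the tags are not needed afterwards.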
Answered 2013-05-21T13:14:52.343