1

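I need a regular expression that matches a string only when it is not enclosed by a different, specific string. For example, in the case below it should split the content into two groups: 1) everything before the second {Switch} and 2) everything after the second {Switch}. It must not match the first {Switch}, because that one sits inside {my_string}. The enclosing string always looks like this: {my_string}anything here{/my_string}.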

Some more  
  {my_string}
  Random content
  {Switch} //This {Switch} may or may not be here, but should be ignored if it is present
  More random content
  {/my_string}
Content here too
{Switch}
More content

So far I have the following, which I know isn't close at all:

(.*?)\{Switch\}(.*?)

I'm just not sure how to use [^] (the "not" operator) with a specific string rather than with individual characters.

5 Answers

2

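It looks like you're really trying to use a regular expression to parse a grammar, which is something regular expressions are not good at. You'd be better off writing a parser that breaks your string into the tokens it is built from and then processing that tree.

Perhaps something like http://drupal.org/project/grammar_parser might help.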
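
For a grammar this small, the tokenize-then-process idea can be sketched in plain PHP without any parser library. The function name split_on_top_level_switch is hypothetical, and the sketch assumes {my_string} blocks are never nested:

```php
<?php
// Hypothetical helper: split $doc at every {Switch} that is not inside
// a {my_string}...{/my_string} block. Assumes blocks are not nested.
function split_on_top_level_switch(string $doc): array
{
    // Tokenize: PREG_SPLIT_DELIM_CAPTURE returns the captured tags
    // alongside the text between them, in document order.
    $tokens = preg_split(
        '~(\{my_string\}|\{/my_string\}|\{Switch\})~',
        $doc,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    $parts  = [''];   // the part currently being built is always the last one
    $inside = false;  // true while between {my_string} and {/my_string}

    foreach ($tokens as $token) {
        if ($token === '{my_string}') {
            $inside = true;
        } elseif ($token === '{/my_string}') {
            $inside = false;
        } elseif ($token === '{Switch}' && !$inside) {
            // A top-level {Switch}: drop it and start a new part.
            $parts[] = '';
            continue;
        }
        // Everything else, including an inner {Switch}, is kept verbatim.
        $parts[count($parts) - 1] .= $token;
    }

    return $parts;
}
```

Calling split_on_top_level_switch($doc) on the document from the question should give two parts, with the inner {Switch} left untouched inside the first one.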

answered 2012-04-09T21:21:34.033
1

You could try positive lookahead and lookbehind assertions (http://www.regular-expressions.info/lookaround.html)

It might look something like this:

$content = 'string of text before some random content switch text some more random content string of text after';
// Pass the delimiter to preg_quote() so a literal "/" would be escaped too.
$before  = preg_quote('string of text before', '/');
$switch  = preg_quote('switch text', '/');
$after   = preg_quote('string of text after', '/');
// The switch text is optional: when it is present, the text after it lands in $matches[2].
if( preg_match('/(?<=' . $before . ')(.*?)(?:' . $switch . '(.*))?(?=' . $after . ')/', $content, $matches) ) {
    // $matches[1] == ' some random content '
    // $matches[2] == ' some more random content '
}
answered 2012-04-09T21:39:41.390
1
$regex = '~(?:(?!\{my_string\})(.*?))(\{Switch\})(?:(.*?)(?!\{my_string\}))~';
/* if "my_string" and "Switch" aren't wrapped by "{" and "}" just remove "\{" and "\}" */
$yourNewString = preg_replace($regex, '$1', $yourOriginalString);

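This might work. I can't test it right now, but I'll update later! I'm not sure this is what you're looking for, but to negate a whole string rather than a single character, the regex syntax is: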

(?!yourString) 

It's called a "negative lookahead assertion".
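
As an illustration of how a negative lookahead could be applied to the question's input, here is a minimal sketch. The function name split_with_lookahead is hypothetical, and the pattern assumes {my_string} blocks are never nested. A {Switch} is treated as top-level when no {/my_string} can be reached after it without first passing another {my_string}:

```php
<?php
// Hypothetical helper; assumes {my_string} blocks are not nested.
function split_with_lookahead(string $doc): array
{
    // \{Switch\}                  the delimiter to split on
    // (?! ... \{/my_string\})     ...unless a closing tag follows
    // (?:(?!\{my_string\}).)*     ...with no opening tag in between
    // The "s" flag lets "." match newlines as well.
    $pattern = '~\{Switch\}(?!(?:(?!\{my_string\}).)*\{/my_string\})~s';

    return preg_split($pattern, $doc);
}
```

An inner {Switch} is always followed by its block's {/my_string} before any new {my_string} opens, so the lookahead rejects it and only the top-level occurrences split the text.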

/Edit:

This should work and return 1 (truthy) on a match:

$stringMatchesYourRulesBoolean = preg_match('~(.*?)('.$my_string.')(.*?)(?<!'.$my_string.') ?('.$switch.') ?(?!'.$my_string.')(.*?)('.$my_string.')(.*?)~',$yourString);
answered 2012-04-09T21:31:14.737
1

Try this simple function:

Function find_content()

function find_content($doc) {
  $temp = $doc;
  preg_match_all('~{my_string}.*?{/my_string}~is', $temp, $x);
  $i = 0;
  while (isset($x[0][$i])) {
    $temp = str_replace($x[0][$i], "{REPL:$i}", $temp);
    $i++;
    }
  $res = explode('{Switch}', $temp);
  foreach ($res as &$part) 
    foreach($x[0] as $id=>$content)
      $part = str_replace("{REPL:$id}", $content, $part);
  return $res;
  }

Use it like this:

$content_parts = find_content($doc); // $doc is your input document
print_r($content_parts);

Output (for your example):

Array
(
    [0] => Some more
{my_string}
Random content
{Switch} //This {Switch} may or may not be here, but should be ignored if it is present
More random content
{/my_string}
Content here too

    [1] => 
More content
)
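
The mask-and-restore steps of find_content() can possibly be folded into the pattern itself with PCRE's backtracking control verbs. This is only a sketch: the name split_with_skip_fail is hypothetical, and it assumes {my_string} blocks are never nested. The first alternative consumes a whole block and then discards it with (*SKIP)(*FAIL), so the second alternative only ever sees a {Switch} outside of a block:

```php
<?php
// Hypothetical helper; assumes {my_string} blocks are not nested.
function split_with_skip_fail(string $doc): array
{
    // Alternative 1: match a complete block, then force a failure and
    //                tell the engine not to retry inside the skipped text.
    // Alternative 2: a {Switch} that was not swallowed by alternative 1.
    $pattern = '~\{my_string\}.*?\{/my_string\}(*SKIP)(*FAIL)|\{Switch\}~s';

    return preg_split($pattern, $doc);
}
```

Unlike find_content(), this version needs no placeholder strings, so it cannot collide with text such as {REPL:0} that happens to appear in the document.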
answered 2012-04-09T22:14:34.227
0

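Take a look at PHP PEG. It is a small parser written in PHP. You can write your own grammar and parse with it. In your case this would be very simple.

The grammar syntax and the way parsing works are both explained in README.md.

Excerpt from the README: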

  token*  - Token is optionally repeated
  token+ - Token is repeated at least one
  token? - Token is optionally present

Tokens may be:

 - bare-words, which are recursive matchers - references to token rules defined elsewhere in the grammar,
 - literals, surrounded by `"` or `'` quote pairs. No escaping support is provided in literals.
 - regexs, surrounded by `/` pairs.
 - expressions - single words (match \w+)

Example grammar (file EqualRepeat.peg.inc):

class EqualRepeat extends Packrat {
/* Any number of a followed by the same number of b and the same number of c characters
 * aabbcc - good
 * aaabbbccc - good
 * aabbc - bad
 * aabbacc - bad
 */

/*Parser:Grammar1
A: "a" A? "b"
B: "b" B? "c"
T: !"b"
X: &(A !"b") "a"+ B !("a" | "b" | "c")
*/
}
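
To make clear which strings the example grammar is meant to accept, here is a plain PHP check of the same language (one or more a, then the same number of b, then the same number of c). It does not use PHP PEG at all, and the function name is_equal_repeat is hypothetical:

```php
<?php
// Hypothetical helper, independent of PHP PEG: accepts strings made of
// n "a", then n "b", then n "c", with n >= 1.
function is_equal_repeat(string $s): bool
{
    // The regex only checks the overall shape: a-run, b-run, c-run.
    if (!preg_match('/^(a+)(b+)(c+)$/', $s, $m)) {
        return false;
    }

    // Counting is done in PHP, comparing the lengths of the three runs.
    return strlen($m[1]) === strlen($m[2])
        && strlen($m[2]) === strlen($m[3]);
}
```

It agrees with the four cases listed in the grammar's comment: aabbcc and aaabbbccc are accepted, aabbc and aabbacc are rejected.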
answered 2012-04-09T21:46:44.760