5

I'm looking for a Perl one-liner that finds all words matching the following pattern:

X(not_X_chrs)X(not_X_chrs)X    e.g. cyclic

For a single, fixed character this is easy, e.g. for 'a':

perl -nle 'print if /^a[^a]+a[^a]+a$/' < /usr/share/dict/web2

But I want to search for any character, so I'm looking for a single regex that finds all such words, for example:

azalea   #repeating a
baobab   #repeating b
cyclic   #c

and so on.

I tried this:

perl -nle 'print if m/^([a-z])[^$1]+$1[^$1]+$1$/i' </usr/share/dict/web2

but it doesn't work.

4 Answers

6
(?:(?!STRING).)

is to

(?:STRING)

as

[^CHAR]

is to

CHAR

so you can use

/
   ^
   (\pL)
   (?:
      (?:(?!\1).)+
      \1
   ){2}
   \z
/sx
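As a quick sanity check, the pattern above can be run against a handful of words. This is only a minimal sketch: the helper name `is_triple` and the word list are made up for illustration, and the pattern is case-sensitive as written.

```perl
use strict;
use warnings;

# The pattern from above: a letter, then twice "one or more characters
# that are not that letter, followed by the letter again".
my $re = qr/
   ^
   (\pL)
   (?:
      (?:(?!\1).)+
      \1
   ){2}
   \z
/sx;

# Hypothetical helper: returns 1 if $word has the shape X..X..X, else 0.
sub is_triple {
    my ($word) = @_;
    return $word =~ $re ? 1 : 0;
}

# Illustrative word list, not taken from any dictionary file.
for my $word (qw( azalea baobab cyclic banana ololololo aaa )) {
    printf "%-10s %s\n", $word, is_triple($word) ? 'match' : 'no match';
}
```

As a one-liner in the style of the question, this would presumably be `perl -nle 'print if /^(\pL)(?:(?:(?!\1).)+\1){2}\z/s' < /usr/share/dict/web2`; the `-l` switch strips the newline so that `\z` can match at the end of each word.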
answered 2012-06-14T23:45:06.160
3

This is the best regex I could come up with:

^([a-z])((?:(?!\1).)+\1){2}$

Tested on RegexPal.

answered 2012-06-14T23:46:09.617
0

You can also use a lazy quantifier inside an atomic (non-backtracking) group:

^(\w)(?>\w*?\1){2}$

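This only works if zero characters between the repeated letters is acceptable.

To require at least one character in between, you have to add a negative lookahead: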

^(\w)(?>(?!\1)\w+?\1){2}$
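The difference between the two patterns shows up on words where the repeated letter occurs back to back. The following is a small sketch for comparing them; the variable names `$zero_ok` and `$one_min` and the test words are chosen here for illustration only.

```perl
use strict;
use warnings;

# Both patterns are copied from above; only the variable names are new.
my $zero_ok = qr/^(\w)(?>\w*?\1){2}$/;         # gaps may be empty
my $one_min = qr/^(\w)(?>(?!\1)\w+?\1){2}$/;   # each gap needs >= 1 character

# Illustrative words: two with adjacent repeats, one regular, one with
# a fourth occurrence of the letter.
for my $word (qw( aaa abaa cyclic abacada )) {
    printf "%-8s zero_ok=%d one_min=%d\n",
        $word,
        ($word =~ $zero_ok ? 1 : 0),
        ($word =~ $one_min ? 1 : 0);
}
```

Here `aaa` is accepted only by the first pattern, `cyclic` by both, and `abacada` by neither: because the group is atomic, the lazy `\w*?` or `\w+?` cannot be extended past an occurrence of the captured letter once the group has been left.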
answered 2012-06-15T01:14:49.130
0

perlretut says that you can use \g1 inside a regex (not in the replacement part of a substitution). This changed in 5.14. Since I only have 5.12.2 here, I have to use \1 instead.

So your original regex, slightly tweaked, works for me:

use strict; use warnings;
use 5.12.2;
use feature qw(say);
for (qw/ azalea baobab cyclic deadend teeeeeestest doesnotwork /) {
  say if m/^([a-z])[^\1]+\1[^\1]+\1$/i;
}

Looking at it with YAPE::Regex::Explain:

use YAPE::Regex::Explain;
print YAPE::Regex::Explain->new(qr/^([a-z])[^\1]+\1[^\1]+\1$/i)->explain();

yields:

The regular expression:

(?i-msx:^([a-z])[^\1]+\1[^\1]+\1$)

matches as follows:


NODE                     EXPLANATION
----------------------------------------------------------------------
(?i-msx:                 group, but do not capture (case-insensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    [a-z]                    any character of: 'a' to 'z'
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  [^\1]+                   any character except: '\1' (1 or more
                           times (matching the most amount possible))
----------------------------------------------------------------------
  \1                       what was matched by capture \1
----------------------------------------------------------------------
  [^\1]+                   any character except: '\1' (1 or more
                           times (matching the most amount possible))
----------------------------------------------------------------------
  \1                       what was matched by capture \1
----------------------------------------------------------------------
  $                        before an optional \n, and the end of the
                           string
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------

Edit: So your one-liner would be perl -nle 'print if m/^([a-z])[^\1]+\1[^\1]+\1$/i'.

On the other hand, if you had tried perl -w -e 'print if m/(a)$1/', you would have seen your problem immediately:

$ perl -w -e 'print if m/(a)$1/' asdf
Use of uninitialized value $1 in regexp compilation at -e line 1.
Use of uninitialized value $_ in pattern match (m//) at -e line 1.

I haven't figured out why it matches ololololo.
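One possible explanation, offered as a sketch rather than a confirmed diagnosis: inside a bracketed character class, Perl does not seem to treat \1 as a backreference but as an octal escape, i.e. the character chr(1). Under that reading, [^\1]+ means "one or more characters other than chr(1)", which would let it run across further occurrences of the captured letter. The YAPE output ("any character except: '\1'") is consistent with this. The variable names and the lookahead variant below are assumptions added for comparison.

```perl
use strict;
use warnings;

# Pattern from this answer, and a lookahead-based variant for comparison.
my $class_re     = qr/^([a-z])[^\1]+\1[^\1]+\1$/i;
my $lookahead_re = qr/^([a-z])(?:(?!\1).)+\1(?:(?!\1).)+\1$/i;

# Inside [...], \1 appears to denote chr(1), not the captured letter.
print "chr(1) is matched by [\\1]\n" if "\x01" =~ /[\1]/;

for my $word (qw( cyclic ololololo )) {
    printf "%-10s class=%d lookahead=%d\n",
        $word,
        ($word =~ $class_re     ? 1 : 0),
        ($word =~ $lookahead_re ? 1 : 0);
}
```

With the character-class version, `ololololo` is accepted because the "not \1" parts can swallow the inner o's; the lookahead variant rejects it because each gap stops at the next occurrence of the captured letter.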

answered 2012-06-15T07:28:33.147