0

Why does the following regex only work when I remove ^ and $?

^(?<=.).+(?=.)$

Source: #Hello World#
Target: Hello World

Looking forward to finding the solution.
Many thanks in advance.


3 Answers

4

These lookarounds can never work in combination with the anchors.

^ asserts that there is the beginning of the string (but doesn't advance the position of the engine's "cursor"). Then (?<=.) asserts that there is any character left of that position. That's a contradiction in all cases (almost, see next paragraph). The same goes for (?=.) and $.

In multiline mode (m), ^ and $ could match at other points in the string, in particular, at the beginning and end of each line. In that case there would be other characters in front of or after those positions (line break characters). But line break characters cannot be matched by . (in most engines), unless you are also using the single line or "dotall" mode (s). So the only case in which your regex could ever match is when using both m and s.
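The flag behaviour described above can be checked with a short sketch. It assumes Python's re module (the question does not name an engine, so this is only one possible setting), and the three-line sample text is made up for illustration.

```python
import re

pattern = r"^(?<=.).+(?=.)$"
source = "#Hello World#"

# Without flags, ^ only matches at position 0, where nothing precedes it,
# so the lookbehind (?<=.) can never succeed.
no_flags = re.search(pattern, source)

# Hypothetical sample: the source sits on its own line between two others.
multiline_text = "a\n#Hello World#\nb"

# re.M alone: ^ matches after the "\n", but . refuses to match "\n".
only_m = re.search(pattern, multiline_text, re.M)

# re.S alone: . matches "\n", but ^ still only matches at position 0.
only_s = re.search(pattern, multiline_text, re.S)

# Both flags: the line breaks on either side satisfy the lookarounds.
both = re.search(pattern, multiline_text, re.M | re.S)

print(no_flags, only_m, only_s, both.group())
```

Note that even with both flags the match still includes the # characters, so this does not produce the requested target.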

What you are probably looking for is this:

(?<=^.).+(?=.$)

This asserts that exactly one character precedes the match and exactly one follows it, with nothing else between those characters and the ends of the string.
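As a quick check of the corrected pattern, here is a minimal sketch, again assuming Python's re module. The extra inputs "#a#" and "##" are made-up edge cases, not part of the question.

```python
import re

# Anchors moved inside the lookarounds, as suggested above.
fixed = re.compile(r"(?<=^.).+(?=.$)")

match = fixed.search("#Hello World#")
target = match.group() if match else None

# Edge cases: one inner character still matches; none at all does not,
# because .+ needs at least one character to consume.
single = fixed.search("#a#")
empty_inside = fixed.search("##")

print(target)
```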

I should also make very clear why there is a difference between (?=.)$ and (?=.$). Lookarounds do not advance the position of the engine's "cursor". That means, in the case of (?=.)$, the engine checks that the current position is immediately followed by another character - and once that is satisfied and the lookahead is left, the engine is still at that same position (that's why it's called lookahead). At that position $ then fails, because a character still follows. Hence, you need to put the anchors inside the lookarounds, so that they are checked before the position of the "cursor" is reset.
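The difference between the two lookahead forms can be seen in isolation. This is a small sketch assuming Python's re module, with the lookbehind left out so that only the end of the pattern is being tested.

```python
import re

text = "#Hello World#"

# Anchor outside the lookahead: after (?=.) succeeds, the cursor has not
# moved, so $ is tested at a position that still has a character after it.
outside = re.search(r".+(?=.)$", text)

# Anchor inside the lookahead: . and $ are checked together, one after the
# other, before the cursor returns to where the lookahead started.
inside = re.search(r".+(?=.$)", text)

print(outside, inside.group())
```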

More information on lookaround. (there is also a second part to this in the sidebar of that page)

Answered 2013-06-15T18:37:44.517
1

^(?<=.).+(?=.)$ is like saying match a string that is

  • anchored at the beginning and end of string
  • preceded by a character (that is not part of the match) before the match
  • followed by a character (that is not part of the match) after the end of the match

It's contradictory. If you want a string to be anchored at the beginning, there cannot be any character before (to the left of) it; vice versa for the anchor at the end.

Answered 2013-06-15T18:36:57.177
0

Your regex pattern makes no sense:

^   means the beginning of the string
$   means the end of the string

(?=....) means followed by (lookahead)
(?<=....) means preceded by (lookbehind)

first part: ^(?<=.) fails because there is no character before the lookbehind position, only the beginning of the string

last part: (?=.)$ has the same problem: there is no character after the lookahead position, only the end of the string, so it fails.

Answered 2013-06-15T18:40:19.733