If I have a String which is delimited by a character, let's say this:
a-b-c
and I want to keep the delimiters in the result, I can split on look-behind and look-ahead assertions, like:
string.split("((?<=-)|(?=-))");
which results in
a
-
b
-
c
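For reference, here is a minimal runnable sketch of that first split. The class name `KeepDelimiters` and the method `splitKeepingDelimiters` are just names made up for this illustration; only `String.split` and the regex come from the snippet above.

```java
import java.util.Arrays;

public class KeepDelimiters {
    // Hypothetical helper: splits at every zero-width position that is
    // directly after a '-' (look-behind) or directly before a '-' (look-ahead).
    static String[] splitKeepingDelimiters(String input) {
        return input.split("((?<=-)|(?=-))");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitKeepingDelimiters("a-b-c")));
        // prints [a, -, b, -, c]
    }
}
```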
Now, if one of the delimiters is escaped, like this:
a-b\-c
and I want to honor the escape, I found that a regex like this works:
((?<=-(?!(?<=\\-)))|(?=-(?!(?<=\\-))))
that is, as a Java string literal:
string.split("((?<=-(?!(?<=\\\\-)))|(?=-(?!(?<=\\\\-))))");
Now, this works and results in:
a
-
b\-c
(I'd remove the backslash afterwards with string.replace("\\", ""); I haven't found a way to include that in the regex.)
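Put together, the escape-aware split plus the later backslash removal might look like the sketch below. The class name `EscapedDelimiters`, the constant `ESCAPE_AWARE`, and the method `splitHonoringEscapes` are hypothetical names for this example; it also assumes that every backslash in the input is an escape character and can simply be dropped.

```java
import java.util.Arrays;

public class EscapedDelimiters {
    // The regex from above as a Java string literal: split before or after
    // a '-' unless that '-' is preceded by a backslash.
    static final String ESCAPE_AWARE =
            "((?<=-(?!(?<=\\\\-)))|(?=-(?!(?<=\\\\-))))";

    // Hypothetical helper: split first, then strip the escape characters.
    // Assumption: every backslash in the input is an escape and can be dropped.
    static String[] splitHonoringEscapes(String input) {
        String[] parts = input.split(ESCAPE_AWARE);
        for (int i = 0; i < parts.length; i++) {
            // String.replace works on literal text, so "\\" is one backslash.
            parts[i] = parts[i].replace("\\", "");
        }
        return parts;
    }

    public static void main(String[] args) {
        // The Java literal "a-b\\-c" is the text a-b\-c
        System.out.println(Arrays.toString(splitHonoringEscapes("a-b\\-c")));
        // prints [a, -, b-c]
    }
}
```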
My problem is one of understanding.
The way I understood it, the regex would be, in words,
split ((if '-' is before (unless ('\-' is before))) or (if '-' is after (unless ('\-' is before))))
Why shouldn't the last part be "unless '\' is before"? If '-' is after, that means we're between '\' and '-', so only '\' should be before, not '\-'. But it doesn't work if I change the regex to reflect that, like this:
((?<=-(?!(?<=\\-)))|(?=-(?!(?<=\\))))
which results in:
a
-
b\
-c
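To see exactly where the two variants differ, one option is to list the indices at which each zero-width pattern matches. The sketch below does that with `java.util.regex.Matcher`; the class name `SplitPositions` and the method `matchPositions` are hypothetical names for this illustration. It only shows where the splits happen, not why.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitPositions {
    // Hypothetical helper: returns every index at which the (zero-width)
    // pattern matches in the input, i.e. every position split() would cut at.
    static List<Integer> matchPositions(String regex, String input) {
        List<Integer> positions = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            positions.add(m.start());
        }
        return positions;
    }

    public static void main(String[] args) {
        String input = "a-b\\-c"; // the text a-b\-c, indices: a=0 -=1 b=2 \=3 -=4 c=5

        String original = "((?<=-(?!(?<=\\\\-)))|(?=-(?!(?<=\\\\-))))";
        String modified = "((?<=-(?!(?<=\\\\-)))|(?=-(?!(?<=\\\\))))";

        System.out.println(matchPositions(original, input)); // prints [1, 2]
        System.out.println(matchPositions(modified, input)); // prints [1, 2, 4]
    }
}
```

The only difference is the extra match at index 4, the position between '\' and the escaped '-', which is where the unwanted split into b\ and -c comes from.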
What is the reason for this? Where is my error in reasoning?