This is a fairly basic question, as I'm not very familiar with regex, but I just cannot find an answer online (probably because I don't know what to search for).

I want to write a function that finds all commas that are:

  • not between a "]" and a " [" (like in [abc], [def])
  • not followed by a "+" (like in abc,+def)

I figured out that the individual regexes for the two cases are

(?!\\])(\\,)(?!\\s\\[)

and

(\\,)(?!\\+)

(correct me if I'm wrong)

But how do I combine the two into a single pattern, so that my function identifies all commas that satisfy both conditions? I'm having some difficulty wrapping my head around it, partly because both are negative conditions. If it makes any difference, I'm using R.

Thank you!

2 Answers

1

You may use a PCRE regex with the base R regex functions (pass perl=TRUE):

][^[]*\[(*SKIP)(*F)|,(?!\+)

See the regex demo.

Details

  • ][^[]*\[(*SKIP)(*F) - match a ], then 0 or more characters other than [, then a [; (*SKIP)(*F) then discards that match and resumes the search after it (this implements the not between a "]" and a " [" rule)
  • | - or match
  • ,(?!\+) - a comma that is not immediately followed by a literal + sign

R online demo:

x <- "[abc], [def] abc,+def abc,def"
reg <- "][^[]*\\[(*SKIP)(*F)|,(?!\\+)"
strsplit(x, reg, perl=TRUE)
## [[1]]
## [1] "[abc], [def] abc,+def abc" "def"
gsub(reg, "@", x, perl=TRUE)
## [1] "[abc], [def] abc,+def abc@def"
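
Since the question asks for a function that finds the commas rather than replacing them, the same pattern can be passed to gregexpr() to get their positions. This is only a minimal sketch: the helper name find_commas is hypothetical and not part of the answer above.

```r
# Hypothetical helper (name is an assumption): returns the 1-based
# positions of the commas accepted by the pattern from the answer.
find_commas <- function(x) {
  reg <- "][^[]*\\[(*SKIP)(*F)|,(?!\\+)"
  # gregexpr() gives the start position of every match in x
  m <- gregexpr(reg, x, perl = TRUE)[[1]]
  pos <- as.integer(m)
  # gregexpr() reports -1 when nothing matches; return integer(0) instead
  pos[pos > 0]
}

find_commas("[abc], [def] abc,+def abc,def")
## [1] 26
```

Only the comma in "abc,def" is reported; the one in "], [" is skipped and the one before "+" is rejected by the lookahead.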
Answered 2019-03-02T20:53:03.713
0

So you want commas that are:

  • Not preceded by a ]
  • Not followed by either a + or a [

For the second, you can use a negative lookahead, (?!...). For the first, you want its counterpart looking in the other direction, a negative lookbehind, (?<!...).

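This should do it. The lookbehind goes before the comma so that it tests the character preceding it; in R, lookarounds require perl=TRUE:

(?<!\]),(?!\+|\[)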
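
The position of the lookbehind relative to the comma matters. As a rough sketch in R (the variable names are just for illustration, and the test string is borrowed from the other answer), the two placements can be compared like this:

```r
x <- "[abc], [def] abc,+def abc,def"

# Lookbehind before the comma: tests the character preceding the comma
before <- "(?<!\\]),(?!\\+|\\[)"
# Lookbehind after the comma: tests the comma itself, so it never rejects
after <- ",(?<!\\])(?!\\+|\\[)"

res_before <- gsub(before, "@", x, perl = TRUE)
res_before
## [1] "[abc], [def] abc,+def abc@def"

res_after <- gsub(after, "@", x, perl = TRUE)
res_after
## [1] "[abc]@ [def] abc,+def abc@def"
```

Note that the lookaround version rejects every comma preceded by ], whether or not a [ follows, which is stricter than a literal reading of "between a ] and a [".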
Answered 2019-03-02T19:14:19.027