-5

I am trying to understand the regular expression language, but it is so difficult.

I have read some tutorials, but I don't really get it.

I have this regex:

N\b*:\b*[^:]*

Can someone please tell me what this regular expression means?

Thank you very much!

4

3 Answers

2

Breaking the regex down to its components we have:

  • N
  • \b*
  • :
  • \b*
  • [^:]*

N and : are just literals. Nothing to say about those.

\b is the word boundary assertion (an anchor, not a character class). It matches at a position where a word character (a letter, digit or underscore) sits on one side and a non-word character, or the start or end of the string, sits on the other. In other words, it matches at the beginning or end of a word. This is a bit weird because it matches ("consumes") no characters. In the string "foo bar" there are 4 word boundaries (before f, after the second o, before b, after r).
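To see those four boundaries concretely, here is a small sketch. Python's re module is my choice for illustration; the question does not say which regex engine is in use. It lists every position where \b matches in "foo bar". Each match is empty, which is what "consumes no characters" means.

```python
import re

text = "foo bar"

# Each match of \b is zero-width: its start and end are the same index.
boundaries = [m.start() for m in re.finditer(r"\b", text)]
print(boundaries)  # [0, 3, 4, 7]

# A zero-width match has an empty span and an empty matched string.
first = re.search(r"\b", text)
print(first.span(), repr(first.group()))  # (0, 0) ''
```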

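A * means that the previous element can be repeated any number of times (0, 1, 2 or more). Here that means you're accepting any number of consecutive word boundaries. Because \b consumes no characters and zero repetitions are allowed, \b* can always succeed without matching anything, so it places no restriction on the match.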

Finally, the brackets [ ] define a character class. Inside this class there is ^:. A ^ at the start of a class negates it. For example, the class [a] matches the character a, but [^a] matches any single character except a. So the class [^:] matches any single character except :. The * that follows again means this class can be matched any number of times.
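A short sketch of the plain and negated classes, again assuming Python's re module (the class syntax shown here is the same in most engines):

```python
import re

# [a] matches only the character a; [^a] matches any single character except a.
only_a = re.findall(r"[a]", "banana")
not_a = re.findall(r"[^a]", "banana")
print(only_a)  # ['a', 'a', 'a']
print(not_a)   # ['b', 'n', 'n']

# [^:]* takes as many non-colon characters as it can, and may take none at all.
before_colon = re.match(r"[^:]*", "foobar:baz").group()
nothing = re.match(r"[^:]*", ":baz").group()
print(repr(before_colon))  # 'foobar'
print(repr(nothing))       # ''
```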

So putting everything together here's what the regex means:

  • match the letter N
  • match any number of word boundaries
  • match the character :
  • match any number of word boundaries
  • match any number of characters except :.

Here are a few examples:

  • N: - matches, it's the simplest match
  • N - doesn't match, there's no :
  • N:foobar - matches
  • N:foobar:baz - doesn't match as a whole, because the second : is not allowed. An unanchored search would still match the N:foobar part.
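The examples above can be checked with a small sketch. Two assumptions here: I use Python's re module, and I leave out the \b* parts. As far as I know, Python refuses to compile a quantifier applied directly to \b (it reports "nothing to repeat"), and since \b* may repeat zero times and consumes nothing, dropping it should not change what matches.

```python
import re

# Assumption: the \b* parts are omitted. They may repeat zero times and
# consume nothing, so they should not change what the pattern matches.
pattern = re.compile(r"N:[^:]*")

samples = ["N:", "N", "N:foobar", "N:foobar:baz"]

results = {}
for s in samples:
    whole = pattern.fullmatch(s) is not None  # the entire string must match
    found = pattern.search(s)                 # a match anywhere is enough
    part = found.group() if found else None
    results[s] = (whole, part)
    print(f"{s!r}: whole-string match={whole}, search finds {part!r}")
```

With fullmatch, N:foobar:baz is rejected, as the list says. With search, the engine still reports N:foobar and simply stops before the second colon.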

This whole word-boundary business is not very intuitive, and without context it isn't clear what was intended. Putting \b* around the : doesn't make much sense, and some regex engines refuse to apply a quantifier to \b at all. But at least you should now be able to understand the regex better.

Answered 2013-07-10T09:43:58.440
1

I'd recommend using debuggex.com

N\b*:\b*[^:]*

[Image: Debuggex diagram of the regular expression]

Where

  • N, : are literals
  • \b represents a word start/end


Hint: once you are a little familiar with the basics, try this game:

Put a finger of your left hand on the black dot and a finger of your right hand on the first character of the string to be matched, then try to reach the white dot with your left finger.

The rules are:

  • you can only pass through a rectangle if the character your right finger is currently on is matched by it
  • once you go through a rectangle you have to advance your right-hand finger by one character
  • you are not allowed to go backwards (with either hand)
    • the exceptions are the loops (the ones below the line joining the black and white dots)
  • if you reach the white dot you have a match
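
The game can also be sketched in code. This is only an illustration under my own assumptions: walk_diagram is a hypothetical function, it hard-codes the diagram for the simplified pattern N:[^:]* (the zero-width \b* steps are left out because they never move the right-hand finger), and it always starts at the first character of the string it is given.

```python
def walk_diagram(text):
    """Hypothetical walk through the diagram for N:[^:]* from the start of text.

    Returns the matched prefix, or None if the white dot cannot be reached.
    """
    pos = 0  # the "right-hand finger": index of the current character

    # Rectangle 1: the literal N.
    if pos >= len(text) or text[pos] != "N":
        return None
    pos += 1  # passing a rectangle advances the right-hand finger

    # Rectangle 2: the literal colon.
    if pos >= len(text) or text[pos] != ":":
        return None
    pos += 1

    # Loop: the [^:] rectangle may be passed any number of times.
    while pos < len(text) and text[pos] != ":":
        pos += 1

    # White dot reached: everything before pos was matched.
    return text[:pos]


print(walk_diagram("N:foobar:baz"))  # N:foobar
print(walk_diagram("N"))             # None
```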
Answered 2013-07-10T10:09:35.637
0

Some context would help in describing what the regex is for, but here is this particular regular expression broken down:

N - The letter 'N'.
\b* - Zero or more word boundaries (a word boundary is the position at the start or end of a word).
: - A colon.
\b* - Zero or more word boundaries again.
[^:]* - A series of characters that runs until a : or the end of the input is reached (newlines are matched too).

In the string

LMN:  Testing:123

this would match

N:  Testing
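
A quick way to check this, sketched with Python's re module (my choice of engine, not something stated in the question). The \b* parts are left out on the assumption that they never affect the result, since they can match zero times and consume nothing.

```python
import re

# Assumption: \b* is left out (it can match zero times and consumes nothing).
pattern = re.compile(r"N:[^:]*")

match = pattern.search("LMN:  Testing:123")
print(repr(match.group()))  # 'N:  Testing'
print(match.span())         # (2, 13)

# [^:] also matches a newline, so the match can run past the end of a line.
multi = pattern.search("N: one\ntwo:three")
print(repr(multi.group()))  # 'N: one\ntwo'
```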
Answered 2013-07-10T09:37:09.880