I am having problems understanding the following regex:

regexp="(?P<date>\d{4}-\d{2}-\d{2}-\d{2}:\d{2}:\d{2})\S+\s(?P<proto>\w+)\S+\s(?P<sid>\S)\s+(? P<sip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(\s+(?P<sport>\d+))?\s+(?P<dip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(:)?\s(?P<dport>\d+)((:)?\s+(?P<info>\S+\s\S+)\s+\[(?P<comment>.*)\])?"

0 or 1
date={normalize_date($date)}
plugin_sid={translate($sid)}
src_ip={$sip}
src_port={$sport}
dst_ip={$dip}
dst_port={$dport}
protocol={$proto}
userdata1={$info}
userdata2={$comment}

What does the ?P stand for? Could someone help me make sense of this monster by spelling out the logic?

1 Answer

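(?P<name>...) is a named capturing group: whatever it matches can be looked up by name (date, proto, sid, and so on in your regex) as well as by its group number. The syntax originates from Python's re module.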
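To make the idea concrete, here is a minimal sketch in Python. The shortened pattern and the input string are invented for illustration and are not taken from your logs.

```python
import re

# Shortened, hypothetical pattern: two named groups separated by whitespace.
pattern = re.compile(r"(?P<date>\d{4}-\d{2}-\d{2})\s+(?P<proto>\w+)")

# Invented input, used only to show how named groups are read back.
m = pattern.search("2013-09-25 TCP")

print(m.group("date"))   # look the group up by name
print(m.group(1))        # the same group, looked up by number
print(m.groupdict())     # all named groups as a dict
```

A named group is still a numbered group, so `m.group("date")` and `m.group(1)` return the same text.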

Also, (? P<sip> is most likely a typo. I don't think a space is allowed between the ? and the P, so it should read (?P<sip>.
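A quick way to check this, assuming the regex is being run by Python's re module (the question does not say which engine is used), is to try compiling both spellings. The helper name `compiles` is made up for this sketch.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# With the stray space, as written in the question.
print(compiles(r"(? P<sip>\d{1,3})"))   # False
# With the space removed.
print(compiles(r"(?P<sip>\d{1,3})"))    # True
```

Other regex engines may react differently to the stray space, so this only confirms the behaviour for Python.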

If you have any other questions, this is a useful resource for explaining regexes, although it does not understand the (?P<name>...) syntax.

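Here is an explanation of your regex with the named groups turned into plain numbered groups (link). To read it, replace "group and capture to \1" with "group and capture to 'date'" for the first one, and so on: \1 is date, \2 is proto, \3 is sid, \4 is sip, \6 is sport, \7 is dip, \9 is dport, \12 is info and \13 is comment. Groups \5, \8, \10 and \11 are the unnamed parentheses in your regex.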

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    \d{4}                    digits (0-9) (4 times)
--------------------------------------------------------------------------------
    -                        '-'
--------------------------------------------------------------------------------
    \d{2}                    digits (0-9) (2 times)
--------------------------------------------------------------------------------
    -                        '-'
--------------------------------------------------------------------------------
    \d{2}                    digits (0-9) (2 times)
--------------------------------------------------------------------------------
    -                        '-'
--------------------------------------------------------------------------------
    \d{2}                    digits (0-9) (2 times)
--------------------------------------------------------------------------------
    :                        ':'
--------------------------------------------------------------------------------
    \d{2}                    digits (0-9) (2 times)
--------------------------------------------------------------------------------
    :                        ':'
--------------------------------------------------------------------------------
    \d{2}                    digits (0-9) (2 times)
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \S+                      non-whitespace (all but \n, \r, \t, \f,
                           and " ") (1 or more times (matching the
                           most amount possible))
--------------------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
--------------------------------------------------------------------------------
  (                        group and capture to \2:
--------------------------------------------------------------------------------
    \w+                      word characters (a-z, A-Z, 0-9, _) (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \2
--------------------------------------------------------------------------------
  \S+                      non-whitespace (all but \n, \r, \t, \f,
                           and " ") (1 or more times (matching the
                           most amount possible))
--------------------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
--------------------------------------------------------------------------------
  (                        group and capture to \3:
--------------------------------------------------------------------------------
    \S                       non-whitespace (all but \n, \r, \t, \f,
                             and " ")
--------------------------------------------------------------------------------
  )                        end of \3
--------------------------------------------------------------------------------
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (                        group and capture to \4:
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \4
--------------------------------------------------------------------------------
  (                        group and capture to \5 (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    (                        group and capture to \6:
--------------------------------------------------------------------------------
      \d+                      digits (0-9) (1 or more times
                               (matching the most amount possible))
--------------------------------------------------------------------------------
    )                        end of \6
--------------------------------------------------------------------------------
  )?                       end of \5 (NOTE: because you are using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \5)
--------------------------------------------------------------------------------
  \s+                      whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (                        group and capture to \7:
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
    \d{1,3}                  digits (0-9) (between 1 and 3 times
                             (matching the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \7
--------------------------------------------------------------------------------
  (                        group and capture to \8 (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    :                        ':'
--------------------------------------------------------------------------------
  )?                       end of \8 (NOTE: because you are using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \8)
--------------------------------------------------------------------------------
  \s                       whitespace (\n, \r, \t, \f, and " ")
--------------------------------------------------------------------------------
  (                        group and capture to \9:
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \9
--------------------------------------------------------------------------------
  (                        group and capture to \10 (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    (                        group and capture to \11 (optional
                             (matching the most amount possible)):
--------------------------------------------------------------------------------
      :                        ':'
--------------------------------------------------------------------------------
    )?                       end of \11 (NOTE: because you are using
                             a quantifier on this capture, only the
                             LAST repetition of the captured pattern
                             will be stored in \11)
--------------------------------------------------------------------------------
    \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    (                        group and capture to \12:
--------------------------------------------------------------------------------
      \S+                      non-whitespace (all but \n, \r, \t,
                               \f, and " ") (1 or more times
                               (matching the most amount possible))
--------------------------------------------------------------------------------
      \s                       whitespace (\n, \r, \t, \f, and " ")
--------------------------------------------------------------------------------
      \S+                      non-whitespace (all but \n, \r, \t,
                               \f, and " ") (1 or more times
                               (matching the most amount possible))
--------------------------------------------------------------------------------
    )                        end of \12
--------------------------------------------------------------------------------
    \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    \[                       '['
--------------------------------------------------------------------------------
    (                        group and capture to \13:
--------------------------------------------------------------------------------
      .*                       any character except \n (0 or more
                               times (matching the most amount
                               possible))
--------------------------------------------------------------------------------
    )                        end of \13
--------------------------------------------------------------------------------
    \]                       ']'
--------------------------------------------------------------------------------
  )?                       end of \10 (NOTE: because you are using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \10)
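Putting it together, the sketch below runs the whole regex, with the stray space in (? P<sip> removed, against a log line. The log line is hypothetical: it was invented to fit the pattern and is not real output from your device, so the values are placeholders.

```python
import re

# The question's pattern, split across lines for readability,
# with "(? P<sip>" corrected to "(?P<sip>".
LOG_PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}-\d{2}:\d{2}:\d{2})\S+\s"
    r"(?P<proto>\w+)\S+\s"
    r"(?P<sid>\S)\s+"
    r"(?P<sip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
    r"(\s+(?P<sport>\d+))?\s+"
    r"(?P<dip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(:)?\s"
    r"(?P<dport>\d+)"
    r"((:)?\s+(?P<info>\S+\s\S+)\s+\[(?P<comment>.*)\])?"
)

# Hypothetical log line, invented for this example.
sample = (
    "2013-09-25-00:45:05.000 TCP: A 10.0.0.1 1234 "
    "192.168.1.5: 80: SYN flood [possible scan]"
)

match = LOG_PATTERN.match(sample)
fields = match.groupdict()
for name, value in fields.items():
    print(f"{name:8} {value}")

# groupindex maps each name to its group number, which ties the
# names back to the \1, \2, ... numbering used in the table.
print(dict(LOG_PATTERN.groupindex))
```

Two details worth noticing: sid is a single \S, so it captures exactly one character, and the \S+ after proto needs at least one non-whitespace character following the word, which is why the sample line uses "TCP:" rather than "TCP".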
answered 2013-09-25T00:45:05.000