
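Could you explain, please, how I can write a regex that will match (arg1), (arg1, arg2), (arg1, arg2, xarg, zarg), etc.? Every name is an ASCII string that always starts with a letter [A-Za-z]. Here is what I've tried: "("[A-Za-z][a-z0-9]*(,)?([A-Za-z][a-z0-9]*)?")". Thanks!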

Note: the regex must work in flex.


2 Answers


Something like this?

>>> import re
>>> s = '''Could you explain, please, how can I make regex that will match (arg1), (arg1, arg2), (arg1, arg2, xarg, zarg), etc. Every name is an ASCII string which always starts with symbol [A-Za-z]. Here is what I've tried: "("[A-Za-z][a-z0-9]*(,)?([A-Za-z][a-z0-9]*)?")". Thanks!'''
>>> re.findall(r'\([A-Za-z]?arg[0-9]?(?:, [A-Za-z]?arg[0-9]?)*\)', s)
['(arg1)', '(arg1, arg2)', '(arg1, arg2, xarg, zarg)']
answered 2012-10-21T23:39:34.523

I'm not sure flex is the right tool, since you would normally use it to split input like this into separate tokens. However, it's certainly possible:

"("[[:alpha:]][[:alnum:]]*(,[[:alpha:]][[:alnum:]]*)*")"

That will match (arg1) and (arg1,arg2), but not ( arg1 ) or (arg1, arg2). If you want to ignore whitespace everywhere, it gets a bit wordier.
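To see which inputs the strict pattern accepts, here is a rough Python sketch of the same idea. It is not flex: Python's `re` module has no POSIX classes such as `[[:alpha:]]`, so the ASCII ranges are spelled out, which should be equivalent in the C locale. The helper name `is_strict_list` is made up for illustration.

```python
import re

# Assumed translation of the flex pattern above: [[:alpha:]] -> [A-Za-z],
# [[:alnum:]] -> [A-Za-z0-9]. No whitespace is allowed anywhere.
STRICT = re.compile(r"\([A-Za-z][A-Za-z0-9]*(?:,[A-Za-z][A-Za-z0-9]*)*\)")


def is_strict_list(text):
    """Return True if the whole string is a list like (a,b,c) with no spaces."""
    return STRICT.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["(arg1)", "(arg1,arg2)", "( arg1 )", "(arg1, arg2)"]:
        print(repr(sample), is_strict_list(sample))
```

The first two samples are accepted and the last two are rejected, because the pattern has no place for a space or tab.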

This sort of thing is more readable if you use lex definitions:

ID      [[:alpha:]][[:alnum:]]*

%%

"("{ID}(","{ID})*")"

Or, with whitespace matching:

/* Make sure you're in the C locale when you compile. Or adjust
 * the definition accordingly. Perhaps you wanted to allow other 
 * characters in IDs.
 */
ID      [[:alpha:]][[:alnum:]]*
/* OWS = Optional White Space.*/
/* Flex defines blank as "space or tab" */
OWS     [[:blank:]]*
COMMA   {OWS}","{OWS}
OPEN    "("{OWS}
CLOSE   {OWS}")"

%%

{OPEN}{ID}({COMMA}{ID})*{CLOSE}  { /* Got a parenthesized list of ids */ }
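The same build-it-from-named-pieces approach can be tried outside flex. The following is only a sketch in Python, assuming the C locale, with `[[:blank:]]` written as `[ \t]`; the fragment names mirror the definitions above, and `is_id_list` is a hypothetical helper.

```python
import re

# Fragments mirroring the lex definitions (ASCII ranges assumed).
ID = r"[A-Za-z][A-Za-z0-9]*"
OWS = r"[ \t]*"              # optional white space: spaces or tabs
COMMA = OWS + "," + OWS
OPEN = r"\(" + OWS
CLOSE = OWS + r"\)"

ID_LIST = re.compile(f"{OPEN}{ID}(?:{COMMA}{ID})*{CLOSE}")


def is_id_list(text):
    """Return True if text is a parenthesized, non-empty list of ids."""
    return ID_LIST.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["( arg1 )", "(arg1, arg2)", "(arg1 ,\targ2)", "(arg1,)", "()"]:
        print(repr(sample), is_id_list(sample))
```

Keeping each fragment in its own variable makes it easy to change one of them, for example to allow underscores in ids, without touching the overall rule.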

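A final note: this also doesn't match (); there must be at least one id. If you want to include that too, you can make the part between the parentheses optional: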

{OPEN}({ID}({COMMA}{ID})*)?{CLOSE}  { /* Got a parenthesized        */
                                      /* possibly empty list of ids */ }
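As a quick way to experiment with the optional-list variant, the sketch below checks it in Python and also pulls out the individual names, which is roughly what a rule action might want to do. It assumes ASCII ids and spaces or tabs as whitespace; `parse_id_list` is an illustrative name, not part of flex.

```python
import re

ID = r"[A-Za-z][A-Za-z0-9]*"
OWS = r"[ \t]*"

# The whole id list is wrapped in an optional group, so "()" is accepted.
MAYBE_EMPTY = re.compile(rf"\({OWS}(?:{ID}(?:{OWS},{OWS}{ID})*)?{OWS}\)")


def parse_id_list(text):
    """Return the list of ids in text, or None if text is not a valid list."""
    if MAYBE_EMPTY.fullmatch(text) is None:
        return None
    # The string is already validated, so every ID match is one list element.
    return re.findall(ID, text)


if __name__ == "__main__":
    for sample in ["()", "( )", "(arg1, arg2, xarg, zarg)", "(arg1,,arg2)"]:
        print(repr(sample), parse_id_list(sample))
```

An empty list comes back as `[]`, while malformed input such as a doubled comma comes back as `None`, so callers can tell the two cases apart.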
answered 2012-10-22T03:00:17.803