I am trying to write a regular expression to determine whether the name of a macro declaration in C is uppercase:
#define MY_MACRO
To detect only the uppercase word (ignoring special characters such as backspace or hyphens), I use the following regular expression:
"#define +[^A-Z]+"
It works if my macro is entirely uppercase, but it fails if the macro looks like this:
#define Mymacro
What regular expression would be able to detect this case?
To detect #define MixedCase but not match #define ALLUPPERCASE, you need a negative lookahead assertion:
r'#define\s+(?![A-Z_]+\b)[A-Za-z_]+\b'
\b matches a word boundary: the place where a word ends, for example because whitespace follows it or because the line ends there.
The (?!...) negative lookahead assertion checks that the next word is not all uppercase before allowing a match on a mixed-case word.
Note that I've also included the _ underscore in both character classes.
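As a quick sanity check, here is a minimal sketch that runs the pattern above against a few lines. The sample lines and the name MIXED_CASE_DEFINE are made up for illustration and are not from the question.

```python
import re

# Pattern from above: match a #define whose name is NOT made up
# solely of uppercase letters and underscores.
MIXED_CASE_DEFINE = re.compile(r'#define\s+(?![A-Z_]+\b)[A-Za-z_]+\b')

# Hypothetical sample lines, not taken from any real header.
samples = [
    "#define MY_MACRO",
    "#define Mymacro",
    "#define myMacro 1",
    "#define ALLUPPERCASE 42",
]

# Keep only the lines whose macro name is flagged as mixed case.
flagged = [line for line in samples if MIXED_CASE_DEFINE.search(line)]
print(flagged)
```

Only the Mymacro and myMacro lines should end up in flagged; the two all-uppercase names are rejected by the lookahead.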
You may want to allow digits in your macro names as well; they are legal, after all:
r'#define\s+(?![A-Z0-9_]+\b)\w+\b'
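The second character class can then be simplified to \w, which is the same as [A-Za-z0-9_] for ASCII text (in Python 3, \w also matches non-ASCII letters and digits unless you pass the re.ASCII flag).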
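A small sketch of the digit-aware variant, under the assumption that you are on Python 3 and want \w restricted to ASCII; the re.ASCII flag, the sample lines, and the name DEFINE_WITH_DIGITS are my additions for illustration.

```python
import re

# Variant from above that also accepts digits in the macro name.
# re.ASCII is an assumption on my part: it keeps \w and \b limited
# to [A-Za-z0-9_] when matching str patterns in Python 3.
DEFINE_WITH_DIGITS = re.compile(r'#define\s+(?![A-Z0-9_]+\b)\w+\b', re.ASCII)

# Hypothetical sample lines.
digit_samples = [
    "#define BUFFER_SIZE_2 1024",
    "#define Buffer2 1024",
    "#define utf8_max 4",
]

# Lines whose macro name contains at least one lowercase letter.
flagged_digits = [line for line in digit_samples if DEFINE_WITH_DIGITS.search(line)]
print(flagged_digits)
```

BUFFER_SIZE_2 passes as all uppercase because the lookahead now tolerates digits, while Buffer2 and utf8_max are flagged.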
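Does the regular expression have to do everything? You could match all the #defines with a regular expression and then test the case of the macro name with some very simple Python code: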
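import re

macro_defn = re.compile(r'#define\s+(\w+)')
for line in code_source:  # code_source: any iterable of source lines
    macro_match = macro_defn.match(line)
    if macro_match:
        macro_name = macro_match.group(1)
        # Report macros whose names are not entirely uppercase
        if macro_name.upper() != macro_name:
            print(line)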
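If you would rather collect the offending names than print lines, the same idea can be wrapped in a function. This is only a sketch: the function name non_uppercase_macros and the sample header lines are hypothetical.

```python
import re

MACRO_DEFN = re.compile(r'#define\s+(\w+)')

def non_uppercase_macros(lines):
    """Return the macro names in `lines` that are not entirely uppercase."""
    names = []
    for line in lines:
        macro_match = MACRO_DEFN.match(line)
        if not macro_match:
            continue
        macro_name = macro_match.group(1)
        # str.upper() leaves digits and underscores unchanged, so only
        # lowercase letters make this comparison fail.
        if macro_name.upper() != macro_name:
            names.append(macro_name)
    return names

# Hypothetical input; in practice this could be an open file object.
header = [
    "#define MY_MACRO 1",
    "#define Mymacro 2",
    "int x = 0;",
    "#define MAX_2 3",
]
print(non_uppercase_macros(header))
```

Because match() anchors at the start of each line, a #define preceded by whitespace is skipped here, exactly as in the loop above.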