python - Regex to check whether a string is a valid variable name

Question

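I'm writing a script that checks whether a string is a valid variable name. It's very basic, but I can't figure out how to do it with a regular expression.

So basically I want: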

A-Z
a-z
0-9
no whitespace anywhere
no special char except _

Is that possible? Here is what I tried:

re.match("[a-zA-Z0-9_,/S]*$", char_s):

score 4 · Accepted Answer

A pattern like this should work:

^[a-zA-Z_][a-zA-Z0-9_]*$

Or, more simply:

^(?!\d)\w+$

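In both cases, it matches a string made up of one or more letters, digits, or underscores, as long as it does not start with a digit.

In the second pattern, (?!…) is a negative lookahead assertion. It ensures that the first character is not a digit. More information can be found in the manual.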
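A minimal sketch of how the two patterns above behave, assuming Python 3. The sample strings and the helper name `matches_both` are made up here for illustration. `re.ASCII` is passed to the second pattern as an assumption, so that `\w` and `\d` cover only the A-Z, a-z, 0-9 and _ characters the question asks for:

```python
import re

# The two patterns from the answer. re.ASCII restricts \w and \d to ASCII
# (an assumption made here to match the requirements in the question).
EXPLICIT = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*$")
LOOKAHEAD = re.compile(r"^(?!\d)\w+$", re.ASCII)

def matches_both(s):
    """Return (explicit_result, lookahead_result) as a pair of booleans."""
    return (EXPLICIT.match(s) is not None, LOOKAHEAD.match(s) is not None)

# Illustrative inputs: two valid names, then four invalid ones.
for sample in ["foo", "_bar1", "1abc", "has space", "a-b", ""]:
    print(repr(sample), matches_both(sample))
```

One caveat: `$` also matches just before a trailing newline, so `"foo\n"` is accepted by both patterns. If that matters, `re.fullmatch` or `\Z` in place of `$` is stricter.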

score 3 · Accepted Answer

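In addition to the regular expressions already mentioned, you also need to make sure the name is not one of the reserved keywords: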

and       del       from      not       while    
as        elif      global    or        with     
assert    else      if        pass      yield    
break     except    import    print              
class     exec      in        raise              
continue  finally   is        return             
def       for       lambda    try

So, something like this:

import re

reserved = ["and", "del", "from", "not", "while", "as", "elif", "global", "or", "with", "assert", "else", "if", "pass", "yield", "break", "except", "import", "print", "class", "exec", "in", "raise", "continue", "finally", "is", "return", "def", "for", "lambda", "try"]

def is_valid(keyword):
    # True only if the name is not reserved and looks like an identifier
    return (keyword not in reserved and
            re.match(r"^(?!\d)\w+$", keyword) is not None)  # pattern from p.s.w.g's answer

Or, as @nofinator suggested, you can and should just use keyword.iskeyword().
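A short sketch of the keyword.iskeyword() variant, assuming Python 3. The function name `is_valid_name` and the use of `re.ASCII` are choices made for this example, not part of the answer:

```python
import keyword
import re

# Identifier shape check; re.ASCII keeps \w and \d to ASCII characters only.
NAME_RE = re.compile(r"^(?!\d)\w+$", re.ASCII)

def is_valid_name(s):
    """True if s has identifier shape and is not a keyword of the running Python."""
    return NAME_RE.match(s) is not None and not keyword.iskeyword(s)

print(is_valid_name("form"))   # identifier-shaped, not a keyword
print(is_valid_name("for"))    # rejected: keyword
print(is_valid_name("1x"))     # rejected: starts with a digit
```

Because keyword.iskeyword() reflects the interpreter it runs on, there is no hand-maintained list to keep in sync. On Python 3, for example, `print` and `exec` are no longer keywords, so they pass this check.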

score 1 · Accepted Answer

re.match(r"^[^\W\d]\w*$", char_s):

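The word character class \w is equivalent to [a-zA-Z0-9_] under ASCII matching (for Python 3 str patterns it also matches Unicode word characters unless re.ASCII is set). Identifiers cannot start with a digit, so [^\W\d] matches the first character and \w* matches the rest.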
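The double negation in [^\W\d] can be hard to read: it means "not a non-word character and not a digit", which leaves letters and the underscore. A small sketch, assuming Python 3 and ASCII matching, that checks this against every character in `string.printable`:

```python
import re
import string

# Character class used for the first character of the identifier.
FIRST_CHAR = re.compile(r"[^\W\d]", re.ASCII)

# Collect every printable ASCII character the class accepts.
allowed = "".join(c for c in string.printable if FIRST_CHAR.fullmatch(c))

# The accepted set is exactly the ASCII letters plus the underscore.
print(set(allowed) == set(string.ascii_letters + "_"))
```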

score 1 · Accepted Answer

The right way:

Python 2

import re
import keyword
import tokenize

re.match(tokenize.Name+"$", char_s) and not keyword.iskeyword(char_s)

Python 3

import keyword

char_s.isidentifier() and not keyword.iskeyword(char_s)

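Note that the Python 2 approach fails silently on Python 3.

When you see this kind of question, the first thing to ask is "How does Python do it?", because almost always Python exposes a method for it to the user.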
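A minimal sketch that wraps the Python 3 approach in a function. The name `is_valid_variable` and the `ascii_only` flag are hypothetical additions for this example. They are included because str.isidentifier() also accepts non-ASCII identifiers, while the question asks for A-Z, a-z, 0-9 and _ only. str.isascii() requires Python 3.7 or later:

```python
import keyword

def is_valid_variable(s, ascii_only=False):
    """True if s is a legal, non-keyword identifier on the running Python 3.

    ascii_only is a hypothetical option for this sketch: when set, names
    containing non-ASCII characters are rejected as well.
    """
    if ascii_only and not s.isascii():
        return False
    return s.isidentifier() and not keyword.iskeyword(s)

print(is_valid_variable("my_var1"))                # valid name
print(is_valid_variable("class"))                  # keyword
print(is_valid_variable("变量"))                    # Unicode identifier
print(is_valid_variable("变量", ascii_only=True))   # rejected when ASCII-only
```

Unlike the `$`-anchored regular expressions, str.isidentifier() also rejects a name with a trailing newline.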
