python - Finding a specific string pattern in Python

Question

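I want to parse a string in Python whose format is close to

“JXE 2000 This is a bug to fix blah” or the format

“JXE-2000: This is a bug to fix blah” and check whether the string contains JXE and a number.

In the example above, I need to check whether the string contains JXE and 2000. I am new to Python.

I tried the following: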

textpattern="JXE-5000: This is bug "
text=re.compile("^([A-Z][0-9]+)*$")

text=re.search("JXE (.*)", textpattern)

print (text.groups())

I only seem to get “5000 This is bug”.

score 1 · Accepted Answer

As another option, you can allow any characters between JXE and 2000:

>>> import re
>>> text=re.compile("(JXE).*(2000)(.*)")
>>> textpattern="JXE-2000: This is bug "
>>> text.search(textpattern).group(1,2) # or .group(1,2,3) if you want the bug as well
('JXE', '2000')

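Your text=re.compile("^([A-Z][0-9]+)*$") looks for a group made of one (ASCII) uppercase letter followed by one or more digits, with that group repeated zero or more times and spanning the whole string, so it cannot match "JXE-5000: This is bug ". re.compile compiles the pattern you are after so that you can reuse it later in the script without restating it, which can make your code faster when the pattern is used many times. If you choose to use re.compile (and you don't actually need to here), you have to give it the pattern you are looking for (in this case, 'JXE' followed by '2000'). With re.compile, you search for the pattern in this form: compiled_pattern.search(string), which in your case is text.search(textpattern).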
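To make the difference concrete, here is a small sketch comparing the pattern from the question with a compiled pattern that targets the JXE prefix. The names `strict` and `ticket` and the pattern `JXE[- ](\d+)` are illustrative assumptions, not something the question prescribes.

```python
import re

# Pattern from the question: the whole string must be built from
# repeated "one uppercase letter + one or more digits" units.
strict = re.compile(r"^([A-Z][0-9]+)*$")

print(strict.search("A1B22") is not None)                   # shaped like A1, B22
print(strict.search("JXE-5000: This is bug ") is not None)  # not shaped like that

# Illustrative compiled pattern: "JXE", a dash or space, then digits.
# Compiling once lets the same object be reused for every line.
ticket = re.compile(r"JXE[- ](\d+)")

for line in ["JXE 2000 This is a bug to fix blah",
             "JXE-2000: This is a bug to fix blah"]:
    m = ticket.search(line)
    print(m.group(1) if m else None)
```

For a one-off check, re.search(pattern, string) does the same job without the explicit compile step.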

score 0 · Accepted Answer

It depends on what you want to capture:

>>> s
['JXE 2000 This is a bug to fix blah',
 'JXE-2000: This is a bug to fix blah',
 'JXE-2000 Blah']
>>> re.search(r'JXE[-|\s+]\d+(.+)',s[0]).groups()
(' This is a bug to fix blah',)
>>> re.search(r'JXE[-|\s+]\d+(.+)',s[1]).groups()
(': This is a bug to fix blah',)
>>> re.search(r'JXE[-|\s+]\d+(.+)',s[2]).groups()
(' Blah',)

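Here is what this pattern matches:

JXE - the character J, followed by X, then E
[-|\s+] - a single character from the set: a dash -, a pipe |, a whitespace character, or a plus + (inside square brackets, | and + are literal characters, not operators)
\d+ - one or more digits
(.+) - one or more of any character (except newline)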
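Because | and + lose their special meaning inside square brackets, the class in this pattern accepts more separators than a dash or whitespace. The sketch below compares it with a narrower class, `[-\s]`, which is an assumption about the intent rather than part of the original answer; the sample strings are made up for illustration.

```python
import re

loose = re.compile(r"JXE[-|\s+]\d+(.+)")  # class from the answer
tight = re.compile(r"JXE[-\s]\d+(.+)")    # assumed intent: dash or whitespace only

samples = ["JXE-2000 Blah", "JXE 2000 Blah", "JXE|2000 Blah", "JXE+2000 Blah"]

for s in samples:
    # Show, per sample, whether each pattern finds a match.
    print(repr(s), bool(loose.search(s)), bool(tight.search(s)))
```

Both patterns accept the dash and space forms; only the original class also accepts the pipe and plus forms.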

score 0 · Accepted Answer

You can match either '-' or ' ' with [- ]:

>>> import re
>>> match = re.search("JXE[- ]2000[: ]+(.*)", "JXE-2000: This is bug ")
>>> if match is not None:
...     message = match.groups()[0]
...     print(message)
...
This is bug
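If the number is not always 2000, the same idea extends to any run of digits. The following is a minimal sketch under that assumption; `parse_ticket`, `TICKET`, and the returned tuple shape are hypothetical names chosen for illustration.

```python
import re

# Hypothetical pattern: "JXE", a dash or space, digits, then one or more
# colons/spaces, then the rest of the line as the message.
TICKET = re.compile(r"^(JXE)[- ](\d+)[: ]+(.*)$")

def parse_ticket(line):
    """Return (prefix, number, message), or None if the line does not fit."""
    m = TICKET.match(line)
    if m is None:
        return None
    prefix, number, message = m.groups()
    return prefix, int(number), message.strip()

print(parse_ticket("JXE 2000 This is a bug to fix blah"))
print(parse_ticket("JXE-2000: This is a bug to fix blah"))
print(parse_ticket("ABC-1: unrelated"))
```

Note that this sketch requires a separator after the number, so a bare "JXE-2000" with no message returns None; loosen `[: ]+` to `[: ]*` if that case should be accepted.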
