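I'm new to Python and have started learning regular expressions. I've been trying to match some text in a string, but I ran into something I don't understand. Here is my code: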

import re

pattern1 = r'\b\w+\b,\s\b\w+\b'
pattern2 = r'\b\w+\b,\s\b\w+\b,'

# pattern1 produces expected result
with open('test_sentence.txt', 'r') as input_f:
    for line in input_f:
        word = re.search(pattern1, line)
        print(word.group())

# pattern 2, same as pattern1 but with additional ',' at the end
# does not work.
with open('test_sentence.txt', 'r') as input_f:
    for line in input_f:
        word = re.search(pattern2, line)
        print(word.group())

Here are the contents of test_sentence.txt:

I need to buy are bacon, cheese and eggs. 
I also need to buy milk, cheese, and bacon.
What's your favorite: milk, cheese or eggs.
What's my favorite: milk, bacon, or eggs.

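I don't understand why pattern2 doesn't work. It raises the error `'NoneType' object has no attribute 'group'` at the `word.group()` call. I believe this means no match was found for the pattern2 regular expression. Why does the extra `,` at the end cause this problem? Why doesn't it simply match `milk, cheese,` and `milk, bacon,`?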

1 Answer

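You are searching each line separately, not the whole file. That means there are lines that pattern2 does not match; on those lines `re.search` returns `None`, and calling `.group()` on `None` raises the error. The very first line of your file is one of them, so the loop fails before it ever reaches a matching line. Move the second line of the file to the top and you will see that it matches, and that the error then occurs on the line after it.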
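To see this per line, here is a small sketch that runs pattern2 against the four sentences from the question. It assumes the lines are held in a list rather than read from test_sentence.txt, and the names `lines` and `results` are only illustrative.

```python
import re

pattern2 = r'\b\w+\b,\s\b\w+\b,'

# The four sentences from test_sentence.txt, inlined for illustration.
lines = [
    "I need to buy are bacon, cheese and eggs.",
    "I also need to buy milk, cheese, and bacon.",
    "What's your favorite: milk, cheese or eggs.",
    "What's my favorite: milk, bacon, or eggs.",
]

# re.search returns a match object, or None when the line has no match.
results = [re.search(pattern2, line) for line in lines]

for line, match in zip(lines, results):
    found = match.group() if match else None
    print("%r <- %s" % (found, line))
```

Only the second and fourth sentences contain a word, a comma, a second word, and another comma, so the other two lines give `None`.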

Always check the return value before using it:

word = re.search(pattern2, line)

if word:
    print(word.group())
else:
    print("No match")
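Put together with the loop, the check might look like the sketch below. `matched_texts` is a hypothetical helper name, not something from the `re` module, and the sample lines are inlined so the example runs without the file.

```python
import re

pattern1 = r'\b\w+\b,\s\b\w+\b'
pattern2 = r'\b\w+\b,\s\b\w+\b,'


def matched_texts(pattern, lines):
    """Return the matched text for every line that matches; skip the rest.

    Hypothetical helper for illustration only.
    """
    found = []
    for line in lines:
        word = re.search(pattern, line)
        if word:  # None means this line had no match, so skip it
            found.append(word.group())
    return found


# The four sentences from test_sentence.txt, inlined for illustration.
sample = [
    "I need to buy are bacon, cheese and eggs.",
    "I also need to buy milk, cheese, and bacon.",
    "What's your favorite: milk, cheese or eggs.",
    "What's my favorite: milk, bacon, or eggs.",
]

print(matched_texts(pattern1, sample))
print(matched_texts(pattern2, sample))
```

An open file object can be passed in place of `sample`, since iterating over a file yields one line at a time.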
Answered 2013-09-21T07:26:22.107