
I am trying to match text in a file:

In [44]: with open(path) as f:
   ....:     for line in f:
   ....:         matched = re.search('^PARTITION BY HASH',line)
   ....:         if matched is not None:
   ....:             print matched.group()
   ....:

The file contains lines such as PARTITION BY HASH(SOME_THING);. It also contains lines with SUBPARTITION BY HASH(SOME_THING), which should not match.

Once a line matches, I want to delete that line. But printing matched.group fails. Why?


2 Answers


Something like this:

In [29]: strs1="PARTITION BY HASH(SOME_THING)"

In [30]: strs2="SUBPARTITION BY HASH(SOME_THING)"

In [31]: bool(re.match(r"^PARTITION BY HASH",strs1))
Out[31]: True

In [32]: bool(re.match(r"^PARTITION BY HASH",strs2))
Out[32]: False
answered 2012-11-06T10:32:07.330

But printing matched.group fails

Well, it does exactly what it is supposed to do: it returns the matched text, not the whole line. In this case:

>>> import re
>>> line = "PARTITION BY HASH(something)"
>>> re.search('^PARTITION BY HASH', line).group()
'PARTITION BY HASH'
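
To make the distinction concrete, here is a small sketch showing that group() yields only the text the pattern consumed, while the match object's string attribute still holds the whole input. It is written for Python 3, where print is a function, and the sample line is made up for illustration.

```python
import re

# Hypothetical sample line, modelled on the lines described in the question.
line = "PARTITION BY HASH(SOME_THING);\n"

matched = re.search(r'^PARTITION BY HASH', line)

matched_text = matched.group()  # only the part the pattern matched
whole_line = matched.string     # the full string that was passed to re.search

print(repr(matched_text))
print(repr(whole_line))
```

So if the whole line is wanted, print line (or matched.string) rather than matched.group().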

If you want to print the lines that start with 'PARTITION BY HASH', follow Ashwini Chaudhary's suggestion:

with open(path) as f:
    for line in f:
        if line.startswith('PARTITION BY HASH'):
            print line,

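Note the trailing comma, which stops print from adding an extra end-of-line character (each line read from the file already ends with one).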
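
Since the question's stated goal is to delete the matching lines rather than print them, the same startswith test can be inverted. The following is only a sketch under assumptions: the function name remove_partition_lines and its parameters are hypothetical, it is written for Python 3, and it writes the result to a second file instead of editing the original in place.

```python
def remove_partition_lines(src_path, dst_path, prefix='PARTITION BY HASH'):
    """Copy src_path to dst_path, skipping lines that start with prefix.

    The function and parameter names are illustrative, not from the question.
    """
    with open(src_path) as src, open(dst_path, 'w') as dst:
        for line in src:
            # Lines such as 'SUBPARTITION BY HASH(...)' do not start with
            # the prefix, so they are kept.
            if not line.startswith(prefix):
                dst.write(line)
```

Writing to a separate file keeps the original untouched until the result has been checked.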

If you insist on using the re package:

import re

with open(path) as f:
    for line in f:
        if re.match('PARTITION BY HASH', line):
            print line,

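Note that re.match is used here without the start-of-string anchor ^, because re.match only ever matches at the beginning of the string (for more information, see http://docs.python.org/2/library/re.html#search-vs-match).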
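
The difference can be checked on a line that should not match. This is a minimal sketch with a made-up sample line; the variable names are illustrative only.

```python
import re

pattern = 'PARTITION BY HASH'
sub_line = "SUBPARTITION BY HASH(SOME_THING);"  # hypothetical sample line

# re.match only tries the pattern at the start of the string.
match_result = re.match(pattern, sub_line)

# re.search scans the whole string, so without ^ it finds the text
# inside 'SUBPARTITION BY HASH'.
search_plain = re.search(pattern, sub_line)

# With ^, re.search is restricted to the start as well.
search_anchored = re.search('^' + pattern, sub_line)

print(match_result, search_plain is not None, search_anchored)
```

Only the unanchored re.search finds something in the SUBPARTITION line, which is why either re.match or an explicit ^ is needed here.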

answered 2012-11-06T10:28:30.180