4

I want to match all lines in a file that either start with 0D and have 15 characters, or consist of just 15 digits. How can I do this?

p_number = re.compile(r'(\d{15})')
f=open(infile)
for l in f:
  aa=re.findall(p_number,l)
  if aa > 0:
     print aa
f.close() 

EDIT

Only if the pattern is at the start of the line.


3 Answers

7

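To find matches only at the start of a line, use re.match. If the 0D prefix is present, this regex matches any non-whitespace characters after it; let me know if you want to match fewer characters.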

>>> import re
>>> p_number = re.compile(r'(0D[\S]{13}|\d{15})')
>>> for s in ['0Dfannawhoopowe foo',
...           'foo 012345678901234',
...           '012345678901234 foo']:
...     match = p_number.match(s)
...     if match:
...         print(match.groups())
... 
('0Dfannawhoopowe',)
('012345678901234',)

To see the difference between match, search, and findall, look at the following examples.

findall (naturally) finds all occurrences of the match:

>>> for s in ['0Dfannawhoopowe foo',
...           'foo 012345678901234',
...           '012345678901234 foo']:
...     match = p_number.findall(s)
...     if match:
...         print(match)
... 
['0Dfannawhoopowe']
['012345678901234']
['012345678901234']

search finds an occurrence of the pattern anywhere in the string, not just at the beginning.

>>> for s in ['0Dfannawhoopowe foo',
...           'foo 012345678901234',
...           '012345678901234 foo']:
...     match = p_number.search(s)
...     if match:
...         print(match.groups())
... 
('0Dfannawhoopowe',)
('012345678901234',)
('012345678901234',)
answered 2012-07-03T12:15:04.863
5
import re
with open(infile) as f:
    # re.MULTILINE makes ^ and $ anchor at every line, not just at the ends of the whole string
    print(re.findall(r'^(0D.{15}|\d{15})$', f.read(), re.MULTILINE))
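
A minimal sketch of what re.MULTILINE changes here, using an in-memory string instead of a file. The sample text is made up for illustration; the pattern is the one from the answer above.

```python
import re

# Hypothetical sample data standing in for the contents of the file.
text = "012345678901234\nfoo 012345678901234\n0D123456789012345\n0D12345\n"

pattern = r'^(0D.{15}|\d{15})$'

# Without re.MULTILINE, ^ and $ anchor only at the ends of the whole string,
# so nothing matches a text that contains several lines.
without_flag = re.findall(pattern, text)

# With re.MULTILINE, ^ and $ also anchor at the start and end of every line.
with_flag = re.findall(pattern, text, re.MULTILINE)

print(without_flag)
print(with_flag)
```

Only the first and third sample lines are returned with the flag set: the second has a prefix before the digits, and the fourth is too short.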
answered 2012-07-03T12:29:19.690
3

If you want to do this with a regex, of course you can:

import re
with open(infile) as f:
  for l in f:
     # re.match only anchors at the start of the line
     if re.match(r'(0D)?[0-9]{15}', l):
       print(l)
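
Note that re.match only anchors at the start, so the pattern above also accepts lines that continue past the 15 digits. If the whole line has to match (an assumption about the intent, not something the answer states), one option is re.fullmatch, available from Python 3.4. A small sketch with made-up sample lines:

```python
import re

pattern = r'(0D)?[0-9]{15}'

# Hypothetical sample lines; lines read from a file would still end in '\n'
# and would need stripping before a whole-line match.
lines = [
    '012345678901234',
    '0D012345678901234',
    '01234567890123456789',
    'foo 012345678901234',
]

# re.match: the pattern only has to match at the start of the string.
prefix_hits = [s for s in lines if re.match(pattern, s)]

# re.fullmatch: the pattern has to cover the entire string.
whole_hits = [s for s in lines if re.fullmatch(pattern, s)]

print(prefix_hits)
print(whole_hits)
```

The 20-digit line shows up in prefix_hits but not in whole_hits.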

But you can also solve this task without regular expressions at all:

with open(infile) as f:
  for l in f:
     s = l.rstrip('\n')  # drop the trailing newline so the length checks work
     if (len(s) == 15 and s.isdigit()) or (s[:2] == '0D' and len(s) == 17 and s[2:].isdigit()):
       print(s)
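
The same check can be pulled into a small helper so it is easy to try on a few strings. This is only a sketch: is_wanted is a hypothetical name, and it assumes the 0D form means 0D followed by 15 digits, as in the condition above.

```python
def is_wanted(line):
    """Return True for exactly 15 digits, or '0D' followed by 15 digits."""
    s = line.rstrip('\r\n')  # lines read from a file keep their newline
    if len(s) == 15 and s.isdigit():
        return True
    return len(s) == 17 and s.startswith('0D') and s[2:].isdigit()


# Hypothetical sample lines, as they would come out of a file.
samples = ['012345678901234\n', '0D012345678901234\n',
           'foo 012345678901234\n', '0D12345\n']
results = [is_wanted(s) for s in samples]
print(results)
```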
answered 2012-07-03T12:13:03.340