
I want to extract data from HTML using the Python shell. The code is simple:

import re
s="<a><b>为什么你该去逛逛墓地</b></a>"
p=re.compile('(?<=<(\w+)>).*?(?=<(/\w+)>)',re.I|re.S)
m=p.findall(s)

error log:

Traceback (most recent call last):
  File "C:/test1.py", line 3, in <module>
    p=re.compile('(?<=<(\w+)>).*?(?=<(/\w+)>)',re.I|re.S)
  File "C:\Python33\lib\re.py", line 214, in compile
    return _compile(pattern, flags)
  File "C:\Python33\lib\functools.py", line 258, in wrapper
    result = user_function(*args, **kwds)
  File "C:\Python33\lib\re.py", line 274, in _compile
    return sre_compile.compile(pattern, flags)
  File "C:\Python33\lib\sre_compile.py", line 497, in compile
    code = _code(p, flags)
  File "C:\Python33\lib\sre_compile.py", line 482, in _code
    _compile(code, p.data, flags)
  File "C:\Python33\lib\sre_compile.py", line 115, in _compile
    raise error("look-behind requires fixed-width pattern")

sre_constants.error: look-behind requires fixed-width pattern


1 Answer


Does this do what you want?

re.compile(r'(?:<[^>]*>\s*)+(.+?)(?:<[^>]*>\s*)+',re.I|re.S)
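
As a quick check, here is a minimal sketch of that pattern applied to the string from the question, assuming the closing tag there was meant to be a literal `</b>`. The second pattern is an alternative that is not part of this answer: it captures the tag name and reuses it with a backreference, which sidesteps the variable-width look-behind that Python's `re` module rejects. The names `tag_run`, `paired`, and `inner` are made up for illustration.

```python
import re

# Sample string from the question, assuming the closing tag is a literal </b>.
s = "<a><b>为什么你该去逛逛墓地</b></a>"

# Pattern from the answer: skip a run of tags, capture the text lazily,
# then require another run of tags.
tag_run = re.compile(r'(?:<[^>]*>\s*)+(.+?)(?:<[^>]*>\s*)+', re.I | re.S)
m = tag_run.findall(s)
print(m)

# Alternative (an assumption, not from the answer): capture the tag name
# and match the closing tag with a backreference instead of a look-behind.
paired = re.compile(r'<(\w+)>([^<]*)</\1>', re.I | re.S)
inner = [text for _tag, text in paired.findall(s)]
print(inner)

# The pattern from the question fails at compile time because \w+ inside
# the look-behind has no fixed width.
try:
    re.compile(r'(?<=<(\w+)>).*?(?=<(/\w+)>)', re.I | re.S)
except re.error as exc:
    print("compile failed:", exc)
```

Both `m` and `inner` should come out as a one-element list holding the text between the tags.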
answered 2013-05-31T04:25:33.427