
I want to get all the content that's inside <tr> tags. I wrote this code:

matchObj = re.search(r'<tr>(.*?)</tr>', txt, re.M|re.I|re.S)

But I only get the first group.

How can I get all groups?

Thanks in advance :)




2 Answers


Use findall:

matchObj = re.findall(r'<tr>(.*?)</tr>', txt, re.M|re.I|re.S)

search finds only the first match in the given string.
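To make the difference concrete, here is a minimal sketch that runs both calls on the same input. The sample txt string is made up for illustration and is not from the question. re.M is left out because the pattern has no ^ or $ anchors, so it would have no effect here.

```python
import re

# Hypothetical sample input, not taken from the question.
txt = "<table><tr><td>a</td></tr>\n<TR><td>b</td></TR></table>"
pattern = r'<tr>(.*?)</tr>'
flags = re.I | re.S  # ignore case; let "." match newlines too

# search() stops at the first match and returns a match object (or None).
first = re.search(pattern, txt, flags)
print(first.group(1))

# findall() scans the whole string; with one capture group it returns
# a list with that group's text for every match.
rows = re.findall(pattern, txt, flags)
print(rows)
```

If you need match objects (for positions, say) rather than plain strings, re.finditer() takes the same arguments and yields one match object per hit.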

You can read more about the different methods you can use in the re module.

However, it looks like you are parsing HTML. Why not use an HTML parser?

answered 2012-12-11T15:49:49.160

To get more than one match, use re.findall().

However, parsing HTML with regular expressions gets ugly and complicated fast. Use a proper HTML parser instead.

Python has several to choose from:

ElementTree example:

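from xml.etree import ElementTree

# ElementTree is an XML parser, so the file must be well-formed XML (e.g. XHTML).
tree = ElementTree.parse('filename.html')
# iter() walks the whole tree; findall('tr') would only check the root's direct children.
for elem in tree.iter('tr'):
    print(ElementTree.tostring(elem, encoding='unicode'))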

BeautifulSoup example:

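from bs4 import BeautifulSoup

# Name the parser explicitly so the result does not depend on what is installed.
with open('filename.html') as f:
    soup = BeautifulSoup(f, 'html.parser')
for row in soup.select('table tr'):
    print(row)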
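
If installing a third-party package is not an option, the standard library's html.parser module can do the same job with a bit more code. The following is only a sketch: the RowCollector class and the sample markup are hypothetical names made up for illustration, and it assumes rows are not nested inside other rows.

```python
from html.parser import HTMLParser

class RowCollector(HTMLParser):
    """Collects the text found inside each <tr> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._current = None  # text chunks of the row being read, or None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <TR> arrives here as 'tr'.
        if tag == 'tr':
            self._current = []

    def handle_endtag(self, tag):
        if tag == 'tr' and self._current is not None:
            self.rows.append(''.join(self._current).strip())
            self._current = None

    def handle_data(self, data):
        # Keep text only while a row is open.
        if self._current is not None:
            self._current.append(data)

collector = RowCollector()
# Sample markup invented for this sketch.
collector.feed('<table><tr><td>a</td><td>b</td></tr>'
               '<TR class="x"><td>c</td></TR></table>')
print(collector.rows)
```

Unlike the regex, this copes with attributes on the tag and with mixed-case tag names without extra flags.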
answered 2012-12-11T15:49:51.640