I have a small script that extracts some text from an .html file.

f = open(local_file,"r")
for line in f:
    searchphrase = '<span class="position'
    if searchphrase in line:
        print("found it\n")

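This works fine for me (I will add error handling later). My problem is that the text I want to extract is two lines after the search phrase. How can I move down two lines in the .html file?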

2 Answers

You can advance f (which is an iterator) by two lines by calling next() on it twice:

def find_line(local_file):  # wrapper (name is illustrative) so that `return` is valid
    with open(local_file, "r") as f:
        for line in f:
            searchphrase = '<span class="position'
            if searchphrase in line:
                print("found it\n")
                next(f)  # skip 1 line
                return next(f)  # and return the line after that.
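One caveat with calling next(f) directly: it raises StopIteration if the file ends before the wanted line. A minimal sketch that avoids this is below, assuming you would rather get None back in that case. The function name line_after_match and the offset parameter are illustrative and not part of the original answer, and the sample text is made up for the demonstration.

```python
import io
from itertools import islice


def line_after_match(lines, searchphrase, offset=2):
    """Return the line `offset` lines after the first match, or None."""
    it = iter(lines)
    for line in it:
        if searchphrase in line:
            # islice skips offset - 1 lines; the default of None
            # avoids StopIteration when the input ends too early.
            return next(islice(it, offset - 1, None), None)
    return None


# In-memory stand-in for an opened file (hypothetical sample data).
sample = io.StringIO(
    'bla\n<span class="position">Hello</span>\nblub\nthat\'s it\nwhatever\n'
)
result = line_after_match(sample, '<span class="position')
print(repr(result))
```

Any iterable of lines works here, so an open file object can be passed in place of the StringIO.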

However, if you are trying to parse HTML, consider using an HTML parser instead, for example BeautifulSoup.
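As a rough illustration of the parser-based approach, here is a sketch that uses the standard library's html.parser module rather than BeautifulSoup, so that it runs without extra installs. It collects the text inside span elements whose class attribute starts with "position". The class name PositionSpanParser and the sample markup are assumptions for illustration; which text you actually need depends on the real page, which the question does not show.

```python
from html.parser import HTMLParser


class PositionSpanParser(HTMLParser):
    """Collect text inside <span> elements whose class starts with 'position'."""

    def __init__(self):
        super().__init__()
        self._depth = 0  # > 0 while inside a matching span (counts nested spans)
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._depth > 0:
            self._depth += 1  # nested span inside a matching one
        elif (dict(attrs).get("class") or "").startswith("position"):
            self._depth = 1

    def handle_endtag(self, tag):
        if tag == "span" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth > 0:
            self.texts.append(data)


parser = PositionSpanParser()
parser.feed('bla\n<span class="position">Hello</span>\nblub\n')
parser.close()
print(parser.texts)
```

Unlike the line-based search, this does not depend on how the HTML happens to be split across lines.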

answered 2013-03-27T10:44:35.093

This works well for me:

f = open(local_file,"r")
found = -1
for line in f:
    if found == 2:
        print("Line: " + line)
        break
    elif found > 0:
        found += 1
    else:
        searchphrase = '<span class="position'
        if searchphrase in line:
            print("found it")
            found = 1

The input file is:

bla
<span class="position">Hello</span>
blub
that's it
whatever

And the program's output:

found it
Line: that's it

Instead of calling break, you can also reset found to -1 to search for further occurrences of the pattern.
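For the multiple-occurrence case, one possible sketch is shown here. It swaps the counter for plain list indexing, which is a different technique from the loop above and loads all lines into memory, so it assumes the file is reasonably small. The function name lines_after_matches, the offset parameter, and the sample lines are illustrative.

```python
def lines_after_matches(lines, searchphrase, offset=2):
    """Return the line `offset` lines after every line containing searchphrase."""
    lines = list(lines)  # reads everything into memory
    results = []
    for i, line in enumerate(lines):
        target = i + offset
        # Skip matches that are too close to the end of the input.
        if searchphrase in line and target < len(lines):
            results.append(lines[target].rstrip("\n"))
    return results


# Hypothetical sample with two matches.
sample = [
    "bla\n",
    '<span class="position">Hello</span>\n',
    "blub\n",
    "that's it\n",
    '<span class="position">Bye</span>\n',
    "x\n",
    "second\n",
]
matches = lines_after_matches(sample, '<span class="position')
print(matches)
```

An open file object can be passed as lines, since list() consumes it line by line.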

answered 2013-03-27T10:47:11.713