python - Can urllib2 open a URL and return only a limited number of lines?

Question

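I know this may sound ridiculous, but is it possible to use urllib2 to open a URL so that only a set number of lines is returned?

The reason is to cut down on load time, especially for the very large pages I am working with. For example, here is my page: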

1. <html>
2.  <head>
3.   <title>Hello!</title>
4.  </head>
5.  <body>
6.   <p>Hi there.</p>
7.  </body>
8. </html>

Suppose I want to open my page up to line 5 and then print what was read; it would give me:

1. <html>
2.  <head>
3.   <title>Hello!</title>
4.  </head>
5.  <body>

Is this possible?

score 3 · Accepted Answer

Sure, you can use readline() instead of read():

import urllib2

req = urllib2.Request('http://www.python.org')
response = urllib2.urlopen(req)

# Read the first 10 lines one at a time instead of the whole body.
lines = ""
for x in range(10):
    lines += response.readline()

print(lines)
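
On Python 3, where urllib2 was folded into urllib.request, the same readline() idea could be sketched roughly as below. The helper name `read_first_lines` and its `limit` parameter are hypothetical, not part of the answer, and an in-memory `io.BytesIO` stands in for the network response so the sketch runs offline. It also stops early if the page has fewer lines than requested.

```python
import io


def read_first_lines(stream, limit):
    """Return up to `limit` lines from a binary file-like object.

    `stream` can be anything with a readline() method, for example the
    object returned by urllib.request.urlopen() on Python 3.
    """
    lines = []
    for _ in range(limit):
        line = stream.readline()
        if not line:  # readline() returns b"" at end of stream
            break
        lines.append(line)
    return lines


# Stand-in for a network response (assumption: same page as in the question).
page = io.BytesIO(
    b"<html>\n <head>\n  <title>Hello!</title>\n </head>\n"
    b" <body>\n  <p>Hi there.</p>\n </body>\n</html>\n"
)
first_five = read_first_lines(page, 5)
print(b"".join(first_five).decode("utf-8"))
```

With a real URL the call would look something like `read_first_lines(urllib.request.urlopen(url), 5)`. The response object is buffered, so somewhat more than five lines' worth of bytes may still be pulled from the socket.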

score 0 · Answer

One-liner:

from itertools import islice
from urllib2 import urlopen

# Take only the first 5 lines from the response iterator.
list(islice(urlopen("http://www.python.org"), 5))

answered 2012-06-11T09:17:34.903

score 0 · Answer

You just need to set a threshold and break out of the readlines loop.

import urllib2

req = urllib2.Request('http://www.python.org')
response = urllib2.urlopen(req)

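read_until = 5  # number of lines to keep

lines = []
# Note: readlines() reads the entire response before the loop starts,
# so this limits the lines kept, not the amount downloaded.
for line_number, line in enumerate(response.readlines()):
    if line_number >= read_until:
        break
    lines.append(line)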
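
Since readlines() consumes the whole body up front, iterating over the response object itself may be closer to what the question asks for. The sketch below illustrates the difference under stated assumptions: `CountingLines` is a hypothetical stand-in for a response (not a urllib2 class) that simply counts how many lines were handed out, and the code is written for Python 3.

```python
from itertools import islice


class CountingLines:
    """Hypothetical stand-in for a response that counts lines handed out."""

    def __init__(self, data):
        self._lines = data.splitlines(keepends=True)
        self.lines_consumed = 0

    def __iter__(self):
        # Lazy path: one line is produced per step of the consumer.
        for line in self._lines:
            self.lines_consumed += 1
            yield line

    def readlines(self):
        # Eager path: everything is read before the caller sees a line.
        self.lines_consumed = len(self._lines)
        return list(self._lines)


data = (
    b"<html>\n <head>\n  <title>Hello!</title>\n </head>\n"
    b" <body>\n  <p>Hi there.</p>\n </body>\n</html>\n"
)

lazy = CountingLines(data)
first_five = list(islice(lazy, 5))   # stops after the fifth line

eager = CountingLines(data)
also_five = eager.readlines()[:5]    # reads all eight lines, keeps five

print(lazy.lines_consumed, eager.lines_consumed)  # prints: 5 8
```

Both approaches return the same five lines; only the lazy one avoids touching the rest of the page.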
