python - Why doesn't .find() work with urllib.request.urlopen() in Python 3?

Question

I'm trying to move from urllib in Python 2 to Python 3. I can print the HTML source with .urlopen(), but I can't search it with the .find() method.

import urllib.request
fh = urllib.request.urlopen("http://stackoverflow.com")
html = fh.read()
fh.close()

print(html.find("<p>"))  # raises TypeError: html is bytes, but "<p>" is a str

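I get a TypeError. I know it is returning a byte array, but I'm hazy on what that actually means. I've tried a few SO answers, and they were all dead ends. My question is: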

Is there a simple, native way in Python 3 to get the page source of a URL as a string?

score 3 · Accepted Answer

Use html.decode('utf-8') (or whatever encoding the page actually uses) to get a str object, which you can then call .find() on.
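Applied to the code in the question, this could look like the sketch below. The helper names decode_page and fetch_text are made up for illustration, and falling back to UTF-8 with errors="replace" when the server reports no charset is an assumption, not something the answer prescribes.

```python
import urllib.request
from typing import Optional


def decode_page(raw: bytes, charset: Optional[str] = None) -> str:
    """Turn raw response bytes into a str (hypothetical helper)."""
    # Assumption: fall back to UTF-8 and replace undecodable bytes
    # with U+FFFD instead of raising UnicodeDecodeError.
    return raw.decode(charset or "utf-8", errors="replace")


def fetch_text(url: str) -> str:
    """Download a page and return its source as a str (hypothetical helper)."""
    with urllib.request.urlopen(url) as fh:
        raw = fh.read()  # bytes, not str
        # Charset from the Content-Type header; None if the server omits it.
        charset = fh.headers.get_content_charset()
    return decode_page(raw, charset)


# Offline stand-in for fh.read(), so the example runs without network access.
sample = b"<html><body><p>caf\xc3\xa9</p></body></html>"
text = decode_page(sample)
index = text.find("<p>")  # works: both sides are str now
```

With a network connection, fetch_text("http://stackoverflow.com").find("<p>") would be the direct replacement for the last line of the question's code. Searching the raw bytes with a bytes pattern, html.find(b"<p>"), also avoids the TypeError, but the result is then a byte offset rather than a character offset.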

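.decode() takes a flat sequence of bytes and converts it (by reversing a character encoding such as UTF-8) into a string of actual code points (displayable symbols).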
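A small round trip may make the bytes-versus-code-points distinction more concrete. The sample word is arbitrary, and Latin-1 is used only to illustrate what happens when the wrong encoding is assumed.

```python
text = "caf\u00e9"             # 4 code points: c, a, f, é
raw = text.encode("utf-8")     # str -> bytes

byte_values = list(raw)        # é takes two bytes in UTF-8
round_trip = raw.decode("utf-8")   # bytes -> str, reversing the encoding
wrong = raw.decode("latin-1")      # same bytes, wrong encoding assumed

print(len(text), len(raw))     # 4 5
print(byte_values)             # [99, 97, 102, 195, 169]
print(round_trip == text)      # True
print(wrong)                   # cafÃ©
```

The bytes never change; only the encoding passed to .decode() decides which characters they turn into, which is why the answer says "or whatever encoding the page actually uses".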
