python - How to get an HTML input value from the web in Python

Question

When I visit www.sampleweb.com/reg/, the page contains an input element such as:

<input id="input-id" class="input-class" name="myinput" type="text" value="hello world">

How can I use Python to get the hello world value of that input from www.sampleweb.com/reg/?

I think visiting www.sampleweb.com/reg/ goes like this:

url = 'http://www.sampleweb.com/reg/'
urlopen(url)

Is this the right way to access the URL?

Can anyone help me with this?

Thanks in advance ...

score 1 · Accepted Answer

After fetching the page with urllib (as you mentioned), you should parse the HTML with any Python HTML parser. For example, with BeautifulSoup: http://www.crummy.com/software/BeautifulSoup/bs3/documentation.html#find%28name,%20attrs,%20recursive,%20text,%20**kwargs%29

In your case it would look like this:

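from BeautifulSoup import BeautifulSoup  # assumes BeautifulSoup 3, as in the linked docs

soup = BeautifulSoup(html)  # html is the page source read via urllib
# find() returns a single tag (or None), not a list, so do not index it with [0]
input_tag = soup.find("input", {"id": "input-id"})
print input_tag['value']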
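If installing BeautifulSoup is not an option, the same idea can be sketched with the standard library alone. This is a minimal sketch assuming Python 3; `InputValueParser`, `get_input_value`, and `fetch_html` are hypothetical names introduced here for illustration, and the UTF-8 decoding is an assumption about the page's encoding.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class InputValueParser(HTMLParser):
    """Collects the value attribute of the first <input> with a given id."""

    def __init__(self, input_id):
        super().__init__()
        self.input_id = input_id
        self.value = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased
        if tag == "input" and self.value is None:
            attr_map = dict(attrs)
            if attr_map.get("id") == self.input_id:
                self.value = attr_map.get("value")


def get_input_value(html, input_id):
    """Return the value of the matching <input>, or None if there is none."""
    parser = InputValueParser(input_id)
    parser.feed(html)
    return parser.value


def fetch_html(url):
    """Download a page and decode it (UTF-8 is an assumption here)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


# Offline demonstration using the markup from the question
sample = ('<input id="input-id" class="input-class" name="myinput" '
          'type="text" value="hello world">')
print(get_input_value(sample, "input-id"))
```

For the live page, something like `get_input_value(fetch_html('http://www.sampleweb.com/reg/'), 'input-id')` would combine the two steps.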

score 1 · Accepted Answer


You can use a library called BeautifulSoup.

Answered 2012-06-01T06:37:27.647

score 0 · Accepted Answer

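Note that using a DOM parser is the best choice for parsing HTML from any source.

However, if "hello world" is the only thing you want from the HTML, a quick and dirty approach would be: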

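import urllib

toFind = '<input id="input-id" class="input-class" name="myinput" type="text" value="'
htmlStr = urllib.urlopen('http://yoururl.com/your/path').read()
# keep everything after the opening quote of the value attribute...
value = htmlStr[htmlStr.index(toFind)+len(toFind):]
# ...then cut that remainder at the closing quote
value = value[:value.index('"')]
print value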
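The same string-slicing idea can be written so that a missing marker yields None instead of raising ValueError. This is a minimal sketch assuming Python 3; `value_after` is a hypothetical helper name, and like the snippet above it only works when the tag text matches the marker exactly (same attribute order, spacing, and quoting).

```python
def value_after(html, marker):
    """Return the text between `marker` and the next double quote, or None."""
    # partition() splits at the first occurrence and never raises
    _, found, rest = html.partition(marker)
    if not found:
        return None  # the marker does not occur in the page
    value, quote, _ = rest.partition('"')
    return value if quote else None  # None when the closing quote is missing


marker = ('<input id="input-id" class="input-class" name="myinput" '
          'type="text" value="')
page = '<form>' + marker + 'hello world"></form>'
print(value_after(page, marker))
```

Because it depends on the exact markup, this stays a shortcut; a parser remains the more robust choice if the page's HTML can change.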
