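Before I do anything further with them, I'm trying to check some URLs to see whether they respond OK. I have a list of URLs in self.myList, and I run each one through an httplib HTTPConnection to get the response, but I'm getting a lot of errors from httplib in cmd.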

The code itself works: I tested it with the snippet below, and it returns correctly and sets the value in a wx.TextCtrl:

#for line in self.myList:
            conn = httplib.HTTPConnection("www.google.com")
            conn.request("HEAD", "/")
            r1 = conn.getresponse()
            r1 = r1.status, r1.reason
            self.urlFld.SetValue(str(r1))

It doesn't seem to work when I pass more than one URL from myList.

for line in self.myList:
    conn = httplib.HTTPConnection(line)
    conn.request("HEAD", "/")
    r1 = conn.getresponse()
    r1 = r1.status, r1.reason
    self.urlFld.SetValue(line + "\t\t" + str(r1))

The error I get in cmd is:

Traceback (most recent call last):
File "gui_texteditor_men.py", line 96, in checkBtnClick
conn.request("HEAD", "/")
File "C:\Python27\lib\httplib.py", line 958, in request
self._send_request(method, url, body, headers)
File "C:\Python27\lib\httplib.py", line 992, in _send_request
self.endheaders(body)
File "C:\Python27\lib\httplib.py", line 954, in endheaders
self._send_output(message_body)
File "C:\Python27\lib\httplib.py", line 814, in _send_output
self.send(msg)
File "C:\Python27\lib\httplib.py", line 776, in send
self.connect()
File "C:\Python27\lib\httplib.py", line 757, in connect
self.timeout, self.source_address)
File "C:\Python27\lib\socket.py", line 553, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
socket.gaierror: [Errno 11004] getaddrinfo failed

Edit: updated code using urlparse. I have imported urlparse.

for line in self.myList:
    url = urlparse.urlparse(line)
    conn = httplib.HTTPConnection(url.hostname)
    conn.request("HEAD", url.path)
    r1 = conn.getresponse()
    r1 = r1.status, r1.reason
    self.urlFld.AppendText(url.hostname + "\t\t" + str(r1))

With the traceback:

C:\Python27\Coding>python gui_texteditor_men.py
Traceback (most recent call last):
File "gui_texteditor_men.py", line 97, in checkBtnClick
conn = httplib.HTTPConnection(url.hostname)
File "C:\Python27\lib\httplib.py", line 693, in __init__
self._set_hostport(host, port)
File "C:\Python27\lib\httplib.py", line 712, in _set_hostport
i = host.rfind(':')
AttributeError: 'NoneType' object has no attribute 'rfind'

At the point where it throws this error, the .txt file contains www.google.com and www.bing.com.

Edit 2 @Aya,

It seems to fail because of the "\n" between the 2 URLs. I thought I had written it to remove the "\n" with .strip(), but that doesn't seem to have any effect.

Failed on u'http://www.google.com\nhttp://www.bing.com'
Traceback (most recent call last):
File "gui_texteditor_men.py", line 99, in checkBtnClick
conn.request("HEAD", url.path)
File "C:\Python27\lib\httplib.py", line 958, in request
self._send_request(method, url, body, headers)
File "C:\Python27\lib\httplib.py", line 992, in _send_request
self.endheaders(body)
File "C:\Python27\lib\httplib.py", line 954, in endheaders
self._send_output(message_body)
File "C:\Python27\lib\httplib.py", line 814, in _send_output
self.send(msg)
File "C:\Python27\lib\httplib.py", line 776, in send
self.connect()
File "C:\Python27\lib\httplib.py", line 757, in connect
self.timeout, self.source_address)
File "C:\Python27\lib\socket.py", line 553, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
socket.gaierror: [Errno 11004] getaddrinfo failed

I took another look at my .strip(), which is applied where I open the file:

if dlg.ShowModal() == wx.ID_OK:
    directory, filename = dlg.GetDirectory(), dlg.GetFilename()
    self.filePath = '/'.join((directory, filename))
    self.fileTxt.SetValue(self.filePath)
    self.urlFld.LoadFile(self.filePath)
    self.myList = self.urlFld.GetValue().strip()

Now it fails with a traceback that starts with "Failed on u'h'".

Thanks

1 Answer

If self.myList contains a list of URLs, you can't pass them straight to the HTTPConnection constructor the way you do here...

for line in self.myList:
    conn = httplib.HTTPConnection(line)
    conn.request("HEAD", "/")

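The HTTPConnection constructor should be passed only the hostname part of the URL, and the request method should be passed the path part. You'll need to parse the URL with something like...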

import urlparse

for line in self.myList:
    url = urlparse.urlparse(line)
    conn = httplib.HTTPConnection(url.hostname)
    conn.request("HEAD", url.path)
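
To make that split concrete, here is a rough sketch using a helper named split_host_path (a name made up for illustration, not part of httplib or urlparse). It assumes that an entry with no scheme, such as www.google.com, should be treated as an http:// URL. Without a scheme, urlparse puts the whole string into path and hostname comes back as None, which is consistent with the "'NoneType' object has no attribute 'rfind'" error. The import fallback is only there so the snippet also runs on Python 3.

```python
try:
    from urlparse import urlparse  # Python 2, as used in the question
except ImportError:
    from urllib.parse import urlparse  # Python 3 equivalent


def split_host_path(line):
    """Return (hostname, path) for one URL entry (hypothetical helper)."""
    line = line.strip()
    # Assumption: entries with no scheme are meant as plain http:// URLs.
    if "://" not in line:
        line = "http://" + line
    url = urlparse(line)
    # Make the request target explicit when the URL has no path part.
    return url.hostname, url.path or "/"


print(split_host_path("www.google.com"))
print(split_host_path("http://www.bing.com/maps"))
```

The first value is what would go to HTTPConnection and the second to conn.request("HEAD", ...).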

Update

Can you change the code to...

for line in self.myList:
    try:
        url = urlparse.urlparse(line)
        conn = httplib.HTTPConnection(url.hostname)
        conn.request("HEAD", url.path)
        r1 = conn.getresponse()
        r1 = r1.status, r1.reason
        self.urlFld.AppendText(url.hostname + "\t\t" + str(r1))
    except:
        print 'Failed on %r' % line
        raise

...and include the full output from running it?
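
Once the cause is found, a variant that reports a failure per URL instead of re-raising might look like the sketch below. The names head_status and connection_cls are hypothetical, the 10-second timeout is an arbitrary assumption, and the connection_cls parameter exists only so the function can be tried with a stand-in class without touching the network. The import fallback lets it run on Python 3 as well.

```python
import socket

try:
    import httplib  # Python 2, as in the question
    from urlparse import urlparse
except ImportError:
    import http.client as httplib  # Python 3 equivalent
    from urllib.parse import urlparse


def head_status(line, connection_cls=httplib.HTTPConnection):
    """Return (status, reason) for a HEAD request, or (None, error text)."""
    url = urlparse(line.strip())
    if url.hostname is None:
        # Happens when the entry has no scheme, e.g. "www.google.com".
        return None, "no hostname in %r" % line
    conn = connection_cls(url.hostname, url.port, timeout=10)
    try:
        conn.request("HEAD", url.path or "/")
        response = conn.getresponse()
        return response.status, response.reason
    except (socket.error, httplib.HTTPException) as exc:
        # socket.gaierror (getaddrinfo failed) is a subclass of socket.error.
        return None, str(exc)
    finally:
        conn.close()
```

Inside the loop this would be called as status, reason = head_status(line), with a status of None meaning the check itself failed.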

Update #2

I'm not quite sure what self.fileTxt and self.urlFld are supposed to do, but if you're just reading lines from self.filePath, all you need is...

if dlg.ShowModal() == wx.ID_OK:
    directory, filename = dlg.GetDirectory(), dlg.GetFilename()
    self.filePath = '/'.join((directory, filename))
    self.myList = [line.strip() for line in open(self.filePath, 'r').readlines()]
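
The "Failed on u'h'" message fits with self.myList being a single string rather than a list: .strip() on the whole text still returns one string, and looping over a string yields one character at a time. A small sketch of the difference, where parse_url_lines is a hypothetical helper name and the sample text stands in for the file contents:

```python
def parse_url_lines(text):
    """Split raw text into a list of non-empty, stripped lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


raw = u"http://www.google.com\nhttp://www.bing.com\n"

# Stripping the whole text still leaves ONE string, so a for loop over it
# visits single characters; the first one is u'h'.
first_item = [item for item in raw.strip()][0]

# Splitting into lines first gives one list entry per URL.
urls = parse_url_lines(raw)
print(urls)
```

Skipping blank lines is an extra assumption here; it avoids handing an empty string to the URL check if the file ends with an empty line.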
Answered on 2013-04-16T16:34:53.517