
I wanted to start a project to learn Python, so I chose to write a simple web proxy.

In some cases, some threads seem to receive an empty request, and Python raises an exception:

first_line:  GET http://racket-lang.org/ HTTP/1.1
Connect to: racket-lang.org 80
first_line:
Exception in thread Thread-2:
Traceback (most recent call last):
  File "C:\Python27\lib\threading.py", line 551, in __bootstrap_inner
    self.run()
  File "C:\Python27\lib\threading.py", line 504, in run
    self.__target(*self.__args, **self.__kwargs)
  File "fakespider.py", line 37, in proxy
    url = first_line.split(' ')[1]
IndexError: list index out of range

first_line: first_line:   GET http://racket-lang.org/plt.css HTTP/1.1GET http://racket-lang.org/more.css HTTP/1.1

Connect to:Connect to:  racket-lang.orgracket-lang.org  8080

My code is quite simple. I don't know what is going on; any help would be greatly appreciated :)

from threading import Thread
from time import time, sleep
import socket
import sys

RECV_BUFFER = 8192
DEBUG = True

def recv_timeout(socks, timeout = 2):
    socks.setblocking(0);
    total_data = []
    data = ''
    begin = time()
    while True:
        if total_data and time() - begin > timeout:
            break
        elif time() - begin > timeout * 2:
            break
        try:
            data = socks.recv(RECV_BUFFER)
            if data:
                total_data.append(data)
                begin = time()
            else:
                sleep(0.1)
        except:
            pass
    return ''.join(total_data)

def proxy(conn, client_addr):
    request = recv_timeout(conn)

    first_line = request.split('\r\n')[0]
    if (DEBUG):
        print "first_line: ", first_line
    url = first_line.split(' ')[1]

    http_pos = url.find("://")
    if (http_pos ==  -1):
        temp = url
    else:
        temp = url[(http_pos + 3):]

    port_pos = temp.find(":")
    host_pos = temp.find("/")
    if host_pos == -1:
        host_pos = len(temp)

    host = ""
    if (port_pos == -1 or host_pos < port_pos):
        port = 80
        host = temp[:host_pos]
    else:
        port = int((temp[(port_pos + 1):])[:host_pos - port_pos - 1])
        host = temp[:port_pos]

    print "Connect to:", host, port

    try:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.connect((host, port))
        s.send(request)

        data = recv_timeout(s)
        if len(data) > 0:
            conn.send(data)
        s.close()
        conn.close()
    except socket.error, (value, message):
        if s:
            s.close()
        if conn:
            conn.close()
        print "Runtime error:", message
        sys.exit(1)



def main():
    if len(sys.argv) < 2:
        print "Usage: python fakespider.py <port>"
        return sys.stdout

    host = ""  # empty string binds to all available interfaces
    port = int(sys.argv[1])

    try:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.bind((host, port))
        s.listen(50)

    except socket.error, (value, message):
        if s:
            s.close()
        print "Could not open socket:", message
        sys.exit(1)

    while 1:
        conn, client_addr = s.accept()
        t = Thread(target=proxy, args=(conn, client_addr))
        t.start()

    s.close()

if __name__ == "__main__":
    main()

1 Answer


The stack trace you are seeing says it all:

url = first_line.split(' ')[1]
  IndexError: list index out of range

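Evidently, splitting the variable first_line does not produce a list with more than one element, as you assumed. So it contains something different from what you expected. To see what it actually contains, just print it: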

print first_line

Or use a debugger.
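
To make the failure concrete, here is a minimal sketch of why the index lookup fails and one way to guard against it. The helper name parse_request_line is hypothetical and not part of your code, and requiring exactly three space-separated fields is an assumption about what a usable request line looks like. In your debug output first_line is empty, which can happen, for example, when a client opens a connection and sends nothing before the timeout expires.

```python
def parse_request_line(first_line):
    """Return (method, url, version), or None if the line is malformed.

    Hypothetical helper: it only checks that the line has exactly three
    space-separated fields; it does not validate their contents.
    """
    parts = first_line.split(' ')
    if len(parts) != 3:
        return None
    return tuple(parts)


# Splitting an empty string gives a one-element list, so index 1 is missing.
print(repr(''.split(' ')))

# An empty request is rejected instead of raising IndexError.
print(parse_request_line(''))

# A well-formed request line is split into its three fields.
print(parse_request_line('GET http://racket-lang.org/ HTTP/1.1'))
```

In proxy(), a None result could be handled by closing conn and returning early, so the thread exits cleanly instead of dying with a traceback.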

Answered 2013-05-09T15:17:01.223