
I am trying to check in a loop whether each of these links is valid; the invalid ones redirect to a standard page.

import urllib2
import csv

# valid_links.csv collects one row per URL that responds
yyy = csv.writer(open('valid_links.csv', 'w'), delimiter=',',
                 quotechar='"', lineterminator="\n")

i = 18509
while i != 0:
    print i
    url = "http://investing.businessweek.com/research/stocks/private/snapshot.asp?privcapId=" + str(i)
    request = urllib2.Request(url)
    request.get_method = lambda: 'HEAD'  # HEAD request: headers only, no body
    response = urllib2.urlopen(request)
    it = response.info()

    # page = urllib2.urlopen(url, timeout=2).geturl()
    yyy.writerow([url, it['Content-Length']])
    i = i + 1

I have more than 200M pages to check. Is there a more efficient way to do this?
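Since the job is almost entirely network I/O, one common way to speed it up is to issue the HEAD requests from a pool of worker threads instead of sequentially. Below is a minimal sketch, assuming Python 3 (`urllib.request` instead of `urllib2`); the function names `build_url`, `check_url`, and `check_range` are illustrative, not from the question. Because `urlopen` follows redirects, comparing `resp.geturl()` against the requested URL is one way to spot IDs that were redirected to the standard page.

```python
# Sketch: concurrent HEAD checks with a thread pool (Python 3).
# build_url / check_url / check_range are hypothetical helper names.
from concurrent.futures import ThreadPoolExecutor
import urllib.request
import urllib.error
import csv

BASE = ("http://investing.businessweek.com/research/stocks/"
        "private/snapshot.asp?privcapId=")

def build_url(privcap_id):
    """Build the snapshot URL for one privcapId."""
    return BASE + str(privcap_id)

def check_url(privcap_id, timeout=5):
    """HEAD one URL; return (url, status, final_url_or_error).

    A redirect to the standard page shows up as a final URL that
    differs from the requested one.
    """
    url = build_url(privcap_id)
    req = urllib.request.Request(url, method='HEAD')
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return url, resp.status, resp.geturl()
    except urllib.error.URLError as exc:
        return url, None, str(exc)

def check_range(start, stop, workers=50):
    """Check a range of IDs concurrently and write results to CSV."""
    with ThreadPoolExecutor(max_workers=workers) as pool, \
         open('valid_links.csv', 'w', newline='') as fh:
        writer = csv.writer(fh)
        for row in pool.map(check_url, range(start, stop)):
            writer.writerow(row)

# Example usage (network access required):
# check_range(18509, 18509 + 1000)
```

With 200M URLs even this will take a long time from one machine; raising `workers` helps only until the server starts throttling, so batching the ID range across several machines (or using an async HTTP client) is the usual next step.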
