I'm trying to use Python to automate Section 508 compliance checking. Our site has a few hundred pages, and at the moment a person goes through the site every week and enters all the URLs by hand. The UIUC link below checks the request's Referer header and returns an evaluation of that page. I can't get the request to work. I've looked all through SO and can't find anything that helps. The code that misbehaves is below, followed by the error message.

def fae(urltofae):
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    #[('Referer': urltofae)]
    r = opener.open('http://www.fae.cita.uiuc.edu/evaluate/link/')
    print r

fae("http://www.example.com/")

And the Error:

  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in fae
  File "/Library/Frameworks/Python.framework/Versions/7.3/lib/python2.7/urllib2.py", line 400, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/7.3/lib/python2.7/urllib2.py", line 418, in _open
    '_open', req)
  File "/Library/Frameworks/Python.framework/Versions/7.3/lib/python2.7/urllib2.py", line 378, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/7.3/lib/python2.7/urllib2.py", line 1207, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/Library/Frameworks/Python.framework/Versions/7.3/lib/python2.7/urllib2.py", line 1177, in do_open
    raise URLError(err)
urllib2.URLError: <urlopen error [Errno 8] nodename nor servname provided, or not known>

When I try to set the Referer header (instead of the User-agent), I get formatting errors before the request is even made, even though the format looks identical to the one it accepted for the User-agent.
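For what it's worth, the commented-out line in the function above writes the pair as `('Referer': urltofae)`, with a colon, while the User-agent pair uses a comma. A colon inside parentheses is not valid tuple syntax, which would explain an error before any request is sent. Below is a minimal sketch of the header list with both entries as `(name, value)` tuples. It assumes Python 3, where `urllib2` became `urllib.request`; the helper name `build_fae_opener` is made up for illustration, and nothing is sent over the network.

```python
import urllib.request


def build_fae_opener(url_to_check):
    """Return an opener whose default headers include a Referer.

    Hypothetical helper: the name and structure are illustrative only.
    """
    opener = urllib.request.build_opener()
    # Each header is a (name, value) tuple: note the comma, not a colon.
    opener.addheaders = [
        ("User-agent", "Mozilla/5.0"),
        ("Referer", url_to_check),
    ]
    return opener


opener = build_fae_opener("http://www.example.com/")
print(opener.addheaders)
```

On Python 2.7 the same list should work with `urllib2.build_opener()`.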

I'm still very much a new programmer so if I'm missing something blatant then I'm terribly sorry, but I have tried everything I can think of. Thanks in advance, cheers.


OK, so I switched my strategy, and it worked. Unfortunately, I have no idea why the code below works while the code above kept raising errors, but I have seen a couple of similar questions (without specific answers) around Google, so I figured I should post it.

vlz, appreciate the help, cheers.

def faeRequest2(urltofae):
    r = urllib2.Request('http://fae.cita.illinois.edu/evaluate/link/', headers={'User-agent':'Mozilla/5.0', 'Referer':urltofae})
    c = urllib2.urlopen(r)
    print c.read()

1 Answer


I can't see any error there. Is the URL correct? Try using

'http://fae.cita.uiuc.edu/evaluate/link/'

instead of

'http://www.fae.cita.uiuc.edu/evaluate/link/'

The latter doesn't seem to lead anywhere.
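The `[Errno 8] nodename nor servname provided, or not known` in the traceback is a name-resolution failure, which fits a hostname that does not exist. One way to check a URL's hostname before making the request is sketched below. It assumes Python 3; `host_resolves` is a hypothetical helper, and its `resolver` parameter exists only so the lookup can be swapped out for testing.

```python
import socket
from urllib.parse import urlsplit


def host_resolves(url, resolver=socket.getaddrinfo):
    """Return True if the hostname in `url` can be resolved, else False.

    Hypothetical helper; `resolver` defaults to the real DNS lookup.
    """
    hostname = urlsplit(url).hostname
    if hostname is None:
        # No host part at all (for example, a string that is not a URL).
        return False
    try:
        resolver(hostname, None)
    except socket.gaierror:
        # The failure that urllib wraps in URLError when the name is unknown.
        return False
    return True
```

Calling `host_resolves(...)` on each of the two URLs above should show which hostname is the problem, though the result depends on the DNS records at the time you run it.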

answered 2013-03-27T06:51:55.367