python - Waiting for a successful response from an API call

Question

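I'm using the Yahoo API. In addition to a hard-coded sleep I've also implemented a random sleep method, but I still can't figure out how to wait or retry when I don't get a response on the first attempt.

For example, the code below fails for some users, seemingly at random. After a failure, I open the same URL in my browser and it works like a charm. So my question is: why does this happen, and how can I fix it? Or can I improve this code so that it issues another request after a hard sleep (only if that is a good approach)?

I forgot to add one more piece of information. I changed the code to print the HTTP status code: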

print urlobject.getcode()

It returns 200 but no JSON, which matters because some people suggested this might be throttling.

Note: I have removed my appid (key) from the URL.

# return the json question for given question id
def returnJSONQuestion(questionId):
    randomSleep()
    url = 'http://answers.yahooapis.com/AnswersService/V1/getQuestion?appid=APPIDREMOVED8&question_id={0}&output=json'
    format_url = url.format(questionId)
    try:
        request = urllib2.Request(format_url)
        urlobject = urllib2.urlopen(request)
        time.sleep(10)
        jsondata = json.loads(urlobject.read().decode("utf-8"))
        print jsondata
    except urllib2.HTTPError, e:
        print e.code
        logging.exception("Exception")
    except urllib2.URLError, e:
        print e.reason
        logging.exception("Exception")
    except(json.decoder.JSONDecodeError,ValueError):
        print 'Question ID ' + questionId + ' Decode JSON has failed'
        logging.info("This qid didn't work " + questionId)
    return jsondata

score 5 · Accepted Answer

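OK, first, a few points that don't directly answer your question but may help:

1) I'm fairly sure you never need to wait between calling urllib2.urlopen and reading the addinfourl object it returns. The examples at http://docs.python.org/library/urllib2.html#examples don't contain any such sleep.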

2)

json.loads(urlobject.read().decode("utf-8"))

can be simplified to

json.load(urlobject)

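This is simpler and more readable. Basically, .load takes a file-like object as its argument, while .loads takes a string. You might think you have to read() first in order to decode the data from UTF-8, but that isn't actually a problem, because .load by default assumes the object it reads is ASCII or UTF-8 encoded (see http://docs.python.org/library/json.html#json.load).

3) It probably doesn't matter for your current purposes, but I think your exception handling here is poor. If anything goes wrong inside the 'try:' block, the variable jsondata is never assigned. Then, when we try to return it after the try/except block ends, a NameError is raised for using an unassigned variable. This means that if some other function in your application calls returnJSONQuestion and an exception occurs, the outer function sees a NameError rather than the original exception, and any traceback the outer function produces won't point to where the real problem occurred. That can easily cause confusion when you're trying to work out what went wrong. So it would be better if all of your 'except' blocks ended with 'raise'.

4) In Python, it's better to put the comment describing what a function does in a docstring (see http://www.python.org/dev/peps/pep-0257/#what-is-a-docstring) rather than in a comment above the function.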
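To make points 2 and 3 concrete, here is a small sketch. It is written in Python 3 syntax (an assumption; the code in this thread is Python 2), and the function names parse_swallowing and parse_reraising, as well as the in-memory io.BytesIO object standing in for the HTTP response, are made up for illustration. Strictly speaking, the error raised for the unassigned local variable is UnboundLocalError, which is a subclass of NameError.

```python
import io
import json
import logging


def parse_swallowing(fileobj):
    """Mimics the question's pattern: log the error, then fall through."""
    try:
        data = json.load(fileobj)  # json.load reads from a file-like object
    except ValueError:
        logging.info("decode failed")
    return data  # 'data' is unassigned here if json.load raised


def parse_reraising(fileobj):
    """Same parsing, but the handler ends with a bare raise."""
    try:
        data = json.load(fileobj)
    except ValueError:
        logging.info("decode failed")
        raise  # the caller sees the original decoding error
    return data


# A well-formed body parses without calling read() or decode() first.
print(parse_reraising(io.BytesIO(b'{"id": 1}')))

# A malformed body shows which exception each variant lets the caller see.
for func in (parse_swallowing, parse_reraising):
    try:
        func(io.BytesIO(b"not json"))
    except Exception as exc:
        print(func.__name__, "->", type(exc).__name__)
```

With the first variant the caller only learns that a variable was unassigned; with the second the caller gets the actual JSON decoding error.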

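Anyway, to actually answer your question...

You can get seemingly random URLErrors when trying to open a URL, for all sorts of reasons. There may have been a bug on the server while it was processing your request; there may have been a connection problem and some data was lost; maybe the server was down for a few seconds while one of its admins changed a setting or pushed an update; maybe it was something else entirely. After doing some web development I've noticed that some servers are more reliable than others, but I think that for most practical purposes you probably don't need to worry about why. The simplest thing to do is to retry the request until it succeeds.

With all of the above in mind, the code below will probably do what you need: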

def returnJSONQuestion(questionId):
    """return the json question for given question id"""

    url = 'http://answers.yahooapis.com/AnswersService/V1/getQuestion?appid=APPIDREMOVED8&question_id={0}&output=json'
    format_url = url.format(questionId)
    try:
        request = urllib2.Request(format_url)

        # Try to get the data and json.load it up to 5 times, then give up
        tries = 5
        while True:
            try:
                urlobject = urllib2.urlopen(request)
                jsondata = json.load(urlobject)
                print jsondata
                return jsondata
            except Exception:
                tries -= 1
                if tries == 0:
                    # If we keep failing, re-raise the exception for the outer
                    # exception handling to deal with
                    raise
                # Wait a few seconds before retrying and hope the problem goes away
                time.sleep(3)

    except urllib2.HTTPError, e:
        print e.code
        logging.exception("Exception")
        raise
    except urllib2.URLError, e:
        print e.reason
        logging.exception("Exception")
        raise
    except ValueError:
        # The standard json module in Python 2 reports malformed input as ValueError
        print 'Question ID ' + str(questionId) + ' Decode JSON has failed'
        logging.info("This qid didn't work " + str(questionId))
        raise

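Hope this helps! If you're going to make many different web requests in your program, you may want to abstract this "retry the request on exception" logic into a function, so that the boilerplate retry logic isn't mixed in with everything else. :)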
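One possible shape for such an abstraction is sketched below. It is written in Python 3 syntax (an assumption; the code above is Python 2), and the helper name retry, its parameters, and the flaky stand-in function are all hypothetical, chosen only for illustration.

```python
import time


def retry(func, attempts=5, delay=3.0, exceptions=(Exception,), sleep=time.sleep):
    """Call func() until it succeeds or `attempts` calls have failed."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the last error
            sleep(delay)  # wait before the next attempt


# Stand-in for a flaky request: fails twice, then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise IOError("temporary failure")
    return {"ok": True}


result = retry(flaky, attempts=5, delay=0)
print(result, "after", calls["count"], "calls")
```

A real caller would pass a function that opens the URL and parses the JSON, for example a lambda wrapping the request, and could narrow `exceptions` to the errors it considers transient.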

score 2 · Accepted Answer

I've run into this kind of problem a lot. I usually implement my API request wrapper or browser-style "get" like this:

def get_remote(url, attempt=0):
    # MAX_ATTEMPTS is a constant you define yourself, e.g. at module level
    try:
        request = urllib2.Request(url)
        urlobject = urllib2.urlopen(request)
        ...
        return data
    except urllib2.HTTPError, error:
        if error.code in (403, 404):
            if attempt < MAX_ATTEMPTS:
                return get_remote(url, attempt=attempt + 1)
        raise
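Depending on the URL or the attempt number, I also change the request parameters. For example, some websites block clients that identify themselves as Python, so if the URL matches a regex I swap the user agent for Firefox's. Or: if the first attempt fails, I might always try Firefox/Safari on the second request, or add a random delay between subsequent attempts.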
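A rough sketch of those two tweaks follows, in Python 3 syntax (an assumption; the snippet above is Python 2). The helper names build_request and backoff_delay, the user-agent string, and the example URL are hypothetical, and nothing here actually touches the network.

```python
import random
import urllib.request

# Hypothetical values, for illustration only.
BROWSER_USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0"
MAX_ATTEMPTS = 3


def build_request(url, attempt):
    """Return a Request; from the second attempt on, send a browser User-Agent."""
    headers = {}
    if attempt > 0:
        headers["User-Agent"] = BROWSER_USER_AGENT
    return urllib.request.Request(url, headers=headers)


def backoff_delay(attempt, base=1.0, jitter=2.0):
    """Seconds to wait before an attempt: none at first, then base plus random jitter."""
    if attempt == 0:
        return 0.0
    return base + random.uniform(0, jitter)


for attempt in range(MAX_ATTEMPTS):
    request = build_request("http://example.com/api", attempt)
    # Request stores header names capitalized, hence "User-agent" in the lookup.
    print(attempt, request.get_header("User-agent"), round(backoff_delay(attempt), 2))
```

In a wrapper like get_remote, the request would be built with the current attempt number and the delay applied before each retry.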

score 1 · Accepted Answer

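I don't know why it fails. It may be some Yahoo limit (or maybe not), but in any case it's a good idea to save the question IDs that cause failures and retry them later.

That's easy to do. Modify the function slightly: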

def returnJSONQuestion(questionId):
    randomSleep()
    jsondata = None
    url = 'http://answers.yahooapis.com/AnswersService/V1/getQuestion?appid=APPIDREMOVED8&question_id={0}&output=json'
    format_url = url.format(questionId)
    try:
        request = urllib2.Request(format_url)
        urlobject = urllib2.urlopen(request)
        time.sleep(10)
        jsondata = json.loads(urlobject.read().decode("utf-8"))
        print jsondata
    except urllib2.HTTPError, e:
        print e.code
        logging.exception("Exception")
    except urllib2.URLError, e:
        print e.reason
        logging.exception("Exception")
    except ValueError:
        print 'Question ID ' + str(questionId) + ' Decode JSON has failed'
        logging.info("This qid didn't work " + str(questionId))
    return jsondata

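This function returns None on any failure. So you can check the result and, if it is None, store the question ID in a list and retry it about 3 more times. Maybe you'll have better luck the second time.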
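The bookkeeping for that could look roughly like the sketch below. It is in Python 3 syntax (an assumption; the function above is Python 2), and fetch_all, its parameters, and the fake_fetch stand-in for returnJSONQuestion are hypothetical names used only for illustration.

```python
def fetch_all(question_ids, fetch, max_rounds=3):
    """Fetch every id; ids whose fetch returned None are retried in later rounds."""
    results = {}
    pending = list(question_ids)
    for _ in range(max_rounds):
        failed = []
        for qid in pending:
            data = fetch(qid)
            if data is None:
                failed.append(qid)  # remember it for the next round
            else:
                results[qid] = data
        pending = failed
        if not pending:
            break
    return results, pending  # pending holds ids that never succeeded


# Stand-in for returnJSONQuestion: id "b" fails on its first call only.
seen = set()


def fake_fetch(qid):
    if qid == "b" and qid not in seen:
        seen.add(qid)
        return None
    return {"id": qid}


results, still_failing = fetch_all(["a", "b", "c"], fake_fetch)
print(sorted(results), still_failing)
```

Retrying in rounds also gives the failing IDs some time before they are requested again, since the other IDs are processed first.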

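Of course, you could also modify the function so that it retries the request a few times itself when an error occurs, but the first solution looks preferable to me.

By the way, setting the 'User-Agent' header to a real browser value is usually a good idea in cases like this too. Google, for example, often doesn't return results to such "robo-parsers".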
