python - Correct way to try/except using the Python requests module?

Question

try:
    r = requests.get(url, params={'s': thing})
except requests.ConnectionError, e:
    print e #should I also sys.exit(1) after this?

Is this correct? Is there a better way to structure this? Will this cover all my bases?

score 1113 · Accepted Answer

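Have a look at the Requests exception docs. In short:

In the event of a network problem (e.g. DNS failure, refused connection, etc.), Requests will raise a ConnectionError exception.

In the event of the rare invalid HTTP response, Requests will raise an HTTPError exception.

If a request times out, a Timeout exception is raised.

If a request exceeds the configured number of maximum redirections, a TooManyRedirects exception is raised.

All exceptions that Requests explicitly raises inherit from requests.exceptions.RequestException.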
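As a quick sanity check of the hierarchy described above, the sketch below confirms that the four exception classes named so far all derive from requests.exceptions.RequestException. It only inspects classes and makes no network call. The names NAMED_EXCEPTIONS and derives_from_base are made up for this illustration.

```python
import requests

# The four exception classes mentioned in the summary above.
NAMED_EXCEPTIONS = (
    requests.exceptions.ConnectionError,
    requests.exceptions.HTTPError,
    requests.exceptions.Timeout,
    requests.exceptions.TooManyRedirects,
)


def derives_from_base(exc_class):
    """Return True if exc_class inherits from RequestException."""
    return issubclass(exc_class, requests.exceptions.RequestException)


for exc_class in NAMED_EXCEPTIONS:
    print(exc_class.__name__, derives_from_base(exc_class))
```

Because every class passes the check, a single `except requests.exceptions.RequestException` clause is enough to catch any of them.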

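To answer your question, what you show will not cover all of your bases. You'll only catch connection-related errors, not ones that time out.

What to do when you catch the exception is really up to the design of your script/program. Is it acceptable to exit? Can you go on and try again? If the error is catastrophic and you can't go on, then yes, you may abort your program by raising SystemExit (a nice way to both print an error and call sys.exit).

You can either catch the base-class exception, which will handle all cases: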

try:
    r = requests.get(url, params={'s': thing})
except requests.exceptions.RequestException as e:  # This is the correct syntax
    raise SystemExit(e)
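To see what raising SystemExit with a message actually does, here is a small sketch that runs a one-line child script and inspects the result. The child script and its message are invented for the demonstration, and no Requests call is involved.

```python
import subprocess
import sys

# Run a tiny child script that bails out the same way the handler above does.
child = "raise SystemExit('fatal: could not reach server')"
result = subprocess.run(
    [sys.executable, "-c", child],
    capture_output=True,
    text=True,
)

print(result.returncode)      # exit status of the child process
print(result.stderr.strip())  # message the child wrote to stderr
```

When SystemExit carries a non-integer object, the interpreter prints it to stderr and exits with status 1, which is why it works as a one-line "print and exit".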

Or you can catch them separately and do different things.

try:
    r = requests.get(url, params={'s': thing})
except requests.exceptions.Timeout:
    # Maybe set up for a retry, or continue in a retry loop
    pass  # placeholder so the block is valid Python
except requests.exceptions.TooManyRedirects:
    # Tell the user their URL was bad and try a different one
    pass  # placeholder so the block is valid Python
except requests.exceptions.RequestException as e:
    # catastrophic error. bail.
    raise SystemExit(e)
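The "retry loop" mentioned in the Timeout branch could look roughly like the sketch below. The function name get_with_retries and its parameters (attempts, backoff, get) are assumptions made for illustration and are not part of Requests. The get parameter exists only so the loop can be exercised without a network.

```python
import time

import requests


def get_with_retries(url, attempts=3, timeout=5, backoff=1.0,
                     get=requests.get, **kwargs):
    """Retry on Timeout only; let every other exception propagate."""
    for attempt in range(1, attempts + 1):
        try:
            return get(url, timeout=timeout, **kwargs)
        except requests.exceptions.Timeout:
            if attempt == attempts:
                raise  # out of retries: let the caller decide what to do
            time.sleep(backoff * attempt)  # simple linear backoff
```

A caller would still wrap this in `except requests.exceptions.RequestException` to handle the final Timeout and anything else that is not worth retrying.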

As Christian pointed out:

If you want http errors (e.g. 401 Unauthorized) to raise exceptions, you can call Response.raise_for_status. That will raise an HTTPError if the response was an http error.

An example:

try:
    r = requests.get('http://www.google.com/nothere')
    r.raise_for_status()
except requests.exceptions.HTTPError as err:
    raise SystemExit(err)

Will print:

404 Client Error: Not Found for url: http://www.google.com/nothere

score 163 · Accepted Answer

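One additional suggestion: be explicit. It seems best to go from specific to general down the stack of except clauses, so that the specific errors you want to catch are not masked by a general one.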

url='http://www.google.com/blahblah'

try:
    r = requests.get(url,timeout=3)
    r.raise_for_status()
except requests.exceptions.HTTPError as errh:
    print ("Http Error:",errh)
except requests.exceptions.ConnectionError as errc:
    print ("Error Connecting:",errc)
except requests.exceptions.Timeout as errt:
    print ("Timeout Error:",errt)
except requests.exceptions.RequestException as err:
    print ("OOps: Something Else",err)

Http Error: 404 Client Error: Not Found for url: http://www.google.com/blahblah

vs.

url='http://www.google.com/blahblah'

try:
    r = requests.get(url,timeout=3)
    r.raise_for_status()
except requests.exceptions.RequestException as err:
    print ("OOps: Something Else",err)
except requests.exceptions.HTTPError as errh:
    print ("Http Error:",errh)
except requests.exceptions.ConnectionError as errc:
    print ("Error Connecting:",errc)
except requests.exceptions.Timeout as errt:
    print ("Timeout Error:",errt)     

OOps: Something Else 404 Client Error: Not Found for url: http://www.google.com/blahblah
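Ordering also matters between sibling classes. As far as I can tell from the Requests source, ConnectTimeout inherits from both ConnectionError and Timeout, so whichever of those two handlers comes first will catch it. The sketch below uses a made-up helper, classify, that re-raises an exception instance and reports which clause caught it. Here Timeout is deliberately listed before ConnectionError so that all timeouts are grouped together.

```python
import requests


def classify(exc):
    """Re-raise exc and report which handler catches it."""
    try:
        raise exc
    except requests.exceptions.HTTPError:
        return "http"
    except requests.exceptions.Timeout:
        return "timeout"
    except requests.exceptions.ConnectionError:
        return "connection"
    except requests.exceptions.RequestException:
        return "other"


print(classify(requests.exceptions.ConnectTimeout()))
print(classify(requests.exceptions.ReadTimeout()))
print(classify(requests.exceptions.ConnectionError()))
print(classify(requests.exceptions.TooManyRedirects()))
```

With the ConnectionError clause placed first instead, a connect timeout would be reported as a connection error, which may or may not be what you want.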

score 36 · Accepted Answer

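The exception object also contains the original response as e.response, which can be useful if you need to see the error body in the server's response. For example: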

try:
    # a URL without a scheme would raise MissingSchema before any request is sent
    r = requests.post('https://somerestapi.com/post-here', data={'birthday': '9/9/3999'})
    r.raise_for_status()
except requests.exceptions.HTTPError as e:
    print(e.response.text)
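A slightly more defensive variant is sketched below. The helper name describe_http_error is hypothetical, and the Response object is built by hand purely so the example runs offline. In real code the response would come from requests.get or requests.post.

```python
import requests


def describe_http_error(err):
    """Build a short message from an HTTPError, tolerating a missing response."""
    response = err.response
    if response is None:
        return f"HTTP error without a response: {err}"
    kind = "client" if 400 <= response.status_code < 500 else "server"
    body = response.text[:200]  # keep log lines short
    return f"{kind} error {response.status_code}: {body}"


# Offline demonstration: build a Response by hand instead of hitting a server.
fake = requests.Response()
fake.status_code = 404
fake.reason = "Not Found"
fake.url = "http://example.invalid/missing"

try:
    fake.raise_for_status()
except requests.exceptions.HTTPError as e:
    message = describe_http_error(e)
    print(message)
```

The None check is there because an HTTPError can be constructed without a response attached, even though raise_for_status always supplies one.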

score 2 · Accepted Answer

Here is a generic way of doing things, which at least means you don't have to surround each and every requests call with try ... except:

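import logging
import traceback

import requests

logger = logging.getLogger(__name__)  # assumes logging is configured elsewhere


def requests_call(verb, *args, **kwargs):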
    response = None
    exception = None
    try:
        response = requests.request(verb, *args, **kwargs)
    except BaseException as e:
        raw_tb = traceback.extract_stack()
        if 'data' in kwargs:
            if len(kwargs['data']) > 500: # anticipate giant data string
                kwargs['data'] = f'{kwargs["data"][:500]}...' 
        msg = f'{e.__class__.__name__}: {e}\n' \
            + f'verb |{verb}| args {args} kwargs {kwargs}\n\n' \
            + 'Stack trace:\n' + ''.join(traceback.format_list(raw_tb[:-1]))
        logger.error(msg) # see note about logger.exception(..) below           
        exception = e
    return (response, exception)

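Notes

Be aware of ConnectionError, which is a builtin and has nothing to do with the class requests.ConnectionError*. I assume the latter is more common in this context, but I have no real idea...
When examining a non-None returned exception, note that requests.RequestException, the superclass of all the requests exceptions (including requests.ConnectionError), is not ~~"requests.exceptions.RequestException"~~ according to the docs. Maybe it has changed since the accepted answer.**
Obviously this assumes a logger has been configured. Calling logger.exception in the except block might seem like a good idea, but that would only give the stack within this function! Instead, get the trace leading up to the call to this function, then log it (with details of the exception and of the call that caused the problem).

*I looked at the source code: requests.ConnectionError subclasses the single class requests.RequestException, which subclasses the single class IOError (builtin).

**However, at the time of writing (February 2022), you find "requests.exceptions.RequestException" at the bottom of this page... but it links to the above page: confusing.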
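The class relationships discussed in these notes can be checked directly. The sketch below only inspects classes, and the results are what I would expect from the Requests source rather than anything stated in the documentation.

```python
import builtins

import requests

# The builtin ConnectionError and the Requests one are unrelated classes.
same_class = requests.ConnectionError is builtins.ConnectionError
is_builtin_subclass = issubclass(requests.ConnectionError, builtins.ConnectionError)

# Both spellings of the Requests base class refer to the same object.
same_base = requests.RequestException is requests.exceptions.RequestException

# The Requests base class derives from IOError, an alias of OSError in Python 3.
derives_from_oserror = issubclass(requests.RequestException, OSError)

print(same_class, is_builtin_subclass, same_base, derives_from_oserror)
```

In practice this means `except ConnectionError:` (the builtin) will not catch a requests.ConnectionError, while `except OSError:` will.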

Usage is simple:

search_response, exception = utilities.requests_call('get',
    f'http://localhost:9200/my_index/_search?q={search_string}')

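First, you check the response: if it is None, something funny has happened and you will have an exception, which has to be acted on in some way depending on context (and on the exception). In GUI applications (PyQt5) I usually implement a "visual log" to give some output to the user (while also logging to the log file), but messages added there should be non-technical. So something like this might typically follow: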

if search_response is None:
    # you might check here for (e.g.) a requests.Timeout, tailoring the message
    # accordingly, as the kind of error anyone might be expected to understand
    msg = f'No response searching on |{search_string}|. See log'
    MainWindow.the().visual_log(msg, log_level=logging.ERROR)
    return
response_json = search_response.json()
if search_response.status_code != 200: # NB 201 ("created") may be acceptable sometimes... 
    msg = f'Bad response searching on |{search_string}|. See log'
    MainWindow.the().visual_log(msg, log_level=logging.ERROR)
    # usually response_json will give full details about the problem
    log_msg = f'search on |{search_string}| bad response\n{json.dumps(response_json, indent=4)}'
    logger.error(log_msg)
    return

# now examine the keys and values in response_json: these may of course 
# indicate an error of some kind even though the response returned OK (status 200)...

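Given that the stack trace is logged automatically, you often need no more than that...

However, to cross the Ts:

If, as above, an exception gives the message "No response" and a non-200 status gives "Bad response", I suggest that

a missing expected key in the response's JSON structure should give rise to the message "Anomalous response"
an out-of-range or strange value should give the message "Unexpected response"
and the presence of a key such as "error" or "errors", with value True or whatever, should give the message "Error response"

These may or may not prevent the code from continuing.

... and in fact, to my mind, it is worth making the process even more generic. For me, this next function typically cuts the 20 lines of code that use requests_call above down to about 3, and standardises most of your handling and log messages. With more than a handful of requests calls in a project, the code otherwise becomes seriously bloated: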

def process_requests_call(verb, *args, **kwargs):
    # pull our own options out of kwargs so they are not passed on to requests
    call_name = kwargs.pop('call_name', '')
    required_keys = kwargs.pop('required_keys', {})
    acceptable_statuses = kwargs.pop('acceptable_statuses', [200])
        
    def log_error(response_type, *handler_args):
        # handler_args is named so that it does not shadow the outer call's args, which are logged below
        if response_type == 'No': # i.e. an exception was raised...
            if isinstance(handler_args[0], requests.Timeout):
                MainWindow.the().visual_log(f'Time out of {call_name} before response received!', logging.ERROR)
                return
        else:
            # if we get here no exception has been raised, so no stack trace has yet been logged.
            # a response has been returned, but is either "Bad" or "Anomalous"
            raw_tb = traceback.extract_stack()
            if 'data' in kwargs and len(kwargs['data']) > 500: # anticipate giant data string
                kwargs['data'] = f'{kwargs["data"][:500]}...'
            call_and_response_details = f'{response_type} response\n{response.added_message}\n' \
                + f'verb |{verb}| args {args}, kwargs {kwargs}\nresponse:\n{json.dumps(response_json, indent=4)}'
            logger.error(f'{call_and_response_details}\nStack trace: {"".join(traceback.format_list(raw_tb[:-1]))}')
            del response.added_message
        MainWindow.the().visual_log(f'{response_type} response {call_name}. See log.', logging.ERROR)
        
    # callers may supply their own handler; log_error is the default
    exception_handler = kwargs.pop('exception_handler', log_error)
        
    response, exception = requests_call(verb, *args, **kwargs)
    
    if response is None:
        exception_handler('No', exception)
        return (False, exception)
    
    response_json = response.json()
    
    status_ok = response.status_code in acceptable_statuses
    if not status_ok:
        response.added_message = ''
        log_error('Bad')
        return (False, response)
    
    def check_keys(req_dict_structure, response_dict_structure):
        if not isinstance(response_dict_structure, dict):
            response.added_message = f'response_dict_structure not dict: {type(response_dict_structure)}\n'
            return False
        for dict_key in req_dict_structure.keys():
            if dict_key not in response_dict_structure:
                response.added_message = f'key |{dict_key}| missing\n'
                return False
            req_value = req_dict_structure[dict_key]
            response_value = response_dict_structure[dict_key]
            if isinstance(req_value, dict):
                if not check_keys(req_value, response_value):
                    return False
            elif isinstance(req_value, list):
                if not isinstance(response_value, list):
                    response.added_message = f'key |{dict_key}| not list: {type(response_value)}\n'
                    return False
                # list elements must be strings (required keys) or dicts (nested structures)
                for req_list_element, resp_list_element in zip(req_value, response_value):
                    if isinstance(req_list_element, dict):
                        if not check_keys(req_list_element, resp_list_element):
                            return False
                    # elif, so that a dict element is not rejected by the string check
                    elif not isinstance(req_list_element, str):
                        response.added_message = f'req_list_element not string: {type(req_list_element)}\n'
                        return False
                    elif req_list_element not in response_value:
                        response.added_message = f'key |{req_list_element}| missing\n'
                        return False
            # use None as a dummy value (otherwise something like {'my_key'} would be seen as a set, not a dict)
            elif req_value is not None:
                response.added_message = f'required value of key |{dict_key}| must be None (dummy), dict or list: {type(req_value)}\n'
                return False
                
        return True
    check_result = check_keys(required_keys, response_json)
    if not check_result:
        log_error('Anomalous')
    return (check_result, response)

Example call:

success, deliverable = utilities.process_requests_call('get', 
    f'{ES_URL}{INDEX_NAME}/_doc/1', 
    call_name=f'getting status doc for index {INDEX_NAME}',
    required_keys={'_source':{'status_text': None}})
if not success: return False
# here, we know the deliverable is a response, not an exception
# we also don't need to check for the keys being present
index_status = deliverable.json()['_source']['status_text']
if index_status != 'successfully completed':
    # ... i.e. an example of a 200 response, but an error nonetheless
    msg = f'Error response: ES index {INDEX_NAME} does not seem to have been built OK: cannot search'
    MainWindow.the().visual_log(msg)
    logger.error(f'index |{INDEX_NAME}|: deliverable.json() {json.dumps(deliverable.json(), indent=4)}')
    return False
...

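So, for example, in the case of a missing key, the "visual log" message seen by the user would be "Anomalous response getting status doc for index XYZ. See log." (and the log would show the key in question).

Notes

The required_keys dict can be nested to any depth.
Finer-grained exception handling can be accomplished by including a function exception_handler in kwargs (though don't forget that requests_call will have logged the call details, the exception type and __str__, and the stack trace).
In the above I also implement a check on the key "data" in any kwargs that may be logged. This is because a bulk operation (e.g. populating an index in the case of Elasticsearch) can consist of enormous strings. So curtail it to the first 500 characters, for example.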
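To illustrate just the nested required_keys idea in isolation, here is a much smaller sketch than check_keys above. The function name first_missing_key and the dotted-path return value are assumptions made for this example. It handles only nested dicts (not lists) and has no logging.

```python
def first_missing_key(required, data, path=""):
    """Return the dotted path of the first required key absent from data, else None.

    `required` maps key names to None (a leaf) or to a nested dict of the same shape.
    """
    if not isinstance(data, dict):
        return path or "<root>"  # expected a dict at this level
    for key, sub_required in required.items():
        here = f"{path}.{key}" if path else key
        if key not in data:
            return here
        if isinstance(sub_required, dict):
            missing = first_missing_key(sub_required, data[key], here)
            if missing is not None:
                return missing
    return None


required = {"_source": {"status_text": None}}
print(first_missing_key(required, {"_source": {"status_text": "ok"}}))
print(first_missing_key(required, {"_source": {}}))
```

Returning the path of the offending key, rather than a bare True/False, makes it easy to put the key name into the log message.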

PS Yes, I do know about the elasticsearch Python module (a "thin wrapper" around requests). All of the above is for illustration purposes only.
