python - PyGithub: making full use of the API rate limit

Question

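I am trying to use the PyGithub library to access GitHub's v3 API. The library is easy to use, but I find its documentation very vague.

Below, I successfully fetch the contents of a file given its path and sha. My ultimate goal is to cut the number of API calls from 3 down to just 1, because I want to get the most out of the 5000 API calls allowed per hour.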

from github import Github
gh = Github(access_token) # I supply an access token here.
user = gh.get_user(owner_name) # This takes 1 API call
repo = user.get_repo(repo_name) # This takes 1 API call


file = repo.get_file_contents(filename, ref=sha) # This takes 1 API call

Does anyone know how I can pass the repo and owner name to get_file_contents(), or a similar function I could use to achieve this?

Any help is appreciated.

score 1 · Accepted Answer

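Does anyone know how I can pass the repo and owner name to get_file_contents()

Given the current implementation of get_file_contents, it expects:

a GithubObject (which requires an API call)
or a string (which costs no API call)

But both depend on a Repository instance, and obtaining one does involve an API call.
So it would be best if you could make your process long-lived and reuse the repository within a single execution session.

However, that will not help if you have to fetch files from many different repositories.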
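
The reuse idea above could be sketched as a small memoizing wrapper. This is only an illustration under assumptions: RepoCache is a hypothetical helper (not part of PyGithub), and it assumes the client exposes get_repo(full_name) the way PyGithub's Github object does. Each distinct repository still costs one call on first use, so it pays off only when many files come from the same few repositories.

```python
class RepoCache:
    """Hypothetical helper: remember Repository objects by "owner/repo" name."""

    def __init__(self, client):
        # `client` is assumed to be a PyGithub Github instance, or any object
        # with a compatible get_repo(full_name) method.
        self._client = client
        self._repos = {}

    def get(self, full_name):
        # Only the first lookup of a given repository reaches the API;
        # later lookups return the stored object.
        if full_name not in self._repos:
            self._repos[full_name] = self._client.get_repo(full_name)
        return self._repos[full_name]
```

With repos = RepoCache(gh), a call such as repos.get(owner_name + '/' + repo_name).get_file_contents(filename, ref=sha) would then spend one API call per file once the repository has been looked up.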

score 1 · Accepted Answer

You can reduce this from 3 API calls to 2 by calling get_repo() with a name of the form "owner_name/repo_name":

from github import Github
gh = Github(access_token) # I supply an access token here.
repo = gh.get_repo(owner_name+'/'+repo_name) # This takes 1 API call

file = repo.get_file_contents(filename, ref=sha) # This takes 1 API call

I am just mentioning this here for future reference. In practice, I ended up using the requests library and building my own API calls.

Like this:

import requests
# Import python's base64 decoder
from base64 import b64decode as decode

def GET(owner_repo,filename,sha,access_token):
    # Supply Headers
    headers = {'content-type': 'application/json', 'Authorization':'token '+access_token}
    # This function is stable so long as the github API structure does not change. 
    # Also I am using the previously mentioned combo of owner/repo.
    url = 'https://api.github.com/repos/%s/contents/%s?ref=%s' % (owner_repo, filename, sha)
    # Let's stay within the API rate limits
    url_rate_limit = 'https://api.github.com/rate_limit'
    r = requests.get(url_rate_limit, headers=headers)
    rate_limit = int(r.json()['resources']['core']['remaining'])
    if(rate_limit > 0):
        # The actual request
        r = requests.get(url, headers=headers)
        # Note: you will need to handle other status codes yourself;
        # this only checks for 200 OK.
        if(r.status_code == 200):
            content = r.json()['content']
            # Decode base64 content
            decoded_content = decode(content)
            return decoded_content

I license the above code under the MIT License.
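
As a possible refinement of the rate-limit check above, GitHub also reports the remaining quota on every API response through the X-RateLimit-Remaining and X-RateLimit-Reset headers, so a separate request to /rate_limit is not strictly needed. The sketch below is a minimal illustration, not part of the original answer; seconds_to_wait is a hypothetical helper name, and it assumes the headers are passed in as a dict-like object such as r.headers from requests.

```python
import time


def seconds_to_wait(response_headers, now=None):
    """Return how long to pause before the next request (0 if quota remains)."""
    # Missing headers are treated as "quota available" (an assumption).
    remaining = int(response_headers.get('X-RateLimit-Remaining', 1))
    if remaining > 0:
        return 0
    # X-RateLimit-Reset is the reset time in UTC epoch seconds.
    reset_at = int(response_headers.get('X-RateLimit-Reset', 0))
    now = time.time() if now is None else now
    return max(0, reset_at - now)
```

A caller could run time.sleep(seconds_to_wait(r.headers)) after each response instead of querying the rate limit before each one.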

score 0 · Accepted Answer

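The GitHub API supports conditional requests, and cache hits do not count against the rate limit:

Making a conditional request and receiving a 304 response does not count against your rate limit, so we encourage you to use it whenever possible.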
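
A conditional request of this kind can be sketched by hand with an ETag store. This is a minimal illustration under assumptions: conditional_get and the etags dict are hypothetical names, and session is assumed to behave like a requests.Session (a get() method returning an object with status_code, headers and json()).

```python
def conditional_get(session, url, etags, headers=None):
    """Fetch `url`, sending If-None-Match when an ETag was seen before.

    `etags` is a plain dict mapping url -> (etag, cached_body).
    """
    request_headers = dict(headers or {})
    cached = etags.get(url)
    if cached is not None:
        request_headers['If-None-Match'] = cached[0]

    response = session.get(url, headers=request_headers)

    if response.status_code == 304 and cached is not None:
        # Not modified: reuse the body stored from the earlier 200 response.
        return cached[1]

    if response.status_code == 200:
        body = response.json()
        etag = response.headers.get('ETag')
        if etag:
            etags[url] = (etag, body)
        return body

    # Other status codes are left to the caller in this sketch.
    return None
```

The dict keeps entries only in memory; a persistent store would be needed for the saving to carry over between runs.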

However, PyGithub does not implement caching:

https://github.com/PyGithub/PyGithub/issues/585

It is possible with github3.py, though:

https://github.com/sigmavirus24/github3.py/issues/75#issuecomment-128345063

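There are some packages that add caching to requests:

requests-cache, which has a global patching mechanism but does not yet support HTTP preconditions
cachecontrol, which has no global patching mechanism, but I managed to integrate it with PyGithub by patching some internals: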

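import github
from cachecontrol import CacheControl
from cachecontrol.caches.file_cache import FileCache

gh = github.Github(token)  # token: your access token
# Note: this relies on private PyGithub attributes and may break between versions.
class CachingConnectionClass(gh._Github__requester._Requester__connectionClass):
    def __init__(self, *args, **kwargs):
        # Run the parent __init__ first so that self.session exists.
        super(CachingConnectionClass, self).__init__(*args, **kwargs)
        self.session = CacheControl(self.session,
                                    cache=FileCache('.github-cache'))
gh._Github__requester._Requester__connectionClass = CachingConnectionClass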
