python - How do I perform a job.insert in BigQuery using Python? I get "Login Required", but can list all tables and datasets

Question

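I have been trying to get the API client working in Ruby to run an insert job that takes data from Cloud Storage and puts it into a table in BigQuery, without much success. In the past I have looked at the Python API and got things working in Ruby, but this one has me stumped.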

import httplib2
import pprint
import urllib2

from apiclient.discovery import build
from oauth2client.client import SignedJwtAssertionCredentials


def loadTable(service, projectId, datasetId, targetTableId):
  try:
    jobCollection = service.jobs()
    jobData = {
      'projectId': XXXXXXXXX,
      'configuration': {
          'load': {
            'sourceUris': ["gs://person-bucket/person_json.tar.gz"],
            'schema': {
              'fields': [
                  { 'name': 'person_id', 'type': 'integer' },
                  { 'name': 'person_name', 'type': 'string' },
                  { 'name': 'logged_in_at', 'type': 'timestamp' },
                ]
            },
            'destinationTable': {
              'projectId': XXXXXXXXX,
              'datasetId': 'personDataset',
              'tableId': 'person'
            },
          }
        }
      }

    insertResponse = jobCollection.insert(projectId=projectId, body=jobData).execute()

    # Ping for status until it is done, with a short pause between calls.
    import time
    while True:
      job = jobCollection.get(projectId=projectId,
                                 jobId=insertResponse['jobReference']['jobId']).execute()
      if 'DONE' == job['status']['state']:
          print 'Done Loading!'
          return

      print 'Waiting for loading to complete...'
      time.sleep(10)

    if 'errorResult' in job['status']:
      print 'Error loading table: ', pprint.pprint(job)
      return

  except urllib2.HTTPError as err:
    print 'Error in loadTable: ', pprint.pprint(err.resp)



PROJECT_NUMBER = 'XXXXXXXXX'
SERVICE_ACCOUNT_EMAIL = 'XXXXXXXXX@developer.gserviceaccount.com'

f = file('key.p12', 'rb')
key = f.read()
f.close()

credentials = SignedJwtAssertionCredentials(
    SERVICE_ACCOUNT_EMAIL,
    key,
    scope='https://www.googleapis.com/auth/bigquery')

http = httplib2.Http()
http = credentials.authorize(http)

service = build('bigquery', 'v2')
tables = service.tables()
response = tables.list(projectId=PROJECT_NUMBER, datasetId='person_dataset').execute(http)

print(response)
print("-------------------------------")


loadTable(service, PROJECT_NUMBER, "person_dataset", "person_table")

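When I ask for the list of tables, I appear to be authorized and can view the table details, but I cannot seem to create a table from data imported from Cloud Storage.

This is the output I get in the console: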

No handlers could be found for logger "oauth2client.util"
{u'totalItems': 2, u'tables': [{u'kind': u'bigquery#table', u'id': u'xxx:xxx.xxx', u'tableReference': {u'projectId': u'xxx', u'tableId': u'xxx', u'datasetId': u'xxx'}}, {u'kind': u'bigquery#table', u'id': u'xxx:xxx.yyy', u'tableReference': {u'projectId': u'xxx', u'tableId': u'yyy', u'datasetId': u'xxx'}}], u'kind': u'bigquery#tableList', u'etag': u'"zzzzzzzzzzzzzzzz"'}
Traceback (most recent call last):
  File "test.py", line 96, in <module>
    loadTable(service, PROJECT_NUMBER, "person_dataset", "person_table")
  File "test.py", line 50, in loadTable
    body=jobData).execute()
  File "/usr/local/lib/python2.7/dist-packages/oauth2client-1.2-py2.7.egg/oauth2client/util.py", line 132, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/google_api_python_client-1.2-py2.7.egg/apiclient/http.py", line 723, in execute
    raise HttpError(resp, content, uri=self.uri)
apiclient.errors.HttpError: <HttpError 401 when requesting https://www.googleapis.com/bigquery/v2/projects/xxxxxxxx/jobs?alt=json returned "Login Required">

Can someone tell me what I am doing wrong or point me in the right direction?

Any help would be greatly appreciated.

Thanks, and have a nice day.

score 1 · Accepted Answer

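I'm not a Ruby developer, but I believe that when you call build('bigquery', 'v2') you should pass in the authorized http object. The approach appears to be the same as in Python; a relevant example is here: https://developers.google.com/api-client-library/python/samples/authorized_api_cmd_line_calendar.py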
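
To make that concrete: in the question's script, tables.list() succeeds because the authorized http object is passed explicitly to execute(http), while jobs.insert() calls execute() with no argument on a service that was built without an http object, so that request goes out unauthenticated. Below is a minimal sketch of the fix, assuming the same apiclient and oauth2client packages used in the question. The function names build_authorized_service and insert_job are made up for illustration.

```python
def build_authorized_service(credentials):
    """Return a BigQuery service whose requests all carry the credentials.

    `credentials` is assumed to be an oauth2client credentials object,
    such as the SignedJwtAssertionCredentials built in the question.
    """
    # Imported here so this sketch can be loaded without the client
    # libraries installed; normally these go at the top of the file.
    import httplib2
    from apiclient.discovery import build

    http = credentials.authorize(httplib2.Http())
    # Passing http= makes the authorized object the default transport,
    # so later .execute() calls need no argument.
    return build('bigquery', 'v2', http=http)


def insert_job(service, project_id, job_body):
    """Submit a job using the service's default (authorized) http."""
    request = service.jobs().insert(projectId=project_id, body=job_body)
    return request.execute()
```

With the question's variables this would be used as service = build_authorized_service(credentials), followed by insert_job(service, PROJECT_NUMBER, jobData). Passing http to every execute() call should also work, but binding it once at build time is harder to forget.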

score 0 · Accepted Answer

Thanks for that. Problem solved. If anyone else is interested, see here: How to import a json from a file on cloud storage to Bigquery
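
The linked question is not reproduced here, so the following is only a sketch of what a JSON load job might look like once authentication works. It assumes the source file is newline-delimited JSON and that the service was built with an authorized http object. The helper names json_load_job_body and wait_for_job are hypothetical. It also addresses one thing in the question's polling loop: that loop returns as soon as the state is DONE, so its errorResult check is never reached; here the check happens before returning.

```python
import time


def json_load_job_body(project_id, dataset_id, table_id, source_uri, fields):
    """Build a jobs.insert body that loads newline-delimited JSON from GCS."""
    return {
        'configuration': {
            'load': {
                'sourceUris': [source_uri],
                'sourceFormat': 'NEWLINE_DELIMITED_JSON',
                'schema': {'fields': fields},
                'destinationTable': {
                    'projectId': project_id,
                    'datasetId': dataset_id,
                    'tableId': table_id,
                },
            }
        }
    }


def wait_for_job(service, project_id, job_id, poll_seconds=10,
                 sleep=time.sleep):
    """Poll jobs.get until the job is DONE; raise if it ended in error."""
    while True:
        job = service.jobs().get(projectId=project_id,
                                 jobId=job_id).execute()
        status = job['status']
        if status['state'] == 'DONE':
            # A job can be DONE and still have failed, so check first.
            if 'errorResult' in status:
                raise RuntimeError('Load failed: %s' % status['errorResult'])
            return job
        sleep(poll_seconds)
```

The sleep parameter exists only so the wait can be replaced in tests; in normal use the default time.sleep applies.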
