I'm using the load_table_from_uri() method of bigquery.Client() in the following code (adapted from this tutorial), and it creates a native table:

from google.cloud import bigquery

def main():
    ''' Load all tables '''
    client = bigquery.Client()
    bq_load_file_in_gcs(
        client,
        'gs://bucket_name/data100rows.csv',
        'CSV',
        'test_data.data100_csv_native'
    )

def bq_load_file_in_gcs(client, path, fmt, table_name):
    '''
        Load a BigQuery table from Google Cloud Storage.

        client - bigquery client
        path - 'gs://path/to/upload.file'
        fmt - The format of the data files: "CSV" / "NEWLINE_DELIMITED_JSON".
              https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs#configuration.load.sourceFormat
        table_name - destination table name, including the dataset
    '''

    job_config = bigquery.LoadJobConfig()
    job_config.autodetect = True
    job_config.skip_leading_rows = 1
    job_config.source_format = fmt

    load_job = client.load_table_from_uri(
        path,
        table_name,
        job_config=job_config
    )

    assert load_job.job_type == 'load'

    load_job.result()  # Waits for table load to complete.

    assert load_job.state == 'DONE'

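I also need to be able to create external tables, as I can in the BigQuery UI:

[Screenshot of the BigQuery UI]

But I can't find where to set the table type, either in the job configuration or in the method arguments. Is this possible, and if so, how?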

1 Answer

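There is an example in the external configuration section of the documentation.

Basically, you need to set an external configuration on the table object, for example: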

table = bigquery.Table(.........)

external_config = bigquery.ExternalConfig('CSV')
source_uris = ['<url-to-your-external-source>']  # e.g. a CSV file in a Cloud Storage bucket

external_config.source_uris = source_uris
table.external_data_configuration = external_config
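
The snippet above only builds the table object locally; as far as I can tell, the table still has to be created with client.create_table(). Below is a minimal sketch of a counterpart to bq_load_file_in_gcs() from the question, assuming a reasonably recent google-cloud-bigquery. The names qualify_table_id and bq_create_external_table are hypothetical, and the autodetect and skip_leading_rows settings simply mirror the load job in the question. The helper assumes the project ID contains no dots.

```python
def qualify_table_id(table_name, project):
    """Return 'project.dataset.table' for 'dataset.table' or an already full ID.

    Hypothetical helper: bigquery.Table() expects a fully qualified ID,
    while the question passes 'dataset.table'.
    """
    parts = table_name.split('.')
    if len(parts) == 2:
        return f'{project}.{table_name}'
    if len(parts) == 3:
        return table_name
    raise ValueError(
        f'expected dataset.table or project.dataset.table, got {table_name!r}'
    )


def bq_create_external_table(client, path, fmt, table_name):
    """Create an external table over the file at `path` without loading it."""
    # Imported lazily so the helper above works without the client library.
    from google.cloud import bigquery

    external_config = bigquery.ExternalConfig(fmt)
    external_config.source_uris = [path]
    external_config.autodetect = True  # assumption: same as the load job
    if fmt == 'CSV':
        # For CSV sources, .options is a CSVOptions object.
        external_config.options.skip_leading_rows = 1

    table = bigquery.Table(qualify_table_id(table_name, client.project))
    table.external_data_configuration = external_config

    # No load job is involved: this is a single table-creation API call.
    return client.create_table(table)
```

It would be called the same way as the loader in the question, for example bq_create_external_table(client, 'gs://bucket_name/data100rows.csv', 'CSV', 'test_data.data100_csv_external'), where the table name is made up for illustration.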
Answered 2018-11-29T20:23:30.913