The GCP Python docs have a script with the following function:

from google.cloud import storage

def upload_pyspark_file(project_id, bucket_name, filename, file):
    """Uploads the PySpark file in this directory to the configured
    input bucket."""
    print('Uploading pyspark file to GCS')
    client = storage.Client(project=project_id)
    bucket = client.get_bucket(bucket_name)
    blob = bucket.blob(filename)
    blob.upload_from_file(file)

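I've written an argument-parsing function in my script that accepts multiple arguments (file names) to upload to a GCS bucket. I'm trying to adapt the function above to take those multiple arguments and upload the files, but I'm not sure how to proceed. My confusion is with the "filename" and "file" variables above. How can I adapt the function for my purpose?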

1 Answer

I don't suppose you're still looking for something like this?

from google.cloud import storage
import os

files = os.listdir('data-files')
client = storage.Client.from_service_account_json('cred.json')
bucket = client.get_bucket('xxxxxx')


def upload_pyspark_file(filename, file):
    """Upload the local file at path `file` to the bucket as `filename`."""
    # The client and bucket are created once at module level above.
    print('Uploading from', file, 'to', filename)
    blob = bucket.blob(filename)
    # upload_from_filename takes a local path; upload_from_file expects an
    # open file object, so passing a path string to it would fail.
    blob.upload_from_filename(file)


for f in files:
    # os.path.join keeps the path portable instead of hard-coding "\\".
    upload_pyspark_file(f, os.path.join('data-files', f))

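As you may have guessed, the difference between file and filename is source versus destination: file is the local source file, and filename is the name the object gets in the bucket.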
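Since the question is about file names coming from an argument parser, here is a rough sketch of how that could be wired up. The names parse_args, upload_files and the --bucket flag are made up for illustration and are not from the GCP docs; the sketch also assumes each object should be stored under the base name of its local path and that default credentials are configured.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse a bucket name and one or more local file paths."""
    parser = argparse.ArgumentParser(description="Upload files to a GCS bucket")
    # --bucket is a hypothetical flag name chosen for this sketch.
    parser.add_argument("--bucket", required=True, help="destination bucket name")
    # nargs="+" collects one or more positional values into a list.
    parser.add_argument("files", nargs="+", help="local files to upload")
    return parser.parse_args(argv)


def upload_files(bucket, paths):
    """Upload each local path to the bucket under its base name."""
    uploaded = []
    for path in paths:
        filename = os.path.basename(path)  # destination object name (assumption)
        blob = bucket.blob(filename)
        blob.upload_from_filename(path)    # takes a local path, not a file object
        uploaded.append(filename)
    return uploaded


def main(argv=None):
    # Imported here so the helpers above can be exercised without the library.
    from google.cloud import storage

    args = parse_args(argv)
    client = storage.Client()  # assumes default credentials are configured
    bucket = client.get_bucket(args.bucket)
    upload_files(bucket, args.files)
```

Calling main() from a script and running it as "python upload.py --bucket my-bucket a.py b.py" (script and bucket names are placeholders) would then upload both files. Passing the bucket into upload_files rather than creating it inside keeps the client setup in one place, which is the same idea as the module-level bucket in the answer above.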

Answered 2017-11-28T23:02:38.630