Whenever I try to take a Cloud Storage bucket object and pass it as input using the method shown on the support site, I get this error:

google.api_core.exceptions.InvalidArgument: 400 The GCS object specified in gcs_content_uri does not exist.

When printed, the GCS reference looks like this:

gs://lang-docs-in/b'doc1.txt'

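I have spent hours trying everything I can think of to get this working (encoding, decoding, and so on), but nothing seems to help. Any ideas?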

main.py

import sys
from google.cloud import language
from google.cloud import storage

storage_client = storage.Client()

DOCUMENT_BUCKET = 'lang-docs-out'

def process_document(data, context):
    # Get file attrs
    bucket = storage_client.get_bucket(data['bucket'])
    blob = bucket.get_blob(data['name'])
    # send to NLP API
    gcs_obj = 'gs://{}/{}'.format(bucket.name, blob.name.decode('utf-8'))
    print('LOOK HERE')
    print(gcs_obj)
    parsed_doc = analyze_document(bucket, blob)
    # Upload the parsed document to the other bucket
    bucket = storage_client.get_bucket(DOCUMENT_BUCKET)
    newblob = bucket.blob('parsed-' + data['name'])
    newblob.upload_from_string(parsed_doc)

def analyze_document(bucket, blob):
    language_client = language.LanguageServiceClient()
    gcs_obj = 'gs://{}/{}'.format(bucket.name, blob.name.decode('utf-8'))
    print(gcs_obj)
    document = language.types.Document(gcs_content_uri=gcs_obj, language='en', type='PLAIN_TEXT')
    response = language_client.analyze_syntax(document=document, encoding_type=get_native_encoding_type())
    return response

def get_native_encoding_type():
    """Returns the encoding type that matches Python's native strings."""
    if sys.maxunicode == 65535:
        return 'UTF16'
    else:
        return 'UTF32'

requirements.txt

google-cloud-storage
google-cloud-language
google-api-python-client
grpcio
grpcio-tools

1 Answer

The name attribute of a google.cloud.storage.blob.Blob instance should already be a string, so you don't need to call .decode() on it at all.
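
As a rough sketch, the URI can be built straight from the names with no decode step. The helper name build_gcs_uri is hypothetical and not part of any Google library; it only formats strings, and it rejects bytes so that a wrong type shows up early instead of leaking into the URI.

```python
def build_gcs_uri(bucket_name, blob_name):
    """Format a gs:// URI from names that are already str (hypothetical helper)."""
    for value in (bucket_name, blob_name):
        # Fail loudly on bytes (or anything else) rather than formatting
        # something like b'doc1.txt' into the URI.
        if not isinstance(value, str):
            raise TypeError('expected str, got {}'.format(type(value).__name__))
    return 'gs://{}/{}'.format(bucket_name, blob_name)


print(build_gcs_uri('lang-docs-in', 'doc1.txt'))
```

In the function from the question, this would be called as build_gcs_uri(bucket.name, blob.name).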

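It looks like you really do have a file named "b'doc1.txt'". That name was created by a problem in whatever added the file to GCS, not by your Cloud Function. In other words, you have: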

>>> blob.name
"b'doc1.txt'"
>>> type(blob.name)
<class 'str'>

rather than:

>>> blob.name
b'doc1.txt'
>>> type(blob.name)
<class 'bytes'>

The two are really hard to tell apart, because they look identical when printed:

>>> print(b'hi')
b'hi'
>>> print("b'hi'")
b'hi'
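
One way to tell them apart is repr(), which shows the quotes around a str. The sketch below also shows one plausible way such a name comes about: calling str() on a bytes value in Python 3 instead of decoding it. How the file was actually uploaded isn't known here, so that part is an assumption.

```python
raw = b'doc1.txt'

wrong_name = str(raw)             # str() on bytes keeps the b'...' wrapper as text
right_name = raw.decode('utf-8')  # decoding gives the actual file name

print(repr(wrong_name))  # "b'doc1.txt'"  (a str that merely looks like bytes)
print(repr(right_name))  # 'doc1.txt'
print(repr(raw))         # b'doc1.txt'    (real bytes)
```

If the uploader does something like the str(raw) line, fixing it there (or renaming the object in the bucket) should make the URI resolve.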
answered 2018-10-15T19:41:01.970