
Overview

I deployed a Presto cluster on Kubernetes and am trying to connect it to an S3 bucket through an S3-compatible API.

How to reproduce

I am using the Presto Operator (provided by Starburst) and configured my Presto resource with the following properties:

  hive:
    additionalProperties: |
      connector.name=hive-hadoop2
      hive.metastore.uri=thrift://hive-metastore-presto-cluster-name.default.svc.cluster.local:9083
      hive.s3.endpoint=https://<s3-endpoint>

I also added an s3-secret containing AWS_ACCESS_KEY_ID: **** and AWS_SECRET_ACCESS_KEY: ****.
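
For context, the s3-secret was created with a manifest roughly like this (values redacted; the exact namespace and any extra keys the Operator expects are omitted here):

  apiVersion: v1
  kind: Secret
  metadata:
    name: s3-secret
  type: Opaque
  stringData:
    AWS_ACCESS_KEY_ID: "****"
    AWS_SECRET_ACCESS_KEY: "****"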

Then I create a schema with presto-cli, using the following command:

CREATE SCHEMA ban WITH (LOCATION = 's3a://path/to/schema');

Error

This raises an error in the Hive Metastore, such as:

Got exception: org.apache.hadoop.fs.s3a.AWSBadRequestException doesBucketExist on presto: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request

Problem identification

At this link I found the following explanation: "This happens when trying to work with any S3 service which only supports the 'V4' signing API, but the client is configured to use the default S3 service endpoint."

The suggested fix is: "The S3A client needs to be given the endpoint to use via the fs.s3a.endpoint property."

However, the fs.s3a.endpoint property is not accepted in Presto's Hive catalog configuration.
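
One detail worth noting: the doesBucketExist call in the stack trace originates from the Hive Metastore itself, not from Presto, so my guess is that the S3A endpoint would need to be set in the metastore's own core-site.xml rather than in the Presto catalog properties. A sketch of what I believe that would look like (the endpoint value is a placeholder for the same redacted endpoint used in hive.s3.endpoint above, and path-style access is only a guess that is often needed for non-AWS S3 services):

  <configuration>
    <!-- Point the S3A client at the custom endpoint so V4 request
         signing targets the right host instead of the AWS default -->
    <property>
      <name>fs.s3a.endpoint</name>
      <value>https://s3.example.internal</value>
    </property>
    <!-- Many S3-compatible services require path-style access -->
    <property>
      <name>fs.s3a.path.style.access</name>
      <value>true</value>
    </property>
  </configuration>

I have not been able to verify this, since I do not control the metastore's Hadoop configuration through the Operator.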

Has anyone run into this issue?

Useful details

  • Kubernetes version: 1.18
  • Presto Operator 镜像:starburstdata/presto-operator:341-e-k8s-0.35

-- EDIT --

Full stack trace

2020-09-29T12:20:34,120 ERROR [pool-7-thread-2] utils.MetaStoreUtils: Got exception: org.apache.hadoop.fs.s3a.AWSBadRequestException doesBucketExist on presto: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: txc7f0218a02574ba59ec91-005f732692; S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692), S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692:400 Bad Request: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: txc7f0218a02574ba59ec91-005f732692; S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692)
org.apache.hadoop.fs.s3a.AWSBadRequestException: doesBucketExist on presto: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: txc7f0218a02574ba59ec91-005f732692; S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692), S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692:400 Bad Request: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: txc7f0218a02574ba59ec91-005f732692; S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692)
        at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:212) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:111) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:317) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:256) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:231) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:372) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:308) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3303) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:115) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:141) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:147) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.hadoop.hive.metastore.Warehouse.determineDatabasePath(Warehouse.java:190) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database_core(HiveMetaStore.java:1265) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:1425) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_232]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_232]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_232]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_232]
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at com.sun.proxy.$Proxy26.create_database(Unknown Source) [?:?]
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_database.getResult(ThriftHiveMetastore.java:14861) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_database.getResult(ThriftHiveMetastore.java:14845) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:104) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_232]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_232]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: txc7f0218a02574ba59ec91-005f732692; S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1337) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1277) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists$1(S3AFileSystem.java:373) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        ... 33 more