hostA has MySQL (port 3306), Hive (port 10000), and the Hive Metastore (port 9083) installed and running. hostB has Presto installed and running.

The goal is for Presto on hostB to query tables through the Hive Metastore on hostA.

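The error below occurs. /home/ec2-user/warehouse/contact does exist on hostA's local file system (not HDFS or S3), and the table is partitioned, but the path does not exist on hostB. Why does Presto look for the Hive partition on the local host where Presto runs (hostB) instead of on hostA, where the Hive Metastore lives? The Metastore connection itself is established, since Presto is able to list the tables in the Metastore.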

presto-cli --debug --catalog hive --schema default
presto:default> show tables;
           Table
----------------------------
 account
 contact
(2 rows)

Query 20171102_122934_00012_x6ppj, FINISHED, 2 nodes
http://localhost:8080/query.html?20171102_122934_00012_x6ppj
Splits: 18 total, 18 done (100.00%)
CPU Time: 0.0s total,   615 rows/s, 18.8KB/s, 5% active
Per Node: 0.0 parallelism,     8 rows/s,   280B/s
Parallelism: 0.0
0:00 [8 rows, 250B] [17 rows/s, 560B/s]

presto:default> select * from contact;
Query 20171102_122943_00013_x6ppj failed: Partition location does not exist: file:/home/ec2-user/warehouse/contact
com.facebook.presto.spi.PrestoException: Partition location does not exist: file:/home/ec2-user/warehouse/contact
        at com.facebook.presto.hive.util.HiveFileIterator.computeNext(HiveFileIterator.java:102)
        at com.facebook.presto.hive.util.HiveFileIterator.computeNext(HiveFileIterator.java:41)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:145)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:140)
        at com.facebook.presto.hive.BackgroundHiveSplitLoader.loadSplits(BackgroundHiveSplitLoader.java:243)
        at com.facebook.presto.hive.BackgroundHiveSplitLoader.access$300(BackgroundHiveSplitLoader.java:92)
        at com.facebook.presto.hive.BackgroundHiveSplitLoader$HiveSplitLoaderTask.process(BackgroundHiveSplitLoader.java:195)
        at com.facebook.presto.hive.util.ResumableTasks.safeProcessTask(ResumableTasks.java:45)
        at com.facebook.presto.hive.util.ResumableTasks.lambda$submit$1(ResumableTasks.java:33)
        at io.airlift.concurrent.BoundedExecutor.drainQueue(BoundedExecutor.java:78)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
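
One observation about the location in the error message: it uses the file: scheme and has no host component. The Python sketch below is only an illustration, not Presto code. It splits the reported location into its URI parts and compares it with a made-up HDFS location; the hdfs:// URI, its port, and the helper names are assumptions introduced for contrast.

```python
from urllib.parse import urlsplit


def split_location(location):
    """Return (scheme, authority, path) for a table or partition location URI."""
    parts = urlsplit(location)
    return parts.scheme, parts.netloc, parts.path


def names_remote_host(location):
    """Return True when the URI carries an authority (host[:port]) component."""
    return split_location(location)[1] != ""


# Location copied verbatim from the error message above.
reported = "file:/home/ec2-user/warehouse/contact"

# Hypothetical HDFS location, for contrast only; the port and path are made up.
hypothetical = "hdfs://hostA:8020/warehouse/contact"

for uri in (reported, hypothetical):
    scheme, authority, path = split_location(uri)
    print(f"{uri}: scheme={scheme!r} authority={authority!r} path={path!r} "
          f"names_host={names_remote_host(uri)}")
```

The reported location has an empty authority, so nothing in it refers to hostA. Any process that opens such a path would presumably resolve it against its own local file system, which is consistent with the lookup happening on hostB.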



cat config.properties
coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
discovery-server.enabled=true
# discovery.uri=http://example.net:8080
discovery.uri=http://hostB:8080

cat hive.properties
connector.name=hive-hadoop2
hive.metastore.uri=thrift://hostA:9083



2017-11-02T06:52:30.585Z        INFO    main    com.facebook.presto.metadata.StaticCatalogStore -- Loading catalog etc/catalog/hive.properties --
2017-11-02T06:52:31.307Z        INFO    main    Bootstrap       PROPERTY                                           DEFAULT     RUNTIME                        DESCRIPTION
2017-11-02T06:52:31.307Z        INFO    main    Bootstrap       hive.allow-corrupt-writes-for-testing              false       false                          Allow Hive connector to write data even when data will likely be corrupt
2017-11-02T06:52:31.307Z        INFO    main    Bootstrap       hive.assume-canonical-partition-keys               false       false
2017-11-02T06:52:31.307Z        INFO    main    Bootstrap       hive.bucket-execution                              true        true                           Enable bucket-aware execution: only use a single worker per bucket
2017-11-02T06:52:31.307Z        INFO    main    Bootstrap       hive.bucket-writing                                true        true                           Enable writing to bucketed tables
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.dfs.connect.max-retries                       5           5
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.dfs.connect.timeout                           500.00ms    500.00ms
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.dfs-timeout                                   60.00s      60.00s
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.domain-compaction-threshold                   100         100                            Maximum ranges to allow in a tuple domain without compacting it
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.dfs.domain-socket-path                        null        null
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.fs.cache.max-size                             1000        1000                           Hadoop FileSystem cache size
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.force-local-scheduling                        false       false
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.hdfs.authentication.type                      NONE        NONE                           HDFS authentication type
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.hdfs.impersonation.enabled                    false       false                          Should Presto user be impersonated when communicating with HDFS
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.compression-codec                             GZIP        GZIP
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore.authentication.type                 NONE        NONE                           Hive Metastore authentication type
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.storage-format                                RCBINARY    RCBINARY
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.immutable-partitions                          false       false                          Can new data be inserted into existing partitions or existing unpartitioned tables
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.dfs.ipc-ping-interval                         10.00s      10.00s
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.max-concurrent-file-renames                   20          20
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.max-initial-split-size                        32MB        32MB
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.max-initial-splits                            200         200
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore-refresh-max-threads                 100         100
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.max-outstanding-splits                        1000        1000
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore.partition-batch-size.max            100         100
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.max-partitions-per-scan                       100000      100000                         Maximum allowed partitions for a single table scan
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.max-partitions-per-writers                    100         100                            Maximum number of partitions per writer
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.max-split-iterator-threads                    1000        1000
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.max-split-size                                64MB        64MB
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore-cache-maximum-size                  10000       10000
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore-cache-ttl                           0.00s       0.00s
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore-refresh-interval                    0.00s       0.00s
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore.thrift.client.socks-proxy           null        null
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore-timeout                             10.00s      10.00s
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.metastore.partition-batch-size.min            10          10
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.orc.bloom-filters.enabled                     false       false
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.orc.default-bloom-filter-fpp                  0.05        0.05                           ORC Bloom filter false positive probability
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.orc.max-buffer-size                           8MB         8MB
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.orc.max-merge-distance                        1MB         1MB
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.orc.max-read-block-size                       16MB        16MB
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.orc.optimized-writer.enabled                  false       false
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.orc.stream-buffer-size                        8MB         8MB
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.parquet-optimized-reader.enabled              false       false
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.parquet-predicate-pushdown.enabled            false       false
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.per-transaction-metastore-cache-maximum-size  1000        1000
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.rcfile-optimized-writer.enabled               true        true
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.rcfile.writer.validate                        false       false                          Validate RCFile after write by re-reading the whole file
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.recursive-directories                         false       false
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.config.resources                              null        null
2017-11-02T06:52:31.309Z        INFO    main    Bootstrap       hive.respect-table-format                          true        true                           Should new partitions be written using the existing table format or the default Presto format
2017-11-02T06:52:31.310Z        INFO    main    Bootstrap       hive.skip-deletion-for-alter                       false       false                          Skip deletion of old partition data when a partition is deleted and then inserted in the same transaction
2017-11-02T06:52:31.310Z        INFO    main    Bootstrap       hive.table-statistics-enabled                      true        true                           Enable use of table statistics
2017-11-02T06:52:31.310Z        INFO    main    Bootstrap       hive.time-zone                                     Zulu        Zulu
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.orc.use-column-names                          false       false                          Access ORC columns using names from the file
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.parquet.use-column-names                      false       false                          Access Parquet columns using names from the file
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.dfs.verify-checksum                           true        true
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.write-validation-threads                      16          16                             Number of threads used for verifying data after a write
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.non-managed-table-writes-enabled              false       false                          Enable writes to non-managed (external) tables
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.pin-client-to-current-region               false       false                          Should the S3 client be pinned to the current EC2 region
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.aws-access-key                             null        null
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.aws-secret-key                             [REDACTED]  [REDACTED]
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.connect-timeout                            5.00s       5.00s
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.encryption-materials-provider              null        null                           Use a custom encryption materials provider for S3 data encryption
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.endpoint                                   null        null
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.kms-key-id                                 null        null                           Use an AWS KMS key for S3 data encryption
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.max-backoff-time                           10.00m      10.00m
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.max-client-retries                         5           5
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.max-connections                            500         500
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.max-error-retries                          10          10
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.max-retry-time                             10.00m      10.00m
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.multipart.min-file-size                    16MB        16MB                           Minimum file size for an S3 multipart upload
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.multipart.min-part-size                    5MB         5MB                            Minimum part size for an S3 multipart upload
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.signer-type                                null        null
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.socket-timeout                             5.00s       5.00s
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.sse.enabled                                false       false                          Enable S3 server side encryption
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.sse.kms-key-id                             null        null                           KMS Key ID to use for S3 server-side encryption with KMS-managed key
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.sse.type                                   S3          S3                             Key management type for S3 server-side encryption (S3 or KMS)
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.ssl.enabled                                true        true
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.staging-directory                          /tmp        /tmp                           Temporary directory for staging files before uploading to S3
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.use-instance-credentials                   true        true
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.s3.user-agent-prefix                                                                     The user agent prefix to use for S3 calls
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.metastore.uri                                 null        [thrift://hostA:9083]  Hive metastore URIs (comma separated)
2017-11-02T06:52:31.311Z        INFO    main    Bootstrap       hive.metastore                                     thrift      thrift
2017-11-02T06:52:31.312Z        INFO    main    Bootstrap       hive.allow-add-column                              false       false                          Allow Hive connector to add column
2017-11-02T06:52:31.312Z        INFO    main    Bootstrap       hive.allow-drop-column                             false       false                          Allow Hive connector to drop column
2017-11-02T06:52:31.312Z        INFO    main    Bootstrap       hive.allow-drop-table                              false       false                          Allow Hive connector to drop table
2017-11-02T06:52:31.312Z        INFO    main    Bootstrap       hive.allow-rename-column                           false       false                          Allow Hive connector to rename column
2017-11-02T06:52:31.312Z        INFO    main    Bootstrap       hive.allow-rename-table                            false       false                          Allow Hive connector to rename table
2017-11-02T06:52:31.312Z        INFO    main    Bootstrap       hive.security                                      legacy      legacy
2017-11-02T06:52:31.312Z        INFO    main    Bootstrap
2017-11-02T06:52:32.663Z        INFO    main    com.facebook.presto.metadata.StaticCatalogStore -- Added catalog hive using connector hive-hadoop2 --