我正在尝试使用 hive 连接器和 Minio 对象存储从 Presto 中选择 S3。我能够创建一个外部表并运行所有 SQL 查询。但是,S3 Select 似乎无法正常工作,即使使用hive.s3select-pushdown.enabled=true
目录文件夹中属性文件中的设置也是如此。我在 Minio 服务器上运行了数据包跟踪,我只看到了 GET/LIST 调用,没有看到任何调用POST /{Key+}?select&select-type=2 HTTP/1.1
。
下面是配置单元属性文件。
hive.metastore.uri=thrift://hadoop-master:9083
hive.s3.path-style-access=true
hive.s3.endpoint=http://X.X.X.X:9000
hive.s3.aws-access-key=minioadmin
hive.s3.aws-secret-key=minioadmin
hive.non-managed-table-writes-enabled=true
hive.storage-format=ORC
hive.s3select-pushdown.enabled=true
我看到从 presto 的 SESSION 参数中设置了相同的设置。
minio.s3_select_pushdown_enabled | true | true
minio.projection_pushdown_enabled | true | true
这就是我从 presto cli 创建外部表的方式。
presto:default> CREATE TABLE nyc_9 ( vendorid VARCHAR, tpep_pickup_datetime VARCHAR, tpep_dropoff_datetime VARCHAR, passenger_count VARCHAR, trip_distance VARCHAR, ratecodeid VARCHAR, store_and_fwd_flag VARCHAR, pulocationid VARCHAR, dolocationid VARCHAR, payment_type VARCHAR, fare_amount VARCHAR, extra VARCHAR, mta_tax VARCHAR, tip_amount VARCHAR, tolls_amount VARCHAR, improvement_surcharge VARCHAR, total_amount VARCHAR) WITH (FORMAT = 'CSV', skip_header_line_count = 1, EXTERNAL_LOCATION = 's3a://test10gb5/');
正在运行的查询
presto:default> SELECT * FROM nyc_9 WHERE trip_distance > '20' AND fare_amount > '10' AND tip_amount > '2' AND passenger_count = '2' LIMIT 10;
vendorid | tpep_pickup_datetime | tpep_dropoff_datetime | passenger_count | trip_distance | ratecodeid | store_and_fwd_flag | pulocationid | dolocationid | payment_type | fare_amount | extra | mta_tax | tip_amount | tolls_amount | improvement_sur
----------+------------------------+------------------------+-----------------+---------------+------------+--------------------+--------------+--------------+--------------+-------------+-------+---------+------------+--------------+----------------
2 | 04/26/2018 08:51:16 AM | 04/26/2018 09:42:03 AM | 2 | 5.06 | 1 | N | 236 | 170 | 1 | 31 | 0 | 0.5 | 6.36 | 0 | 0.3
2 | 04/26/2018 08:14:17 AM | 04/26/2018 08:35:08 AM | 2 | 6.88 | 1 | N | 263 | 45 | 1 | 22 | 0 | 0.5 | 6.84 | 0 | 0.3
1 | 04/26/2018 08:19:47 AM | 04/26/2018 09:17:45 AM | 2 | 9.7 | 1 | N | 138 | 144 | 1 | 39 | 0 | 0.5 | 8 | 0 | 0.3
2 | 04/26/2018 08:38:15 AM | 04/26/2018 09:09:58 AM | 2 | 4.73 | 1 | N | 142 | 144 | 1 | 22 | 0 | 0.5 | 4.56 | 0 | 0.3
2 | 04/26/2018 08:38:26 AM | 04/26/2018 09:22:12 AM | 2 | 5.95 | 1 | N | 239 | 13 | 1 | 29 | 0 | 0.5 | 2.98 | 0 | 0.3
2 | 04/26/2018 08:47:03 AM | 04/26/2018 09:17:02 AM | 2 | 3.27 | 1 | N | 158 | 162 | 1 | 19 | 0 | 0.5 | 3.96 | 0 | 0.3
2 | 04/26/2018 08:21:19 AM | 04/26/2018 08:46:55 AM | 2 | 3.89 | 1 | N | 262 | 107 | 1 | 18.5 | 0 | 0.5 | 3.86 | 0 | 0.3
2 | 04/26/2018 08:35:32 AM | 04/26/2018 09:01:54 AM | 2 | 4.09 | 1 | N | 236 | 137 | 1 | 17.5 | 0 | 0.5 | 3.66 | 0 | 0.3
1 | 04/26/2018 08:43:45 AM | 04/26/2018 09:03:41 AM | 2 | 3 | 1 | N | 163 | 145 | 1 | 15 | 0 | 0.5 | 6 | 0 | 0.3
1 | 04/26/2018 08:01:47 AM | 04/26/2018 08:13:08 AM | 2 | 3.1 | 1 | N | 264 | 137 | 1 | 12 | 0 | 0.5 | 2.55 | 0 | 0.3
(10 rows)
要使 S3 Select 正常工作,还需要做些什么吗?