
I have a Snappy-compressed Parquet file in a local directory: /home/hive/part-00000-52d40ae4-92cd-414c-b4f7-bfa795ee65c8-c000.snappy.parque

When I create an external Hive table with the following statement, the statement executes without error, but running select * from parquet_hive123456789 returns no rows.

CREATE EXTERNAL TABLE parquet_hive123456789 (
  `ip` string,
  `request` string,
  `status` string,
  `userid` string,
  `bytes` string,
  `agent` string,
  `timestamp` timestamp
) STORED AS PARQUET
LOCATION '/home/hive/';

With parquet-tools I can see the contents of the file.

parquet-tools show part-00000-52d40ae4-92cd-414c-b4f7-bfa795ee65c8-c000.snappy.parquet

+-----------------+-------------------------------------+----------+----------+---------+---------------------------------------------------------------------------------------------------------------------+-------------+
| ip              | request                             |   status |   userid |   bytes | agent                                                                                                               | timestamp   |
|-----------------+-------------------------------------+----------+----------+---------+---------------------------------------------------------------------------------------------------------------------+-------------|
| 222.203.236.146 | GET /site/user_status.html HTTP/1.1 |      405 |       13 |   14096 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 | NaT         |
| 122.152.45.245  | GET /site/login.html HTTP/1.1       |      407 |        5 |     278 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 | NaT         |
| 222.152.45.45   | GET /site/user_status.html HTTP/1.1 |      302 |       22 |    4096 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 | NaT         |
| 222.245.174.248 | GET /index.html HTTP/1.1            |      404 |        7 |   14096 | Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)                                            | NaT         |
| 122.173.165.203 | GET /index.html HTTP/1.1            |      200 |       39 |     278 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 | NaT         |
| 122.168.57.222  | GET /images/logo-small.png HTTP/1.1 |      404 |        2 |   14096 | Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)                                            | NaT         |
| 122.152.45.245  | GET /images/track.png HTTP/1.1      |      405 |        5 |     278 | Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)                                            | NaT         |
| 122.173.165.203 | GET /site/user_status.html HTTP/1.1 |      407 |       39 |   14096 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 | NaT         |
| 222.245.174.248 | GET /images/track.png HTTP/1.1      |      302 |        7 |     278 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 | NaT         |
| 122.173.165.203 | GET /site/user_status.html HTTP/1.1 |      200 |       39 |   14096 | Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 | NaT         |
+-----------------+-------------------------------------+----------+----------+---------+---------------------------------------------------------------------------------------------------------------------+-------------+

Can anyone help?


1 Answer


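LOCATION should be an HDFS directory, not a local one. A directory such as '/home/hive' may also exist in HDFS, but using it as a table location is a bad idea. The location should be specific to the table, because each table's data should live in its own directory, separate from other tables. A table directory typically looks like /user/hadoop/mytable, where mytable is the table name.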

Put the file into an HDFS directory, for example like this (use your own HDFS path):

hdfs dfs -put /home/hive/part-00000-52d40ae4-92cd-414c-b4f7-bfa795ee65c8-c000.snappy.parque /user/hadoop/table_dir/

Check that the file exists in HDFS (use your own HDFS path):

hdfs dfs -ls '/user/hadoop/table_dir/'
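
The upload and the check can be combined into one small script. The following is only a sketch: the `run` and `stage_file` helpers and the `DRY_RUN` switch are hypothetical names introduced here, the local path assumes the file name shown in the parquet-tools command from the question, and the `hdfs dfs -mkdir -p` step is an addition that assumes the target directory may not exist yet. By default the script only prints the commands; set `DRY_RUN=0` to actually run them.

```shell
#!/usr/bin/env bash
# Sketch: stage a local Parquet file in a table-specific HDFS directory.
# Both paths are examples; replace them with your own.
LOCAL_FILE='/home/hive/part-00000-52d40ae4-92cd-414c-b4f7-bfa795ee65c8-c000.snappy.parquet'
TABLE_DIR='/user/hadoop/table_dir/'

# Hypothetical helper: print the command unless DRY_RUN=0 is set,
# so the script can be reviewed on a machine without a Hadoop client.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Hypothetical helper. $1: local file, $2: HDFS target directory.
stage_file() {
  # Create the table directory if it is missing.
  run hdfs dfs -mkdir -p "$2" || return 1
  # Copy the local file into HDFS.
  run hdfs dfs -put "$1" "$2" || return 1
  # List the directory to confirm the file arrived.
  run hdfs dfs -ls "$2"
}

stage_file "$LOCAL_FILE" "$TABLE_DIR"
```

Keeping one directory per table, as here, means the table sees only its own files.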

Then create the table (EXTERNAL or MANAGED, it does not matter in this context) using the location in HDFS: '/user/hadoop/table_dir/'
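
As an illustration, the statement from the question can be regenerated with the HDFS directory as its LOCATION. This is a sketch, not a definitive script: `build_ddl` is a hypothetical helper, the column list is copied unchanged from the question, and the path is the example path used above.

```shell
#!/usr/bin/env bash
# Sketch: print the CREATE TABLE statement with an HDFS location.
# TABLE_DIR is the example path from the answer; replace it with your own.
TABLE_DIR='/user/hadoop/table_dir/'

# Hypothetical helper. $1: HDFS directory that holds the Parquet file(s).
build_ddl() {
  printf '%s\n' \
    'CREATE EXTERNAL TABLE parquet_hive123456789 (' \
    '  `ip` string,' \
    '  `request` string,' \
    '  `status` string,' \
    '  `userid` string,' \
    '  `bytes` string,' \
    '  `agent` string,' \
    '  `timestamp` timestamp' \
    ') STORED AS PARQUET' \
    "LOCATION '$1';"
}

# Print the statement for review.
build_ddl "$TABLE_DIR"
```

Assuming the hive command-line client is available, the printed statement could be executed with `hive -e "$(build_ddl "$TABLE_DIR")"`. If a table with this name already exists from the earlier attempt, it would need to be dropped first.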

Alternatively, you can create the table first and then load the local file into it with the LOAD DATA LOCAL INPATH command, as described in this answer.
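
A minimal sketch of that alternative follows. `build_load` is a hypothetical helper, and the local path assumes the file name shown in the parquet-tools command from the question. It also assumes the table has already been created, for example with the same column list and STORED AS PARQUET but without a LOCATION clause.

```shell
#!/usr/bin/env bash
# Sketch: print a LOAD DATA statement that copies a local file into a table.
# Both values are examples; replace them with your own.
LOCAL_FILE='/home/hive/part-00000-52d40ae4-92cd-414c-b4f7-bfa795ee65c8-c000.snappy.parquet'
TABLE_NAME='parquet_hive123456789'

# Hypothetical helper. $1: local file path, $2: target table name.
build_load() {
  printf "LOAD DATA LOCAL INPATH '%s' INTO TABLE %s;\n" "$1" "$2"
}

# Print the statement for review.
build_load "$LOCAL_FILE" "$TABLE_NAME"
```

Assuming the hive command-line client is available, the statement could be run with `hive -e "$(build_load "$LOCAL_FILE" "$TABLE_NAME")"`. With LOCAL, Hive copies the file from the local filesystem into the table's directory, so no separate hdfs dfs -put step is needed.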

answered 2021-03-30T20:15:57.773