I have two external Hive tables, shown below. I have already populated them with data from Oracle using Sqoop.

create external table transaction_usa
(
tran_id int,
acct_id int,
tran_date string,
amount double,
description string,
branch_code string,
tran_state string,
tran_city string,
speendby string,
tran_zip int
)
row format delimited
stored as textfile
location '/user/stg/bank_stg/tran_usa';

create external table transaction_canada
(
tran_id int,
acct_id int,
tran_date string,
amount double,
description string,
branch_code string,
tran_state string,
tran_city string,
speendby string,
tran_zip int
)
row format delimited
stored as textfile
location '/user/stg/bank_stg/tran_canada';

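Now I want to merge the data from these two tables into a single external Hive table. It has all the same fields as the two tables above, plus one extra column, source_table, that identifies which table each row came from. The new external table is defined as follows.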

create external table transaction_usa_canada
(
tran_id int,
acct_id int,
tran_date string,
amount double,
description string,
branch_code string,
tran_state string,
tran_city string,
speendby string,
tran_zip int,
source_table string
)
row format delimited
stored as textfile
location '/user/gds/bank_ds/tran_usa_canada';

How can I do this?

3 Answers

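Run a SELECT against each table, adding a literal value for the source_table column, combine the two result sets with UNION ALL, and insert the combined result into your third table.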

Here is the final Hive query:

INSERT INTO TABLE transaction_usa_canada
SELECT tran_id, acct_id, tran_date, amount, description, branch_code, tran_state, tran_city, speendby, tran_zip, 'transaction_usa' AS source_table FROM transaction_usa
UNION ALL
SELECT tran_id, acct_id, tran_date, amount, description, branch_code, tran_state, tran_city, speendby, tran_zip, 'transaction_canada' AS source_table FROM transaction_canada;
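
If the top-level UNION ALL above is rejected, that may be because Hive releases before 0.13.0 only accept UNION ALL inside a subquery. The following is a minimal sketch of the same insert in that form, assuming the table definitions from the question; the alias u and the row-count check at the end are illustrative additions, not part of the original answer.

```sql
-- Same insert as above, with the UNION ALL wrapped in a subquery.
-- Assumption: an older Hive release that needs the subquery form.
INSERT INTO TABLE transaction_usa_canada
SELECT u.*
FROM (
  SELECT tran_id, acct_id, tran_date, amount, description, branch_code,
         tran_state, tran_city, speendby, tran_zip,
         'transaction_usa' AS source_table
  FROM transaction_usa
  UNION ALL
  SELECT tran_id, acct_id, tran_date, amount, description, branch_code,
         tran_state, tran_city, speendby, tran_zip,
         'transaction_canada' AS source_table
  FROM transaction_canada
) u;

-- Optional sanity check: how many rows came from each source table.
SELECT source_table, COUNT(*) AS row_count
FROM transaction_usa_canada
GROUP BY source_table;
```

The per-source counts should match SELECT COUNT(*) on each staging table if the insert ran exactly once.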

Hope this helps!

answered 2016-05-18T12:50:25.090

You can also do this with manual partitioning, using one partition per source table.

CREATE TABLE transaction_new_table (
tran_id int,
acct_id int,
tran_date string,
amount double,
description string,
branch_code string,
tran_state string,
tran_city string,
speendby string,
tran_zip int
)
PARTITIONED BY (sourcetablename String);

Then run the command below once for each source, with the matching HDFS path and partition value:

load data inpath 'hdfspath' into table transaction_new_table partition(sourcetablename='1');
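
Note that LOAD DATA INPATH moves the files out of their original HDFS location, so the two staging tables would no longer see their data. Since the question uses external tables, one possible alternative is to register the existing staging directories as partitions of an external table instead. This is only a sketch: the table name transaction_by_source and its LOCATION are hypothetical, and it assumes the staging files use the same delimiters as the new table.

```sql
-- Hypothetical external table, partitioned by source.
-- Columns are copied from the question; name and LOCATION are assumptions.
CREATE EXTERNAL TABLE transaction_by_source (
  tran_id int,
  acct_id int,
  tran_date string,
  amount double,
  description string,
  branch_code string,
  tran_state string,
  tran_city string,
  speendby string,
  tran_zip int
)
PARTITIONED BY (source_table string)
ROW FORMAT DELIMITED
STORED AS TEXTFILE
LOCATION '/user/gds/bank_ds/tran_by_source';  -- hypothetical path

-- Point each partition at the directory Sqoop already populated.
-- No data is copied or moved; the staging tables keep working.
ALTER TABLE transaction_by_source
  ADD PARTITION (source_table = 'transaction_usa')
  LOCATION '/user/stg/bank_stg/tran_usa';

ALTER TABLE transaction_by_source
  ADD PARTITION (source_table = 'transaction_canada')
  LOCATION '/user/stg/bank_stg/tran_canada';
```

With this layout the source_table value comes from the partition rather than from the files, so a filter such as WHERE source_table = 'transaction_usa' only reads that one directory.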
answered 2016-05-18T12:23:39.240

You can use Hive's INSERT INTO clause, once per source table:

INSERT INTO TABLE transaction_usa_canada
SELECT tran_id, acct_id, tran_date, ..., 'transaction_usa' FROM transaction_usa;

INSERT INTO TABLE transaction_usa_canada
SELECT tran_id, acct_id, tran_date, ..., 'transaction_canada' FROM transaction_canada;
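
One thing to watch with two separate INSERT INTO statements is that rerunning them appends the same rows again. A possible variant, sketched here with the full column list from the question, uses INSERT OVERWRITE for the first statement so that a rerun starts from an empty table; whether overwriting is acceptable depends on how the target table is used, which the question does not say.

```sql
-- First statement replaces whatever is already in the target table.
INSERT OVERWRITE TABLE transaction_usa_canada
SELECT tran_id, acct_id, tran_date, amount, description, branch_code,
       tran_state, tran_city, speendby, tran_zip,
       'transaction_usa' AS source_table
FROM transaction_usa;

-- Second statement appends the rows from the other source.
INSERT INTO TABLE transaction_usa_canada
SELECT tran_id, acct_id, tran_date, amount, description, branch_code,
       tran_state, tran_city, speendby, tran_zip,
       'transaction_canada' AS source_table
FROM transaction_canada;
```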
answered 2016-05-18T12:31:13.503