So I have these specific columns that I am working with: customer_token, merchant_id, merchant_category_code, transaction_amount

My current query looks like this:

SELECT customer_token, COUNT(transaction_amount), SUM(transaction_amount)
FROM transaction
WHERE file_date > 20121031
  AND file_date < 20121201
GROUP BY customer_token

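I want to add a part to the query above so that, in the results, the transaction amounts are split into separate columns, one per specific merchant_category_code. The result would look like this: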

customer_token, count(transaction_amount), sum(transaction_amount), count(transaction_amount in the merchant_category_code ranked 1st), count(transaction_amount in the merchant_category_code ranked 2nd), count(transaction_amount in the merchant_category_code ranked 3rd), etc...

And then this:

customer_token, count(transaction_amount), sum(transaction_amount), sum(transaction_amount in the merchant_category_code ranked 1st), sum(transaction_amount in the merchant_category_code ranked 2nd), sum(transaction_amount in the merchant_category_code ranked 3rd), etc...

But I don't know how to do this, or whether it is even possible.

1 Answer

If you know in advance what the possible values of merchant_category_code are, you can use CASE expressions:

SELECT customer_token,
       COUNT(transaction_amount),
       SUM(transaction_amount),
       COUNT(CASE WHEN merchant_category_code = 1 THEN transaction_amount END),
       COUNT(CASE WHEN merchant_category_code = 2 THEN transaction_amount END),
       COUNT(CASE WHEN merchant_category_code = 3 THEN transaction_amount END),
       ...
       SUM(CASE WHEN merchant_category_code = 1 THEN transaction_amount END),
       SUM(CASE WHEN merchant_category_code = 2 THEN transaction_amount END),
       SUM(CASE WHEN merchant_category_code = 3 THEN transaction_amount END),
       ...
  FROM transaction 
 WHERE file_date BETWEEN 20121101 AND 20121130
 GROUP
    BY customer_token
;

(Or IF expressions, if you prefer; for documentation on both, see the section titled "Conditional Functions" on the "LanguageManual+UDF" page of the Hive wiki.)
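
For comparison, here is a rough sketch of what the IF variant might look like. The category codes 1, 2 and 3 are the same placeholders used in the CASE query, and the column aliases (txn_count, count_mcc_1, sum_mcc_1, and so on) are made up for illustration; substitute the codes that actually occur in your data.

```sql
-- Sketch of the IF variant of the conditional aggregation.
-- Category codes 1, 2, 3 and all column aliases are placeholders.
SELECT customer_token,
       COUNT(transaction_amount) AS txn_count,
       SUM(transaction_amount)   AS txn_sum,
       -- COUNT ignores NULLs, so only rows in the given category are counted
       COUNT(IF(merchant_category_code = 1, transaction_amount, NULL)) AS count_mcc_1,
       COUNT(IF(merchant_category_code = 2, transaction_amount, NULL)) AS count_mcc_2,
       COUNT(IF(merchant_category_code = 3, transaction_amount, NULL)) AS count_mcc_3,
       -- Rows outside the category contribute 0 to the sum
       SUM(IF(merchant_category_code = 1, transaction_amount, 0)) AS sum_mcc_1,
       SUM(IF(merchant_category_code = 2, transaction_amount, 0)) AS sum_mcc_2,
       SUM(IF(merchant_category_code = 3, transaction_amount, 0)) AS sum_mcc_3
  FROM transaction
 WHERE file_date BETWEEN 20121101 AND 20121130
 GROUP BY customer_token
;
```

One difference worth knowing about: with 0 as the fallback, a customer with no transactions in a category gets a sum of 0, whereas a CASE without an ELSE branch yields NULL for that customer. Use NULL as the third argument to IF if you want the two forms to behave the same way.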

answered 2012-12-10T03:26:35.273