
I'm switching statistics from MySQL to Amazon DynamoDB and Elastic MapReduce.

The query below works with MySQL. I have the same table in Hive and need the same results as in MySQL (product views for the last week, last month and last year).

SELECT product_id,
(SELECT COUNT(product_id) from dev_product_views_hive as P2 where P2.product_id=P.product_id and created >= DATE_SUB(NOW(), INTERVAL 1 WEEK)) as weekly,
(SELECT count(product_id) from dev_product_views_hive as P3 where P3.product_id=P.product_id and created >= DATE_SUB(NOW(), INTERVAL 1 MONTH)) as monthly,
(SELECT count(product_id) from dev_product_views_hive as P4 where P4.product_id=P.product_id and created >= DATE_SUB(NOW(), INTERVAL 1 YEAR)) as yearly
from dev_product_views_hive as P group by product_id;

I figured out how to get the results for a single period, for example the last month, with Hive:

SELECT product_id, COUNT(product_id) as views from dev_product_views_hive WHERE created >= UNIX_TIMESTAMP(CONCAT(DATE_SUB(FROM_UNIXTIME(UNIX_TIMESTAMP()), 31)," ","00:00:00")) GROUP BY product_id;

But I need the results grouped in one row per product, like I get with MySQL:

product_id  views_last_week  views_last_month  views_last_year
2           564              2460              29967
4           980              3986              54982

Is it possible to do this with Hive?

Thank you in advance,

Amer


1 Answer


You can use case when together with sum() or count().

For example:

select product_id, 
sum(case when created >= concat(date_sub(to_date(from_unixtime(unix_timestamp())), 7)," 00:00:00") then 1 else 0 end)  as weekly,
sum(case when created >= concat(date_sub(to_date(from_unixtime(unix_timestamp())), 31)," 00:00:00") then 1 else 0 end) as monthly,
sum(case when created >= concat(date_sub(to_date(from_unixtime(unix_timestamp())), 365)," 00:00:00") then 1 else 0 end) as yearly
from dev_product_views_hive 
group by product_id;

concat(date_sub(to_date(from_unixtime(unix_timestamp())), days)," 00:00:00") returns a formatted string for midnight the given number of days before the current date.
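
To see what each step of that expression produces, the pieces can be selected one at a time. This is only a sketch: the column aliases are made up for illustration, and the `from ... limit 1` against the question's table is there because older Hive versions require a FROM clause.

```sql
-- Illustrative only: evaluate each stage of the boundary expression.
select
  unix_timestamp()                                      as now_epoch,   -- current time in epoch seconds
  from_unixtime(unix_timestamp())                       as now_string,  -- 'yyyy-MM-dd HH:mm:ss'
  to_date(from_unixtime(unix_timestamp()))              as today,       -- date part only
  date_sub(to_date(from_unixtime(unix_timestamp())), 7) as week_ago,    -- date 7 days earlier
  concat(date_sub(to_date(from_unixtime(unix_timestamp())), 7), ' 00:00:00') as week_ago_midnight
from dev_product_views_hive
limit 1;
```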

case when returns 1 when created >= the date you expect.

You can also do this with Hive's built-in count() function, which counts only the rows for which the expression returns a non-NULL value:

count(case when created >= concat(date_sub(to_date(from_unixtime(unix_timestamp())), 7)," 00:00:00") then 1 end)  as weekly
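
Put together, the count() variant of the full query might look like the sketch below. One assumption is added here: the question's own Hive query compares created against UNIX_TIMESTAMP(...), which suggests created holds epoch seconds, so each boundary string is wrapped in unix_timestamp() to compare numbers with numbers. If created is stored as a 'yyyy-MM-dd HH:mm:ss' string instead, the wrapper should be dropped.

```sql
-- Sketch only: assumes created holds Unix epoch seconds.
select product_id,
  count(case when created >= unix_timestamp(concat(date_sub(to_date(from_unixtime(unix_timestamp())), 7), ' 00:00:00')) then 1 end) as weekly,
  count(case when created >= unix_timestamp(concat(date_sub(to_date(from_unixtime(unix_timestamp())), 31), ' 00:00:00')) then 1 end) as monthly,
  count(case when created >= unix_timestamp(concat(date_sub(to_date(from_unixtime(unix_timestamp())), 365), ' 00:00:00')) then 1 end) as yearly
from dev_product_views_hive
group by product_id;
```

Because the case expression has no else branch, rows outside a window yield NULL and are skipped by count(), so a single pass over the table fills all three columns.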

answered 2013-03-18T11:47:54.577