I'm new to Apache Pig and trying to learn it. Is there an equivalent of SQL's COUNT(DISTINCT CASE WHEN ...) in Apache Pig?
For example, I'm trying to do something like this:
CREATE TABLE email_profile AS
SELECT user_id
, COUNT(DISTINCT CASE WHEN email_code = 'C' THEN message_id ELSE NULL END) AS clickthroughs
, COUNT(DISTINCT CASE WHEN email_code = 'O' THEN message_id ELSE NULL END) AS opened_messages
, COUNT(DISTINCT message_id) AS total_messages_received
FROM email_campaigns
GROUP BY user_id;
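I can't just use FILTER email_campaigns BY email_code == 'C', because that would drop the rows needed for the other cases. Is there a way to do all of this in a single nested FOREACH block?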
Thanks!
Edit:
As requested, here is some sample data. The fields are user_id, email_code, and message_id.
user1@example.com O 111
user1@example.com C 111
user2@example.com O 111
user1@example.com O 222
user2@example.com O 333
Expected output (user_id, opened_messages, clickthroughs, total_messages_received):
user1@example.com 2 1 2
user2@example.com 2 0 2
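
To make the expected counts concrete, here is a small plain-Python sketch (not Pig) of the semantics I'm after: for each user, count the distinct message_id values overall and per email_code. The function name email_profile and the in-memory rows list are only for illustration, and the output columns are assumed to be in the order opened, clickthroughs, total, since that is the order that matches the sample output.

```python
from collections import defaultdict

# Sample rows from the question: (user_id, email_code, message_id)
rows = [
    ("user1@example.com", "O", "111"),
    ("user1@example.com", "C", "111"),
    ("user2@example.com", "O", "111"),
    ("user1@example.com", "O", "222"),
    ("user2@example.com", "O", "333"),
]


def email_profile(rows):
    """Return {user_id: (opened, clickthroughs, total)} counting distinct message_ids."""
    opened = defaultdict(set)   # distinct message_ids with email_code 'O'
    clicked = defaultdict(set)  # distinct message_ids with email_code 'C'
    total = defaultdict(set)    # all distinct message_ids per user
    for user_id, email_code, message_id in rows:
        total[user_id].add(message_id)
        if email_code == "O":
            opened[user_id].add(message_id)
        elif email_code == "C":
            clicked[user_id].add(message_id)
    # Users with no 'C' rows get an empty set, i.e. a count of 0.
    return {u: (len(opened[u]), len(clicked[u]), len(total[u])) for u in total}


for user_id, counts in sorted(email_profile(rows).items()):
    print(user_id, *counts)
```

The point is that all three counts come from one pass over each user's rows, without filtering rows out of the relation up front, which is what I'd like to reproduce inside one nested FOREACH.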