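How do I do a subselect in Hive? I think I may be making a really obvious mistake that just isn't obvious to me...
The error I'm getting: FAILED: Parse Error: line 4:8 cannot recognize input 'SELECT' in expression specification
Here are my three source tables: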
aaa_hit -> [SESSION_KEY, HIT_KEY, URL]
aaa_event -> [SESSION_KEY, HIT_KEY, EVENT_ID]
aaa_session -> [SESSION_KEY, REMOTE_ADDRESS]
...What I want to do is insert the results into a result table like this:
result -> [url, num_url, event_id, num_event_id, remote_address, num_remote_address]
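...where column 1 is the URL, column 3 is the top 1 "event" for each URL, and column 5 is the top 1 REMOTE_ADDRESS that visited that URL. (Each even-numbered column is the "count" of the column before it.)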
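To make the "count" columns concrete, here is a minimal sketch of columns 1 and 2 on their own. It assumes num_url simply means the number of rows in aaa_hit that have that URL. It is untested, and it does not attempt the top-event or top-address columns, which are the part I am stuck on.

```sql
-- Columns 1 and 2 only: each URL with its hit count, most-visited first.
-- Assumption: num_url is the number of aaa_hit rows sharing the URL.
SELECT url,
       COUNT(url) AS num_url
FROM aaa_hit
GROUP BY url
ORDER BY num_url DESC;
```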
Sooooo... what am I doing wrong here?
INSERT OVERWRITE TABLE result2
SELECT url,
COUNT(url) AS access_url,
(SELECT events.event_id as evt,
COUNT(events.event_id) as access_evt
FROM aaa_event events
LEFT OUTER JOIN aaa_hit hits
ON ( events.hit_key = hit_key )
ORDER BY access_evt DESC LIMIT 1),
(SELECT sessions.remote_address as remote_address,
COUNT(sessions.remote_address) as access_addr
FROM aaa_session sessions
RIGHT OUTER JOIN aaa_hit hits
ON ( sessions.session_key = session_key )
ORDER BY access_addr DESC LIMIT 1)
FROM aaa_hit
ORDER BY access_url DESC;
Thanks so much :)