I'm running into a strange problem when running a Hive query through Superset (Apache incubator):
SELECT
  date,
  sum(1) visits,
  sum(price) revenue
FROM
  visits
WHERE
  date BETWEEN '2017-07-21' AND '2017-07-25'
  AND country = 'US'
GROUP BY
  date,
  browser
The error I get shows up in the terminal where I run Superset (a VM running Ubuntu):
Traceback (most recent call last):
  File "/home/userxx/venv/local/lib/python2.7/site-packages/superset/sql_lab.py", line 182, in execute_sql
    db_engine_spec.handle_cursor(cursor, query, session)
  File "/home/userxx/venv/local/lib/python2.7/site-packages/superset/db_engine_specs.py", line 726, in handle_cursor
    resp = cursor.fetch_logs()
  File "/home/userxx/venv/local/lib/python2.7/site-packages/superset/db_engines/hive.py", line 34, in fetch_logs
    response.results.rows, 'expected data in columnar format'
AssertionError
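Interestingly, when the date range is 7/21 - 7/24, it works fine. I assumed it had to be a memory issue, but adding browser to the mix (as a GROUP BY column) did not change the behavior. My reasoning was that, because it increases the number of rows, adding it would break the query even for the 7/21 - 7/24 range.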
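One way to narrow this down (a sketch only, I have not verified it against Superset) would be to run the same query one day at a time and see whether 7/25 on its own triggers the assertion. The snippet below only generates the SQL text for each day in the range. `QUERY_TEMPLATE` and `per_day_queries` are hypothetical names made up for illustration, and nothing here connects to Hive or Superset.

```python
from datetime import date, timedelta

# Hypothetical template: the query from above with the date bounds parameterized.
QUERY_TEMPLATE = """SELECT
  date,
  sum(1) visits,
  sum(price) revenue
FROM
  visits
WHERE
  date BETWEEN '{start}' AND '{end}'
  AND country = 'US'
GROUP BY
  date,
  browser"""


def per_day_queries(start, end):
    """Return one (day, sql) pair for every day in the inclusive range."""
    queries = []
    day = start
    while day <= end:
        iso = day.isoformat()
        # Each query covers exactly one day, so a failing day can be isolated.
        queries.append((iso, QUERY_TEMPLATE.format(start=iso, end=iso)))
        day += timedelta(days=1)
    return queries


if __name__ == "__main__":
    for day, sql in per_day_queries(date(2017, 7, 21), date(2017, 7, 25)):
        print("-- " + day)
        print(sql)
        print("")
```

Each generated statement can then be pasted into SQL Lab separately. If only the 7/25 query fails, the problem is more likely tied to that day's data or logs than to the overall result size.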
Needless to say, the query runs perfectly when launched from, for example, the SQL Developer tool.
Thanks in advance!