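I would suggest reading the table records only once.
Your inner aggregate calculation can be done with a window function.
I believe this query should give you the same result, and you get rid of the JOIN: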

    SELECT
        EMP_TYPE,
        DEPT,
        COUNT( DISTINCT EMP_ID ) OVER ( PARTITION BY EMP_TYPE ) AS TOTAL_COUNT,
        COUNT( DISTINCT EMP_ID ) AS COUNT_DEPT
    FROM
        STAGE_SOURCE
    GROUP BY EMP_TYPE, DEPT

Keep in mind that a GROUP BY can also make use of an index.
For details, see the Apache Hive manual on Windowing and Analytics functions.
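#Edit after comment
At least in PostgreSQL, the DISTINCT clause is applied after the window function aggregates have been computed, which we can exploit to get what you need. That way we get rid of the GROUP BY. See how it works on Postgres: SQLFiddle
Try the query below: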

    SELECT
        DISTINCT
        EMP_TYPE,
        DEPT,
        COUNT( DISTINCT EMP_ID ) OVER ( PARTITION BY EMP_TYPE ) AS TOTAL_COUNT,
        COUNT( DISTINCT EMP_ID ) OVER ( PARTITION BY EMP_TYPE, DEPT ) AS COUNT_DEPT
    FROM
        STAGE_SOURCE

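As a rough illustration of the ordering this relies on (window functions first, DISTINCT afterwards), here is a small Python sketch using the standard-library sqlite3 module. The sample rows are invented, and it assumes the bundled SQLite is 3.25 or newer so that window functions are available. SQLite does not accept DISTINCT inside a window aggregate, so the sketch uses plain COUNT(*) on data without duplicates. It only shows how SELECT DISTINCT collapses the per-row window results, not how Hive or PostgreSQL handle the exact query above.

```python
import sqlite3

# In-memory database with a hypothetical STAGE_SOURCE table (sample data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stage_source (emp_type TEXT, dept TEXT, emp_id INTEGER)")
conn.executemany(
    "INSERT INTO stage_source VALUES (?, ?, ?)",
    [
        ("FT", "HR", 1),
        ("FT", "HR", 2),
        ("FT", "IT", 3),
        ("PT", "IT", 4),
    ],
)

# The window counts are computed for every input row first; SELECT DISTINCT
# then removes the repeated (emp_type, dept, total_count, count_dept) rows.
rows = conn.execute(
    """
    SELECT DISTINCT
        emp_type,
        dept,
        COUNT(*) OVER (PARTITION BY emp_type)       AS total_count,
        COUNT(*) OVER (PARTITION BY emp_type, dept) AS count_dept
    FROM stage_source
    ORDER BY emp_type, dept
    """
).fetchall()

for row in rows:
    print(row)
```

The two FT/HR input rows both carry the same counts, so DISTINCT folds them into a single output row per (EMP_TYPE, DEPT) pair, which is what the GROUP BY did before.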
#Edit 2

    SELECT
        DISTINCT
        EMP_TYPE,
        DEPT,
        COUNT( DISTINCT EMP_ID ) OVER ( PARTITION BY EMP_TYPE ) AS TOTAL_COUNT,
        COUNT( DISTINCT EMP_ID ) OVER ( PARTITION BY EMP_TYPE, DEPT ) AS COUNT_DEPT
    FROM (
        SELECT DISTINCT EMP_TYPE, DEPT, EMP_ID FROM STAGE_SOURCE
    ) foo

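A minimal sketch of what the inner SELECT DISTINCT buys you, using Python's standard-library sqlite3 module and invented sample rows (SQLite 3.25 or newer assumed for window functions). SQLite rejects DISTINCT inside a window aggregate, so plain COUNT(emp_id) stands in for it here. That substitution is only equivalent because the sample data keeps every EMP_ID in a single DEPT; if an employee could appear in several departments, the per-type total would still need a true distinct count.

```python
import sqlite3

# Hypothetical STAGE_SOURCE contents with two exact duplicate rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stage_source (emp_type TEXT, dept TEXT, emp_id INTEGER)")
conn.executemany(
    "INSERT INTO stage_source VALUES (?, ?, ?)",
    [
        ("FT", "HR", 1),
        ("FT", "HR", 1),  # duplicate row for employee 1
        ("FT", "HR", 2),
        ("FT", "IT", 3),
        ("PT", "IT", 4),
        ("PT", "IT", 4),  # duplicate row for employee 4
    ],
)

# Same outer query, run against two different sources.
QUERY = """
    SELECT DISTINCT
        emp_type,
        dept,
        COUNT(emp_id) OVER (PARTITION BY emp_type)       AS total_count,
        COUNT(emp_id) OVER (PARTITION BY emp_type, dept) AS count_dept
    FROM {source}
    ORDER BY emp_type, dept
"""

# Straight from the table: duplicate rows inflate both counts.
raw = conn.execute(QUERY.format(source="stage_source")).fetchall()

# From the de-duplicated derived table, as in the query above.
deduped = conn.execute(
    QUERY.format(
        source="(SELECT DISTINCT emp_type, dept, emp_id FROM stage_source) AS foo"
    )
).fetchall()

print(raw)      # counts include the duplicate rows
print(deduped)  # counts reflect one row per employee
```

With the duplicates removed up front, each employee contributes exactly one row to every window, so the counts no longer depend on how often a record was loaded.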