我在我们的 Teradata QA 环境中遇到了一个问题,一个在 1 分钟内运行的简单查询现在需要 12 分钟才能完成。此选择基于简单的内部连接提取 5 个字段
select a.material
, b.season
, b.theme
, b.collection
from SalesOrders_view.Allocation_Deliveries_cur a
inner join SalesOrders_view.Material_Attributes_cur b
on a.material = b.material;
我可以在我们的 Prod 环境中运行同样的查询,它会在不到一分钟的时间内返回,同时运行的记录比 QA 多约 20 万条。
SalesOrders.Allocation_Deliveries 中的总交易量低于 110 万条记录,SalesOrders.Material_Attributes 中的总数量低于 129k 条记录。这些是小型数据集。
我比较了两种环境中的 Explain 计划,在第一个 Join 步骤中估计的 spool 体积存在明显差异。生产中的估计是在金钱上,而质量保证中的估计是一个数量级。然而,两个系统中的数据和表格/视图是相同的,我们以各种可能的方式收集了统计数据,我们可以看到两个系统中的特定表格人口统计数据是相同的。
最后,这个查询在包括 QA 在内的所有环境中总是在一分钟内返回,因为它仍在生产中进行。这种潜在的行为是最近一周左右的。我与我们的 DBA 讨论了这个问题,我们没有对软件或配置进行任何更改。他是新人,但似乎知道自己在做什么,但仍被新环境赶上。
我正在寻找一些关于下一步检查的指示。我已经比较了 QA 和 Prod 的相关表/视图定义,它们是相同的。每个系统中的表格人口统计数据也是相同的(我与我们的 DBA 一起检查了这些以确保)
任何帮助表示赞赏。提前致谢。拍
这是 QA 的解释计划。请注意第 5 步(144 行)中的非常低的估计值。在 Prod 中,相同的解释显示 > 1 M 行,这将接近我所知道的。
Explain select a.material
, b.season
, b.theme
, b.collection
from SalesOrders_view.Allocation_Deliveries a
inner join SalesOrders_view.Material_Attributes_cur b
on a.material = b.material;
1) First, we lock SalesOrders.Allocation_Deliveries in view
SalesOrders_view.Allocation_Deliveries for access, and we lock
SalesOrders.Material_Attributes in view SalesOrders_view.Material_Attributes_cur for
access.
2) Next, we do an all-AMPs SUM step to aggregate from
SalesOrders.Material_Attributes in view SalesOrders_view.Material_Attributes_cur by way
of an all-rows scan with no residual conditions
, grouping by field1 ( SalesOrders.Material_Attributes.material
,SalesOrders.Material_Attributes.season ,SalesOrders.Material_Attributes.theme
,SalesOrders.Material_Attributes.theme ,SalesOrders.Material_Attributes.af_grdval
,SalesOrders.Material_Attributes.af_stcat
,SalesOrders.Material_Attributes.Material_Attributes_SRC_SYS_NM). Aggregate
Intermediate Results are computed locally, then placed in Spool 4.
The size of Spool 4 is estimated with high confidence to be
129,144 rows (41,713,512 bytes). The estimated time for this step
is 0.06 seconds.
3) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from Spool 4 (Last Use) by
way of an all-rows scan into Spool 2 (all_amps), which is
redistributed by the hash code of (
SalesOrders.Material_Attributes.Field_9,
SalesOrders.Material_Attributes.Material_Attributes_SRC_SYS_NM,
SalesOrders.Material_Attributes.Field_7, SalesOrders.Material_Attributes.Field_6,
SalesOrders.Material_Attributes.theme, SalesOrders.Material_Attributes.theme,
SalesOrders.Material_Attributes.season, SalesOrders.Material_Attributes.material)
to all AMPs. Then we do a SORT to order Spool 2 by row hash
and the sort key in spool field1 eliminating duplicate rows.
The size of Spool 2 is estimated with low confidence to be
129,144 rows (23,504,208 bytes). The estimated time for this
step is 0.11 seconds.
2) We do an all-AMPs RETRIEVE step from SalesOrders.Material_Attributes in
view SalesOrders_view.Material_Attributes_cur by way of an all-rows scan
with no residual conditions locking for access into Spool 6
(all_amps), which is redistributed by the hash code of (
SalesOrders.Material_Attributes.material, SalesOrders.Material_Attributes.season,
SalesOrders.Material_Attributes.theme, SalesOrders.Material_Attributes.theme,
SalesOrders.Material_Attributes.Material_Attributes_SRC_SYS_NM,
SalesOrders.Material_Attributes.Material_Attributes_UPD_TS, (CASE WHEN (NOT
(SalesOrders.Material_Attributes.af_stcat IS NULL )) THEN
(SalesOrders.Material_Attributes.af_stcat) ELSE ('') END )(VARCHAR(16),
CHARACTER SET UNICODE, NOT CASESPECIFIC), (CASE WHEN (NOT
(SalesOrders.Material_Attributes.af_grdval IS NULL )) THEN
(SalesOrders.Material_Attributes.af_grdval) ELSE ('') END )(VARCHAR(8),
CHARACTER SET UNICODE, NOT CASESPECIFIC)) to all AMPs. Then
we do a SORT to order Spool 6 by row hash. The size of Spool
6 is estimated with high confidence to be 129,144 rows (
13,430,976 bytes). The estimated time for this step is 0.08
seconds.
4) We do an all-AMPs RETRIEVE step from Spool 2 (Last Use) by way of
an all-rows scan into Spool 7 (all_amps), which is built locally
on the AMPs. Then we do a SORT to order Spool 7 by the hash code
of (SalesOrders.Material_Attributes.material, SalesOrders.Material_Attributes.season,
SalesOrders.Material_Attributes.theme, SalesOrders.Material_Attributes.theme,
SalesOrders.Material_Attributes.Field_6, SalesOrders.Material_Attributes.Field_7,
SalesOrders.Material_Attributes.Material_Attributes_SRC_SYS_NM,
SalesOrders.Material_Attributes.Field_9). The size of Spool 7 is estimated
with low confidence to be 129,144 rows (13,301,832 bytes). The
estimated time for this step is 0.05 seconds.
5) We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of an
all-rows scan, which is joined to Spool 7 (Last Use) by way of an
all-rows scan. Spool 6 and Spool 7 are joined using an inclusion
merge join, with a join condition of ("(material = material) AND
((season = season) AND ((theme = theme) AND ((theme =
theme) AND (((( CASE WHEN (NOT (af_grdval IS NULL )) THEN
(af_grdval) ELSE ('') END ))= Field_6) AND (((( CASE WHEN (NOT
(AF_STCAT IS NULL )) THEN (AF_STCAT) ELSE ('') END ))= Field_7)
AND ((Material_Attributes_SRC_SYS_NM = Material_Attributes_SRC_SYS_NM) AND
(Material_Attributes_UPD_TS = Field_9 )))))))"). The result goes into Spool
8 (all_amps), which is duplicated on all AMPs. The size of Spool
8 is estimated with low confidence to be 144 rows (5,616 bytes).
The estimated time for this step is 0.04 seconds.
6) We do an all-AMPs JOIN step from Spool 8 (Last Use) by way of an
all-rows scan, which is joined to SalesOrders.Allocation_Deliveries in view
SalesOrders_view.Allocation_Deliveries by way of an all-rows scan with no
residual conditions. Spool 8 and SalesOrders.Allocation_Deliveries are
joined using a single partition hash_ join, with a join condition
of ("SalesOrders.Allocation_Deliveries.material = material"). The result goes
into Spool 1 (group_amps), which is built locally on the AMPs.
The size of Spool 1 is estimated with low confidence to be 3,858
rows (146,604 bytes). The estimated time for this step is 0.44
seconds.
7) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 1 are sent back to the user as the result of
statement 1. The total estimated time is 0.70 seconds.
这是记录分布的样子和我用来生成结果集的 SQL
SELECT HASHAMP(HASHBUCKET(HASHROW( MATERIAL ))) AS
"AMP#",COUNT(*)
FROM EDW_LND_SAP_VIEW.EMDMMU01_CUR
GROUP BY 1
ORDER BY 2 DESC;
输出最高:AMP 137,1093 行 最低:AMP 72,768 行 AMP 总数:144