I found an SQL query that returns the data I need, but it runs slowly.

WITH temp_table (t_col_1, t_col_2, t_col_3) AS
(
    SELECT col_1 AS t_col_1, col_2 AS t_col_2, col_3 AS t_col_3
    FROM actual_table
    WHERE ID = 100 AND PID = 1245
)
SELECT t_col_1, t_col_2, t_col_3 
FROM temp_table AS t1 
WHERE t1.t_col_2 BETWEEN 1 AND 12541
  AND t1.t_col_1 = (SELECT max(t2.t_col_1)
                    FROM temp_table AS t2
                    WHERE t2.t_col_1 < 15147
                      AND t2.t_col_2 = t1.t_col_2) 
ORDER BY t1.t_col_2

I use the query in this form for two reasons:

  1. The SQL query is generated and executed from MATLAB to fetch the data.
  2. Depending on the ID, the columns col_1 and col_2 can be interchanged, so that t_col_1 = col_2 and t_col_2 = col_1. In that case the MATLAB script swaps the aliases and generates col_1 AS t_col_2 and col_2 AS t_col_1.

Is there an elegant way to accelerate the query?

Thanks in advance.

2 Answers

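The answer depends entirely on your query optimizer and your database statistics, which in turn vary with the database you are using.

  1. Get the QEP (query execution plan).
  2. See where the plan is slow.
  3. Optimize the query and/or add database statistics and/or add the required indexes.

You can try tweaking the query and might get lucky, but the correct approach is to understand the query plan.

For example, you have no way of knowing whether the "max" is the slow part, or whether actual_table has a billion rows and no index on ID and PID.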
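
The three steps above can be sketched as follows. This is only an illustration under stated assumptions: the question does not name its database, so SQLite (through Python's standard sqlite3 module) stands in for it, the column types are guessed, and the index name idx_actual_id_pid is hypothetical. Other engines expose the plan through their own EXPLAIN variants and will print different text.

```python
import sqlite3

# Assumption: SQLite stands in for the asker's (unnamed) database engine.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE actual_table (ID INTEGER, PID INTEGER, "
    "col_1 INTEGER, col_2 INTEGER, col_3 REAL)"
)

# The filter used inside the question's CTE.
QUERY = "SELECT col_1, col_2, col_3 FROM actual_table WHERE ID = 100 AND PID = 1245"


def query_plan(connection, sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[3] for row in rows]


# Step 1 and 2: get the plan and look at how the table is accessed.
plan_before = query_plan(conn, QUERY)

# Step 3: add an index on the filtered columns (hypothetical index name).
conn.execute("CREATE INDEX idx_actual_id_pid ON actual_table (ID, PID)")
plan_after = query_plan(conn, QUERY)

print("before:", plan_before)
print("after: ", plan_after)
```

In SQLite the first plan should report a scan of actual_table and the second a search using the new index. Whether an index actually helps the real query depends on the data and on the engine, which is why reading the plan comes first.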

Answered 2016-08-27T05:01:49.913

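Your problem is that the subquery computing the maximum of t_col_1 runs for every row when the main query evaluates its WHERE condition. Instead, you can compute max(t2.t_col_1) in a subquery that runs once and then use that value in your condition, like so: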

-- Prepend the WITH temp_table (...) AS (...) definition from the question.
SELECT t1.t_col_1, t1.t_col_2, t1.t_col_3
FROM temp_table AS t1
JOIN
    -- one maximum per t_col_2, computed once instead of once per row
    (SELECT t_col_2, max(t_col_1) AS t_col_1_max
    FROM temp_table
    WHERE t_col_1 < 15147
    GROUP BY t_col_2)
    AS m
    ON m.t_col_2 = t1.t_col_2
    AND m.t_col_1_max = t1.t_col_1
WHERE
    (t1.t_col_2 BETWEEN 1 AND 12541)
ORDER BY t1.t_col_2

Your code for generating the temp table looks fine.
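
Before swapping one form of the query for another, it is worth checking that both return the same rows. The sketch below compares the question's correlated subquery with a variant that computes one maximum per t_col_2 in a grouped subquery and joins it back. Everything around the SQL is an assumption made for illustration: SQLite stands in for the real database, the column types are guessed, and the six rows are made up.

```python
import sqlite3

# Assumption: SQLite stands in for the asker's database; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE actual_table (ID INTEGER, PID INTEGER, "
    "col_1 INTEGER, col_2 INTEGER, col_3 REAL)"
)
conn.executemany(
    "INSERT INTO actual_table VALUES (?, ?, ?, ?, ?)",
    [
        (100, 1245, 10, 1, 0.1),
        (100, 1245, 20, 1, 0.2),
        (100, 1245, 20000, 1, 0.3),   # col_1 above the 15147 cutoff
        (100, 1245, 5, 2, 0.4),
        (100, 1245, 30, 99999, 0.5),  # col_2 outside 1..12541
        (7, 1245, 40, 2, 0.6),        # different ID, filtered out by the CTE
    ],
)

CTE = """
WITH temp_table (t_col_1, t_col_2, t_col_3) AS (
    SELECT col_1, col_2, col_3
    FROM actual_table
    WHERE ID = 100 AND PID = 1245
)
"""

# The query from the question: the maximum is looked up per outer row.
CORRELATED = CTE + """
SELECT t_col_1, t_col_2, t_col_3
FROM temp_table AS t1
WHERE t1.t_col_2 BETWEEN 1 AND 12541
  AND t1.t_col_1 = (SELECT max(t2.t_col_1)
                    FROM temp_table AS t2
                    WHERE t2.t_col_1 < 15147
                      AND t2.t_col_2 = t1.t_col_2)
ORDER BY t1.t_col_2
"""

# Variant: one grouped subquery yields the maximum for every t_col_2.
GROUPED = CTE + """
SELECT t1.t_col_1, t1.t_col_2, t1.t_col_3
FROM temp_table AS t1
JOIN (SELECT t_col_2, max(t_col_1) AS t_col_1_max
      FROM temp_table
      WHERE t_col_1 < 15147
      GROUP BY t_col_2) AS m
  ON m.t_col_2 = t1.t_col_2 AND m.t_col_1_max = t1.t_col_1
WHERE t1.t_col_2 BETWEEN 1 AND 12541
ORDER BY t1.t_col_2
"""

rows_correlated = conn.execute(CORRELATED).fetchall()
rows_grouped = conn.execute(GROUPED).fetchall()
print(rows_correlated)
print(rows_grouped)
```

Matching results on a small sample only show that the two forms agree on that sample. Whether the grouped form is actually faster depends on the optimizer, so the execution plan on the real data remains the deciding evidence.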

Answered 2016-08-26T15:41:41.537