My data is spread across multiple tables; typically they look like this:

CREATE TABLE 2012_03_09 (
    guid_key integer,
    property_key integer,
    instance_id_key integer,
    time_stamp timestamp without time zone,
    "value" double precision
)

With these indexes:

CREATE INDEX 2012_03_09_a
  ON 2012_03_09
  USING btree
  (guid_key, property_key, time_stamp);

CREATE INDEX 2012_03_09_b
  ON 2012_03_09
  USING btree
  (time_stamp, property_key);

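When I analyze my query, the total time of the Append operation bothers me. Can you explain why the query takes so long to run? Is there any way to optimize a query like this?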

Sort  (cost=262.50..262.61 rows=47 width=20) (actual time=1918.237..1918.246 rows=100 loops=1)    
  Output: 2012_04_26.time_stamp, 2012_04_26.value, 2012_04_26.instance_id_key    
  Sort Key: 2012_04_26.instance_id_key, 2012_04_26.time_stamp    
  Sort Method:  quicksort  Memory: 32kB    
  ->  Append  (cost=0.00..261.19 rows=47 width=20) (actual time=69.817..1917.848 rows=100 loops=1)    
        ->  Index Scan using 2012_04_26_a on 2012_04_26  (cost=0.00..8.28 rows=1 width=20) (actual time=14.909..14.909 rows=0 loops=1)    
              Output: 2012_04_26.time_stamp, 2012_04_26.value, 2012_04_26.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_04_27_a on 2012_04_27  (cost=0.00..8.28 rows=1 width=20) (actual time=1.535..1.535 rows=0 loops=1)    
              Output: 2012_04_27.time_stamp, 2012_04_27.value, 2012_04_27.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_02_a on 2012_05_02  (cost=0.00..12.50 rows=2 width=20) (actual time=53.370..121.894 rows=6 loops=1)    
              Output: 2012_05_02.time_stamp, 2012_05_02.value, 2012_05_02.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_03_a on 2012_05_03  (cost=0.00..24.74 rows=5 width=20) (actual time=59.136..170.215 rows=11 loops=1)    
              Output: 2012_05_03.time_stamp, 2012_05_03.value, 2012_05_03.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_04_a on 2012_05_04  (cost=0.00..12.47 rows=2 width=20) (actual time=67.458..125.172 rows=5 loops=1)    
              Output: 2012_05_04.time_stamp, 2012_05_04.value, 2012_05_04.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_05_a on 2012_05_05  (cost=0.00..8.28 rows=1 width=20) (actual time=14.112..14.112 rows=0 loops=1)    
              Output: 2012_05_05.time_stamp, 2012_05_05.value, 2012_05_05.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_07_a on 2012_05_07  (cost=0.00..12.46 rows=2 width=20) (actual time=60.549..99.999 rows=4 loops=1)    
              Output: 2012_05_07.time_stamp, 2012_05_07.value, 2012_05_07.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_08_a on 2012_05_08  (cost=0.00..24.71 rows=5 width=20) (actual time=63.367..197.296 rows=12 loops=1)    
              Output: 2012_05_08.time_stamp, 2012_05_08.value, 2012_05_08.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_09_a on 2012_05_09  (cost=0.00..28.87 rows=6 width=20) (actual time=59.596..224.685 rows=15 loops=1)    
              Output: 2012_05_09.time_stamp, 2012_05_09.value, 2012_05_09.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_10_a on 2012_05_10  (cost=0.00..28.85 rows=6 width=20) (actual time=56.995..196.590 rows=13 loops=1)    
              Output: 2012_05_10.time_stamp, 2012_05_10.value, 2012_05_10.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_11_a on 2012_05_11  (cost=0.00..20.59 rows=4 width=20) (actual time=62.761..134.313 rows=8 loops=1)    
              Output: 2012_05_11.time_stamp, 2012_05_11.value, 2012_05_11.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_12_a on 2012_05_12  (cost=0.00..8.28 rows=1 width=20) (actual time=12.018..12.018 rows=0 loops=1)    
              Output: 2012_05_12.time_stamp, 2012_05_12.value, 2012_05_12.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_13_a on 2012_05_13  (cost=0.00..8.28 rows=1 width=20) (actual time=12.286..12.286 rows=0 loops=1)    
              Output: 2012_05_13.time_stamp, 2012_05_13.value, 2012_05_13.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_14_a on 2012_05_14  (cost=0.00..16.58 rows=3 width=20) (actual time=92.161..156.802 rows=6 loops=1)    
              Output: 2012_05_14.time_stamp, 2012_05_14.value, 2012_05_14.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_15_a on 2012_05_15  (cost=0.00..25.03 rows=5 width=20) (actual time=73.636..263.537 rows=14 loops=1)    
              Output: 2012_05_15.time_stamp, 2012_05_15.value, 2012_05_15.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
        ->  Index Scan using 2012_05_16_a on 2012_05_16  (cost=0.00..12.56 rows=2 width=20) (actual time=100.893..172.404 rows=6 loops=1)    
              Output: 2012_05_16.time_stamp, 2012_05_16.value, 2012_05_16.instance_id_key    
              Index Cond: ((guid_key = 2105) AND (property_key = 67) AND (time_stamp >= '2012-04-16 00:00:00'::timestamp without time zone) AND (time_stamp <= '2012-05-16 06:25:50.172'::timestamp without time zone))    
Total runtime: 1918.745 ms

Update:

Posting the SQL query as well:

select time_stamp, value, instance_id_key as segment from perf_hourly_2012_04_26 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_04_27 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_02 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_03 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_04 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_05 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_07 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_08 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_09 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_10 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_11 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_12 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_13 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_14 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_15 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
UNION ALL
select time_stamp, value, instance_id_key as segment from 2012_05_16 where guid_key = 2105 and property_key=67 and time_stamp between '2012-04-16 00:00:00.0'::timestamp without time zone and '2012-05-16 06:25:50.172'::timestamp without time zone
ORDER BY 3 ASC, 1 ASC 

3 Answers

It looks like you should check out PostgreSQL partitioning. Your query would be simpler, and it may perform better (I'm not 100% sure, but I think it's worth a try).
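A minimal sketch of what that could look like, using inheritance-based partitioning with CHECK constraints. The parent table name perf_data and the child naming scheme are assumptions made for illustration; only the column list and the index definition come from the question. Each daily table becomes a child of one parent, and the query targets the parent instead of spelling out a UNION ALL per day.

```sql
-- Hypothetical parent table; it holds no rows itself.
CREATE TABLE perf_data (
    guid_key        integer,
    property_key    integer,
    instance_id_key integer,
    time_stamp      timestamp without time zone,
    "value"         double precision
);

-- One child per day. The CHECK constraint tells the planner which
-- time_stamp range this child can contain.
CREATE TABLE perf_data_2012_05_04 (
    CHECK (time_stamp >= TIMESTAMP '2012-05-04 00:00:00'
       AND time_stamp <  TIMESTAMP '2012-05-05 00:00:00')
) INHERITS (perf_data);

-- Indexes are not inherited, so each child needs its own.
CREATE INDEX perf_data_2012_05_04_a
    ON perf_data_2012_05_04
    USING btree (guid_key, property_key, time_stamp);

-- 'partition' is the default; with it the planner skips children whose
-- CHECK constraint contradicts the WHERE clause.
SET constraint_exclusion = partition;

-- A single query against the parent replaces the UNION ALL chain.
SELECT time_stamp, "value", instance_id_key AS segment
FROM perf_data
WHERE guid_key = 2105
  AND property_key = 67
  AND time_stamp BETWEEN '2012-04-16 00:00:00' AND '2012-05-16 06:25:50.172'
ORDER BY segment, time_stamp;
```

Note that this mainly simplifies the query. When the requested time range covers every daily table, as it does in the question, every child still gets an index scan, so partitioning alone would not be expected to remove the per-scan cost. PostgreSQL 10 and later also offer declarative partitioning (PARTITION BY RANGE), which did not exist when this was asked.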

answered 2012-05-16T15:12:42.690

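Apart from the Append, every node appears to be an index scan on the first type of index (the _a index). I have to wonder whether that is the best index. Since you seem to be selecting a significant time range, the only other choices are guid_key and property_key. Which is more selective? The more selective column should come first (that is, if you aren't worried about the sort, and I don't think you should be for a sort of 100 rows).

Second, did you add these indexes for this query or for other queries? If they aren't useful anywhere else, it may make sense to drop them. Indexes can actually hurt performance, especially if the table records are already in memory most of the time, because the database may need to evict records from memory in order to load the index (and then reload the table records once it is done with the index scan).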
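One way to check both points is to look at the planner statistics and the index usage counters. This is a sketch, assuming the table has been analyzed and really is named 2012_05_04 as written in the question (a name starting with a digit has to be double-quoted when used as an identifier). The index name 2012_05_04_c is made up for the experiment.

```sql
-- Compare the selectivity of the two candidate leading columns.
-- n_distinct > 0 is an estimated count of distinct values;
-- n_distinct < 0 is the negated fraction of rows that are distinct.
SELECT attname, n_distinct
FROM pg_stats
WHERE tablename = '2012_05_04'
  AND attname IN ('guid_key', 'property_key');

-- Trial index with the leading columns swapped (hypothetical name).
CREATE INDEX "2012_05_04_c"
    ON "2012_05_04"
    USING btree (property_key, guid_key, time_stamp);

-- See which indexes are actually used: idx_scan counts the index scans
-- started on each index since the statistics were last reset.
SELECT indexrelname, idx_scan
FROM pg_stat_user_indexes
WHERE relname = '2012_05_04';
```

An index whose idx_scan stays at zero under a representative workload is a candidate for dropping.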

The only real advice I can give is to experiment with it.

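Edit:

(Of course there are other questions I've ignored, such as why these records don't have some kind of primary key and whether or not the tables themselves are clustered, but those come into play here as well.)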

answered 2012-05-16T16:02:45.630

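The UNION isn't your timing problem; the elapsed time it reports is essentially the sum of the index scan times on each partition (the sixteen scans in your plan add up to roughly 1,918 ms, which matches the Append node). Your _a index looks appropriately selective for your query predicates. The real culprit I see in the EXPLAIN ANALYZE output is that the index scan on each partition takes a long time to retrieve just a few rows. For example: 2012_05_04, 5 rows in 125 ms.

An index scan might incur 0-5 seeks, depending on cache state and table size, plus one seek per data row if the data is not clustered. A slow single-spindle disk should be able to do a seek and block fetch in about 10 ms, so the worst case for a scan on a poor storage system is about 100 ms. For the more common 7200 or 10K rpm disks and multiple spindles, the worst case assuming no cache hits should be under 50 ms.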
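If the one-seek-per-row cost turns out to matter, physically ordering a table by the _a index is one thing to try. A sketch, assuming the table and index names from the question are the real ones and that a daily table is no longer written to once its day is over (that second point is an assumption, not something the question states):

```sql
-- Rewrite the table in index order so that rows sharing the same
-- (guid_key, property_key) end up in adjacent heap pages.
-- CLUSTER takes an ACCESS EXCLUSIVE lock for the duration of the rewrite.
CLUSTER "2012_05_04" USING "2012_05_04_a";

-- Refresh planner statistics after the rewrite.
ANALYZE "2012_05_04";
```

CLUSTER is a one-time operation: rows inserted or updated afterwards are not kept in index order, so it fits best on tables that have stopped changing.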

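Does this query run faster on a second attempt immediately after the first? If so, that suggests slow storage and a cold cache are the problem. What kind of storage is the database running on? If we're talking about a slow laptop drive or a high-latency network mount, that would explain the poor I/O performance.

The index scans could also be suffering from extreme index bloat. If you have tens or hundreds of dead index entries because of update/delete churn on the data combined with an inadequate vacuum regime, that could be the culprit. Are these tables vacuumed and analyzed regularly?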
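These questions can be checked from psql. The following is a sketch, assuming PostgreSQL 9.0 or later (for the BUFFERS option) and the table names as written in the question, double-quoted because they start with a digit.

```sql
-- Run this twice in a row. In the output, "shared hit" counts pages found
-- in PostgreSQL's buffer cache and "read" counts pages that had to be
-- requested from the operating system. A first run dominated by reads
-- followed by a much faster second run points to a cold cache.
EXPLAIN (ANALYZE, BUFFERS)
SELECT time_stamp, "value", instance_id_key AS segment
FROM "2012_05_04"
WHERE guid_key = 2105
  AND property_key = 67
  AND time_stamp BETWEEN '2012-04-16 00:00:00' AND '2012-05-16 06:25:50.172';

-- Dead-tuple counts and the last vacuum/analyze times per table.
-- In LIKE, '_' matches any single character, which is loose enough here.
SELECT relname, n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname LIKE '2012_05_%';

-- Rough bloat check: an index far larger than its table for a
-- three-integer-column key would be suspicious.
SELECT pg_size_pretty(pg_relation_size('"2012_05_04"'))   AS table_size,
       pg_size_pretty(pg_relation_size('"2012_05_04_a"')) AS index_size;
```

A high n_dead_tup relative to n_live_tup, or NULL values in all four timestamp columns, would suggest the tables are not being vacuumed or analyzed.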

As Adrian Serafin suggested, you should look into Pg's table partitioning feature.

answered 2012-05-17T18:18:14.127