What's happening here is that extract is implemented using the date_part function:
regress=> explain select count(1) from generate_series(1376143200000,1376143200000+1000000) x where x > extract(EPOCH FROM TIMESTAMP WITH TIME ZONE '2013-08-11 00:00:00+10')*1000 and x < extract(EPOCH FROM TIMESTAMP WITH TIME ZONE '2013-08-12 00:00:00+10')*1000;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=30.02..30.03 rows=1 width=0)
-> Function Scan on generate_series x (cost=0.00..30.00 rows=5 width=0)
Filter: (((x)::double precision > (date_part('epoch'::text, '2013-08-10 22:00:00+08'::timestamp with time zone) * 1000::double precision)) AND ((x)::double precision < (date_part('epoch'::text, '2013-08-11 22:00:00+08'::timestamp with time zone) * 1000::double precision)))
(3 rows)
date_part(text, timestamptz) is defined as stable, not immutable:
regress=> \df+ date_part
List of functions
Schema | Name | Result data type | Argument data types | Type | Volatility | Owner | Language | Source code | Description
------------+-----------+------------------+-----------------------------------+--------+------------+----------+----------+--------------------------------------------------------------------------+---------------------------------------------
...
pg_catalog | date_part | double precision | text, timestamp with time zone | normal | stable | postgres | internal | timestamptz_part | extract field from timestamp with time zone
...
I'm pretty sure that prevents Pg from pre-computing the value and inlining it into the call. I'd need to dig deeper to be sure.
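I believe the reasoning is that date_part on a timestamptz can depend on the value of the TimeZone setting. That isn't true for date_part('epoch', some_timestamptz), but the query planner doesn't know at planning time that that's what you're using.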
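One way to see that distinction for yourself is to evaluate a zone-dependent field and the epoch under two different TimeZone settings. This is only a minimal sketch: the choice of 'hour' as the contrasting field and of Australia/Brisbane as the second zone are assumptions made for illustration, not part of the original query.

```sql
BEGIN;

-- Evaluate both fields with the session time zone set to UTC.
SET LOCAL TimeZone = 'UTC';
SELECT date_part('hour',  TIMESTAMPTZ '2013-08-11 00:00:00+10') AS hour_field,
       date_part('epoch', TIMESTAMPTZ '2013-08-11 00:00:00+10') AS epoch_field;

-- Same instant, different session time zone.
SET LOCAL TimeZone = 'Australia/Brisbane';
SELECT date_part('hour',  TIMESTAMPTZ '2013-08-11 00:00:00+10') AS hour_field,
       date_part('epoch', TIMESTAMPTZ '2013-08-11 00:00:00+10') AS epoch_field;

-- SET LOCAL only lasts until the end of the transaction.
ROLLBACK;
```

hour_field should differ between the two statements because it is computed in the session time zone, while epoch_field should come out as 1376143200 both times.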
I'm still surprised it isn't pre-computed, given that the documentation states:
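A STABLE function cannot modify the database and is guaranteed to return the same results given the same arguments for all rows within a single statement. This category allows the optimizer to optimize multiple calls of the function to a single call.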
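You can work around this apparent limitation by first converting the timestamp to UTC with AT TIME ZONE 'UTC', which yields a timestamp without time zone. For example: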
select count(1)
from generate_series(1376143200000,1376143200000+1000000) x
where x > extract(EPOCH FROM TIMESTAMP WITH TIME ZONE '2013-08-11 00:00:00+10' AT TIME ZONE 'UTC')*1000
and x < extract(EPOCH FROM TIMESTAMP WITH TIME ZONE '2013-08-12 00:00:00+10' AT TIME ZONE 'UTC')*1000;
This executes faster, though the time difference is larger than I'd expect if the value were only being calculated once:
regress=> select count(1) from generate_series(1376143200000,1376143200000+1000000) x where x > extract(EPOCH FROM TIMESTAMP WITH TIME ZONE '2013-08-11 00:00:00+10')*1000 and x < extract(EPOCH FROM TIMESTAMP WITH TIME ZONE '2013-08-12 00:00:00+10')*1000;
count
---------
1000000
(1 row)
Time: 767.629 ms
regress=> select count(1) from generate_series(1376143200000,1376143200000+1000000) x where x > extract(EPOCH FROM TIMESTAMP WITH TIME ZONE '2013-08-11 00:00:00+10' AT TIME ZONE 'UTC')*1000 and x < extract(EPOCH FROM TIMESTAMP WITH TIME ZONE '2013-08-12 00:00:00+10' AT TIME ZONE 'UTC')*1000;
count
---------
1000000
(1 row)
Time: 373.453 ms
regress=> select count(1) from generate_series(1376143200000,1376143200000+1000000) x where x > 1376143200000 and x < 1376229600000;
count
---------
1000000
(1 row)
Time: 324.557 ms
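To check whether the workaround is actually being folded to a constant at plan time, it may help to compare the Filter lines of the two plans. The sketch below uses a single bound to keep the output short. I haven't reproduced the plan output here, and newer PostgreSQL releases may display the call differently from the date_part form shown earlier.

```sql
-- Original form: the epoch is taken from a timestamptz, via a stable function.
EXPLAIN
SELECT count(1)
FROM generate_series(1376143200000, 1376143200000 + 1000000) x
WHERE x > extract(EPOCH FROM TIMESTAMP WITH TIME ZONE '2013-08-11 00:00:00+10') * 1000;

-- Workaround form: AT TIME ZONE 'UTC' converts to timestamp without time zone first.
EXPLAIN
SELECT count(1)
FROM generate_series(1376143200000, 1376143200000 + 1000000) x
WHERE x > extract(EPOCH FROM TIMESTAMP WITH TIME ZONE '2013-08-11 00:00:00+10' AT TIME ZONE 'UTC') * 1000;
```

If the second plan's Filter shows a plain numeric constant where the first still shows a function call, the expression has been pre-computed.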
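It would be possible to remove this query optimizer restriction, or to add a feature that optimizes this case. The optimizer would need to recognize, probably at parse time, that extract('epoch', ...) is a special case and, instead of invoking date_part('epoch', ...), invoke a special timestamptz_epoch(...) function that is immutable.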
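Short of changing the optimizer, a user-level approximation of the same idea is a wrapper function declared IMMUTABLE. This is a sketch only: the name epoch_ms_immutable is hypothetical, and marking it IMMUTABLE rests on the assumption that the epoch of a timestamptz does not depend on TimeZone. PostgreSQL does not verify that declaration for you.

```sql
-- Hypothetical helper, not a built-in. Declared IMMUTABLE on the assumption
-- that the epoch of a timestamptz is independent of the TimeZone setting.
CREATE OR REPLACE FUNCTION epoch_ms_immutable(timestamptz)
RETURNS bigint
LANGUAGE sql
IMMUTABLE
AS $$
    -- Seconds since the epoch, scaled to milliseconds and rounded to bigint.
    SELECT (date_part('epoch', $1) * 1000)::bigint
$$;

SELECT count(1)
FROM generate_series(1376143200000, 1376143200000 + 1000000) x
WHERE x > epoch_ms_immutable(TIMESTAMPTZ '2013-08-11 00:00:00+10')
  AND x < epoch_ms_immutable(TIMESTAMPTZ '2013-08-12 00:00:00+10');
```

Because the arguments are constants and the function is marked immutable, the planner is free to evaluate each call once at plan time. Whether it does so on your version is worth confirming with EXPLAIN.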
A quick look at perf top results shows that the timestamptz case has the following peaks:
10.33% postgres [.] ExecMakeFunctionResultNoSets
7.76% postgres [.] timesub.isra.1
6.94% postgres [.] datebsearch
5.58% postgres [.] timestamptz_part
3.82% postgres [.] AllocSetAlloc
2.97% postgres [.] ExecEvalConst
2.68% postgres [.] downcase_truncate_identifier
2.38% postgres [.] ExecEvalScalarVarFast
2.23% postgres [.] slot_getattr
1.99% postgres [.] DatumGetFloat8
whereas with AT TIME ZONE we get:
11.58% postgres [.] ExecMakeFunctionResultNoSets
4.28% postgres [.] AllocSetAlloc
4.18% postgres [.] ExecProject
3.82% postgres [.] slot_getattr
2.99% libc-2.17.so [.] __memmove_ssse3
2.96% postgres [.] BufFileWrite
2.80% libc-2.17.so [.] __memcpy_ssse3_back
2.74% postgres [.] BufFileRead
2.69% postgres [.] float8lt
and with the integer case:
7.92% postgres [.] ExecMakeFunctionResultNoSets
5.36% postgres [.] slot_getattr
4.52% postgres [.] AllocSetAlloc
4.02% postgres [.] ExecProject
3.42% libc-2.17.so [.] __memmove_ssse3
3.33% postgres [.] BufFileWrite
3.31% libc-2.17.so [.] __memcpy_ssse3_back
2.91% postgres [.] BufFileRead
2.90% postgres [.] GetMemoryChunkSpace
2.67% postgres [.] AllocSetFree
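So you can see that the AT TIME ZONE version avoids the repeated timestamptz_part and datebsearch calls. Its main difference from the integer case is float8lt; it looks like we're doing double precision comparisons instead of integer comparisons.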
Sure enough, a cast takes care of it:
select count(1)
from generate_series(1376143200000,1376143200000+1000000) x
where x > extract(EPOCH FROM TIMESTAMP WITH TIME ZONE '2013-08-11 00:00:00+10' AT TIME ZONE 'UTC')::bigint * 1000
and x < extract(EPOCH FROM TIMESTAMP WITH TIME ZONE '2013-08-12 00:00:00+10' AT TIME ZONE 'UTC')::bigint * 1000;
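I don't have time at present to pursue the optimizer enhancement discussed above, but it's something you might want to consider raising on the mailing list.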