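I am trying to debug a query that is slow in production but fast on my development machine. My dev box has a snapshot of the prod database that is only a few days old, so the contents of the two databases are roughly the same.
The query is: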
select count(*) from big_table where search_column in ('something')
Notes:
- big_table is a snapshot materialized view with about 35 million rows, refreshed once a day
- search_column has a b-tree index
- prod is 9.1 on Ubuntu
- dev is 9.0 on OS X
Query plans
Results of explain analyze:
Prod:
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate (cost=1119843.20..1119843.21 rows=1 width=0) (actual time=467388.276..467388.278 rows=1 loops=1)
   -> Bitmap Heap Scan on big_table (cost=10432.55..1118804.45 rows=415497 width=0) (actual time=116891.126..466949.331 rows=210053 loops=1)
         Recheck Cond: ((search_column)::text = 'something'::text)
         -> Bitmap Index Scan on big_table_search_column_index (cost=0.00..10328.68 rows=415497 width=0) (actual time=8467.901..8467.901 rows=337164 loops=1)
               Index Cond: ((search_column)::text = 'something'::text)
 Total runtime: 467389.534 ms
(6 rows)
Dev:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate (cost=524011.38..524011.39 rows=1 width=0) (actual time=209.852..209.852 rows=1 loops=1)
   -> Bitmap Heap Scan on big_table (cost=5131.43..523531.22 rows=192064 width=0) (actual time=33.792..194.730 rows=209551 loops=1)
         Recheck Cond: ((search_column)::text = 'something'::text)
         -> Bitmap Index Scan on big_table_search_column_index (cost=0.00..5083.42 rows=192064 width=0) (actual time=27.568..27.568 rows=209551 loops=1)
               Index Cond: ((search_column)::text = 'something'::text)
 Total runtime: 209.938 ms
(6 rows)
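The queries on prod and dev actually return 210053 and 209551 rows respectively.
The two plans have the same structure, and the table has roughly the same number of rows in each DB, so what explains the difference in the costs above?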
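As a rough cross-check of the figures in the two plans, here is a minimal Python sketch that compares them as prod/dev ratios. The numbers are copied from the EXPLAIN ANALYZE output above; the dictionary layout and the `ratio` helper are only illustrative names, not anything PostgreSQL provides.

```python
# Figures copied from the two EXPLAIN ANALYZE outputs in this post.
plans = {
    "prod": {
        "total_cost": 1119843.21,   # Aggregate node, upper cost bound
        "estimated_rows": 415497,   # planner estimate for the bitmap scans
        "index_rows": 337164,       # actual rows from the Bitmap Index Scan
        "heap_rows": 210053,        # actual rows from the Bitmap Heap Scan
        "runtime_ms": 467389.534,   # Total runtime
    },
    "dev": {
        "total_cost": 524011.39,
        "estimated_rows": 192064,
        "index_rows": 209551,
        "heap_rows": 209551,
        "runtime_ms": 209.938,
    },
}


def ratio(key):
    """Return the prod/dev ratio for one of the figures above."""
    return plans["prod"][key] / plans["dev"][key]


cost_ratio = ratio("total_cost")
estimate_ratio = ratio("estimated_rows")
runtime_ratio = ratio("runtime_ms")

for name, value in [
    ("estimated cost", cost_ratio),
    ("estimated rows", estimate_ratio),
    ("actual runtime", runtime_ratio),
]:
    print(f"{name}: prod is {value:.2f}x dev")

# How many index entries were visited per row that the heap scan kept.
for env, plan in plans.items():
    per_row = plan["index_rows"] / plan["heap_rows"]
    print(f"{env}: {per_row:.2f} index rows per heap row returned")
```

If the arithmetic holds, the estimated cost on prod is roughly twice that on dev, which is about the same factor as the planner's row estimates, while the actual runtime differs by about three orders of magnitude. So the cost figures alone do not seem to account for the slowdown.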
Bloat
Following @bma's suggestion, here are the results of the "bloat" query for the relevant table/index on prod and dev:
Prod:
current_database | schemaname | tablename | tbloat | wastedbytes | iname | ibloat | wastedibytes
------------------+------------+---------------------------------+--------+-------------+---------------------------------------------------------------+--------+--------------
my_db | public | big_table | 1.6 | 7965433856 | big_table_search_column_index | 0.1 | 0
Dev:
current_database | schemaname | tablename | tbloat | wastedbytes | iname | ibloat | wastedibytes
------------------+------------+---------------------------------+--------+-------------+---------------------------------------------------------------+--------+--------------
my_db | public | big_table | 0.8 | 0 | big_table_search_column_index | 0.1 | 0
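So there is a difference here.
I have already run vacuum analyze big_table; but it does not seem to have made any significant difference to the running time of the count query.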
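To put the wastedbytes figure in perspective, here is a small Python sketch that converts it to GiB and compares it with shared_buffers. It assumes the bloat query reports plain bytes and uses the shared_buffers value of 2GB listed for both servers in the settings later in this post; the variable names are illustrative only.

```python
GIB = 1024 ** 3  # PostgreSQL memory units are powers of 1024

prod_wasted_bytes = 7965433856  # wastedbytes for big_table on prod
dev_wasted_bytes = 0            # wastedbytes for big_table on dev
shared_buffers_bytes = 2 * GIB  # shared_buffers = 2GB on both servers

prod_wasted_gib = prod_wasted_bytes / GIB
wasted_vs_buffers = prod_wasted_bytes / shared_buffers_bytes

print(f"prod wasted space: {prod_wasted_gib:.2f} GiB")
print(f"that is {wasted_vs_buffers:.2f}x shared_buffers")
print(f"dev wasted space: {dev_wasted_bytes / GIB:.2f} GiB")
```

This only restates the reported numbers in friendlier units; it does not say how much of that space the count query actually has to read.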
Config
Results of the following query, as suggested by bma:
SELECT name, current_setting(name), source FROM pg_settings WHERE source NOT IN ('default', 'override');
Prod:
name | current_setting | source
----------------------------+----------------------------------+----------------------
application_name | psql | client
DateStyle | ISO, MDY | configuration file
default_text_search_config | pg_catalog.english | configuration file
effective_cache_size | 6GB | configuration file
external_pid_file | /var/run/postgresql/9.1-main.pid | configuration file
listen_addresses | * | configuration file
log_line_prefix | %t | configuration file
log_timezone | localtime | environment variable
max_connections | 100 | configuration file
max_stack_depth | 2MB | environment variable
port | 5432 | configuration file
shared_buffers | 2GB | configuration file
ssl | on | configuration file
TimeZone | localtime | environment variable
unix_socket_directory | /var/run/postgresql | configuration file
(15 rows)
Dev:
name | current_setting | source
----------------------------+-------------------------+----------------------
application_name | psql | client
DateStyle | ISO, MDY | configuration file
default_text_search_config | pg_catalog.english | configuration file
effective_cache_size | 4GB | configuration file
lc_messages | en_US | configuration file
lc_monetary | en_US | configuration file
lc_numeric | en_US | configuration file
lc_time | en_US | configuration file
listen_addresses | * | configuration file
log_destination | syslog | configuration file
log_directory | ../var | configuration file
log_filename | postgresql-%Y-%m-%d.log | configuration file
log_line_prefix | %t | configuration file
log_statement | all | configuration file
log_timezone | Australia/Hobart | command line
logging_collector | on | configuration file
maintenance_work_mem | 512MB | configuration file
max_connections | 50 | configuration file
max_stack_depth | 2MB | environment variable
shared_buffers | 2GB | configuration file
ssl | off | configuration file
synchronous_commit | off | configuration file
TimeZone | Australia/Hobart | command line
timezone_abbreviations | Default | command line
work_mem | 100MB | configuration file
(25 rows)
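To make the two listings easier to compare, here is a minimal Python sketch that diffs a hand-picked subset of the settings above. The values are copied from the two tables; the choice of which settings to include is mine, and None simply means the setting did not appear in that server's non-default list.

```python
# None means the setting is absent from the non-default list for that server.
prod = {
    "effective_cache_size": "6GB",
    "shared_buffers": "2GB",
    "max_connections": "100",
    "work_mem": None,
    "maintenance_work_mem": None,
    "synchronous_commit": None,
}
dev = {
    "effective_cache_size": "4GB",
    "shared_buffers": "2GB",
    "max_connections": "50",
    "work_mem": "100MB",
    "maintenance_work_mem": "512MB",
    "synchronous_commit": "off",
}

# Keep only the settings whose values differ between the two servers.
differing = {
    name: (prod[name], dev[name])
    for name in sorted(prod)
    if prod[name] != dev[name]
}

for name, (prod_value, dev_value) in differing.items():
    print(f"{name}: prod={prod_value} dev={dev_value}")
```

Of the settings in this subset, only shared_buffers is identical on both servers; work_mem, maintenance_work_mem and synchronous_commit are set explicitly only on dev.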