I am using a cursor to retrieve records from a large Postgres table (400 million records, partitioned using child tables). My cursor is defined as:
select * from parent_table order by indexed_column
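With both JDBC and psql, performance is consistent for the first several hundred thousand records retrieved. After that it falls off a cliff and never recovers. CPU, memory, and disk activity on the server are fairly even; that is, nothing system-level stands out as the culprit. I initially suspected a network problem, but I have reproduced the issue from different networks.
Here is psql: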
db@dbdev> fetch 100000 from all_persons;
Time: 13995.910 ms
db@dbdev> fetch 100000 from all_persons;
Time: 13852.955 ms
db@dbdev> fetch 100000 from all_persons;
Time: 14037.631 ms
db@dbdev> fetch 100000 from all_persons;
Time: 13818.516 ms
db@dbdev> fetch 100000 from all_persons;
Time: 13952.260 ms
db@dbdev> fetch 100000 from all_persons;
Time: 14257.836 ms
db@dbdev> fetch 100000 from all_persons;
Time: 14115.941 ms
db@dbdev> fetch 100000 from all_persons;
Time: 14375.485 ms
db@dbdev> fetch 100000 from all_persons;
Time: 14898.741 ms
db@dbdev> fetch 100000 from all_persons;
Time: 14086.004 ms
db@dbdev> fetch 100000 from all_persons;
Time: 59841.556 ms
db@dbdev> fetch 100000 from all_persons;
Time: 198176.211 ms
db@dbdev> fetch 100000 from all_persons;
Time: 162593.582 ms
Here is JDBC (retrieving 10,000 at a time; the number on the left is the running count of filtered records inserted back):
...
536040 retrieve in 405; filtering in 28; insert in 1734
544739 retrieve in 413; filtering in 27; insert in 1713
553574 retrieve in 382; filtering in 27; insert in 1761
563167 retrieve in 348; filtering in 28; insert in 2019
572723 retrieve in 363; filtering in 27; insert in 2048
581736 retrieve in 363; filtering in 28; insert in 1784
591131 retrieve in 480; filtering in 28; insert in 1869
600260 retrieve in 377; filtering in 27; insert in 1831
608234 retrieve in 24074; filtering in 27; insert in 1566
616212 retrieve in 23711; filtering in 27; insert in 1649
624449 retrieve in 25913; filtering in 27; insert in 1587
632528 retrieve in 29981; filtering in 27; insert in 1527
641334 retrieve in 23231; filtering in 27; insert in 1728
650427 retrieve in 27883; filtering in 27; insert in 1996
659516 retrieve in 34422; filtering in 27; insert in 1774
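While psql performance seems to keep getting worse, JDBC performance after the drop at least stays roughly consistent over a million records (fluctuating between about 34k and 17k ms).
What could cause such a sudden drop in performance?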
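(Edit) Working solution:
I solved this by lowering the batch size (retrieve/insert) to 5000 and running the cursor against each child table in sequence (rather than the parent table). I also removed the order by from the cursor, since that seemed to help, even though the order by was on an ordered index.
My guess is that this gives Postgres the best chance of loading a full partition at a time.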
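For anyone who wants to try the same workaround, here is a minimal JDBC sketch of it, not the exact code I ran. The class name, the method names, and the child table names passed in are hypothetical. It assumes the PostgreSQL JDBC driver, which only streams rows through a server-side cursor when autocommit is off and a fetch size is set on a forward-only statement.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

// Hypothetical helper illustrating the workaround: one unordered cursor per child table.
class PartitionCursorSketch {

    // Batch size from the workaround (down from 10000).
    static final int BATCH_SIZE = 5000;

    // Builds the per-child query. There is deliberately no ORDER BY.
    // The table name is concatenated as-is, so it must come from a trusted list.
    static String selectSql(String childTable) {
        return "SELECT * FROM " + childTable;
    }

    // Scans each child table in sequence and returns the number of rows seen.
    static long scanPartitions(Connection conn, List<String> childTables) throws SQLException {
        // Assumption: PostgreSQL JDBC driver. Without autocommit off and a
        // positive fetch size it would pull the whole result set into memory.
        conn.setAutoCommit(false);
        long rows = 0;
        for (String child : childTables) {
            try (Statement st = conn.createStatement()) {
                st.setFetchSize(BATCH_SIZE);
                try (ResultSet rs = st.executeQuery(selectSql(child))) {
                    while (rs.next()) {
                        rows++; // filtering and the batched insert would go here
                    }
                }
            }
            // End the transaction so the cursor for this child table is released.
            conn.commit();
        }
        return rows;
    }
}
```

Calling it would look something like scanPartitions(conn, List.of("parent_table_p1", "parent_table_p2")), where those child table names are placeholders for whatever your partitions are called. If the insert goes through the same connection, the commit after each child table also commits those inserts, so use a second connection if you need different transaction boundaries.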