Edit: Based on some debugging and logging, I think the question boils down to why DELETE FROM table WHERE id = x is much faster than DELETE FROM table WHERE id IN (x), where x is just a single ID.
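I recently tested batch deletion against row-by-row deletion and found that batch deletion was much slower. The table has triggers for delete, update, and insert, but I tested both with and without the triggers, and batch deletion was slower every time. Can anyone shed some light on why this is the case, or share tips on how to debug it? As I understand it, I can't really reduce the number of times the triggers fire, but I originally assumed that lowering the number of "delete" queries would help performance.
I've included some information below; let me know if I've left out anything relevant.
Deletes are done in batches of 10,000, and the code looks like this: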
private void batchDeletion( Collection<Long> ids ) {
    StringBuilder sb = new StringBuilder();
    sb.append( "DELETE FROM ObjImpl WHERE id IN (:ids)" );
    Query sql = getSession().createQuery( sb.toString() );
    sql.setParameterList( "ids", ids );
    sql.executeUpdate();
}
The code that deletes just a single row is basically:
SessionFactory.getCurrentSession().delete(obj);
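The table has two indexes, neither of which is used in any of the deletes. No cascade operations occur.
Here is the EXPLAIN ANALYZE output for an example, DELETE FROM table where id IN ( 1, 2, 3 );: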
Delete on table (cost=12.82..24.68 rows=3 width=6) (actual time=0.143..0.143 rows=0 loops=1)
-> Bitmap Heap Scan on table (cost=12.82..24.68 rows=3 width=6) (actual time=0.138..0.138 rows=0 loops=1)
Recheck Cond: (id = ANY ('{1,2,3}'::bigint[]))
-> Bitmap Index Scan on pk_table (cost=0.00..12.82 rows=3 width=0) (actual time=0.114..0.114 rows=0 loops=1)
Index Cond: (id = ANY ('{1,2,3}'::bigint[]))
Total runtime: 3.926 ms
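Each time I reload the data for a test I vacuum and reindex. My test data contains 386,660 rows.
The test deletes all rows. I'm not using TRUNCATE because normally there is a selection criterion, but for testing purposes I made the criterion include every row. With triggers enabled, row-by-row deletion took 193,616 ms, while batch deletion took 285,558 ms. I then disabled the triggers and got 93,793 ms for single-row deletion and 181,537 ms for batch deletion. The triggers aggregate values and update another table, which is basically bookkeeping.
I've played with smaller batch sizes (100 and 1), and they all seem to perform worse.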
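To make the four timings above easier to compare, they can be normalized to an average cost per deleted row. The following is only a small arithmetic sketch: the class and method names are made up for illustration, and the inputs are the totals and the row count (386,660) quoted above.

```java
public class PerRowCost {
    // Number of rows in the test data set, as stated in the question.
    static final long ROWS = 386_660L;

    // Average milliseconds spent per deleted row for a given total run time.
    static double perRowMillis(long totalMillis, long rows) {
        return (double) totalMillis / rows;
    }

    public static void main(String[] args) {
        String[] labels = {
            "row-by-row, triggers on",
            "batch, triggers on",
            "row-by-row, triggers off",
            "batch, triggers off"
        };
        // Total run times in milliseconds, copied from the measurements above.
        long[] totals = {193_616L, 285_558L, 93_793L, 181_537L};

        for (int i = 0; i < labels.length; i++) {
            System.out.printf("%-26s %.4f ms/row%n",
                    labels[i], perRowMillis(totals[i], ROWS));
        }
    }
}
```

By this arithmetic the row-by-row runs come out at roughly 0.50 and 0.24 ms per row (triggers on and off), and the batch runs at roughly 0.74 and 0.47 ms per row, so the batch variant costs somewhere between a fifth and a quarter of a millisecond more per row in both configurations.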
Edit: I turned on Hibernate logging, and for the row-by-row deletes it is basically doing: delete from table where id=?
And the EXPLAIN ANALYZE output:
Delete on table (cost=0.00..8.31 rows=1 width=6) (actual time=0.042..0.042 rows=0 loops=1)
-> Index Scan using pk_table on table (cost=0.00..8.31 rows=1 width=6) (actual time=0.037..0.037 rows=0 loops=1)
Index Cond: (id = 3874904)
Total runtime: 0.130 ms
Edit: I was curious whether Postgres would do something different if the list actually contained 10,000 IDs: nope.
Delete on table (cost=6842.01..138509.15 rows=9872 width=6) (actual time=17.170..17.170 rows=0 loops=1)
-> Bitmap Heap Scan on table (cost=6842.01..138509.15 rows=9872 width=6) (actual time=17.160..17.160 rows=0 loops=1)
Recheck Cond: (id = ANY ('{NUMBERS 1 THROUGH 10,000}'::bigint[]))
-> Bitmap Index Scan on pk_table (cost=0.00..6839.54 rows=9872 width=0) (actual time=17.139..17.139 rows=0 loops=1)
Index Cond: (id = ANY ('{NUMBERS 1 THROUGH 10,000}'::bigint[]))
Total runtime: 17.391 ms
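Edit: Following up on the EXPLAIN ANALYZE output above, I pulled some logging from the actual delete operations. Below is the logging for two variants of single-row deletion.
Here are some single deletes: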
2013-03-14 13:09:25,424:delete from table where id=?
2013-03-14 13:09:25,424:delete from table where id=?
2013-03-14 13:09:25,424:delete from table where id=?
2013-03-14 13:09:25,424:delete from table where id=?
2013-03-14 13:09:25,424:delete from table where id=?
2013-03-14 13:09:25,424:delete from table where id=?
2013-03-14 13:09:25,424:delete from table where id=?
2013-03-14 13:09:25,424:delete from table where id=?
2013-03-14 13:09:25,424:delete from table where id=?
2013-03-14 13:09:25,424:delete from table where id=?
And here is the other variant of single deletes (the list has just one item):
2013-03-14 13:49:59,858:delete from table where id in (?)
2013-03-14 13:50:01,460:delete from table where id in (?)
2013-03-14 13:50:03,040:delete from table where id in (?)
2013-03-14 13:50:04,544:delete from table where id in (?)
2013-03-14 13:50:06,125:delete from table where id in (?)
2013-03-14 13:50:07,707:delete from table where id in (?)
2013-03-14 13:50:09,275:delete from table where id in (?)
2013-03-14 13:50:10,833:delete from table where id in (?)
2013-03-14 13:50:12,369:delete from table where id in (?)
2013-03-14 13:50:13,873:delete from table where id in (?)
Both sets use IDs that exist in the table and should be sequential.
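The gap between consecutive log timestamps gives a rough idea of how long each statement takes in the two variants. The sketch below parses the timestamps from the ten "id in (?)" log lines above and averages the gaps; the class and method names are hypothetical, and it assumes each log line begins with a timestamp in the yyyy-MM-dd HH:mm:ss,SSS format shown in the logs.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;

public class LogGaps {
    // Timestamp layout used by the log lines above (comma before the milliseconds).
    static final DateTimeFormatter FMT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // The ten "id in (?)" log lines, copied from the question.
    static final List<String> IN_VARIANT = Arrays.asList(
            "2013-03-14 13:49:59,858:delete from table where id in (?)",
            "2013-03-14 13:50:01,460:delete from table where id in (?)",
            "2013-03-14 13:50:03,040:delete from table where id in (?)",
            "2013-03-14 13:50:04,544:delete from table where id in (?)",
            "2013-03-14 13:50:06,125:delete from table where id in (?)",
            "2013-03-14 13:50:07,707:delete from table where id in (?)",
            "2013-03-14 13:50:09,275:delete from table where id in (?)",
            "2013-03-14 13:50:10,833:delete from table where id in (?)",
            "2013-03-14 13:50:12,369:delete from table where id in (?)",
            "2013-03-14 13:50:13,873:delete from table where id in (?)");

    // The timestamp occupies the first 23 characters of each line.
    static LocalDateTime parse(String logLine) {
        return LocalDateTime.parse(logLine.substring(0, 23), FMT);
    }

    // Average gap in milliseconds between consecutive log lines.
    static double averageGapMillis(List<String> lines) {
        if (lines.size() < 2) {
            throw new IllegalArgumentException("need at least two log lines");
        }
        LocalDateTime first = parse(lines.get(0));
        LocalDateTime last = parse(lines.get(lines.size() - 1));
        long totalMillis = Duration.between(first, last).toMillis();
        return (double) totalMillis / (lines.size() - 1);
    }

    public static void main(String[] args) {
        System.out.printf("average gap: %.1f ms%n", averageGapMillis(IN_VARIANT));
    }
}
```

For the "id in (?)" lines the first and last timestamps are 14.015 seconds apart over nine gaps, which is about 1.56 seconds per statement, whereas the ten "id=?" lines all share the same millisecond. A gap of that size is far larger than anything in the EXPLAIN ANALYZE output, which suggests (without proving) that most of the time is being spent outside the database.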
EXPLAIN ANALYZE of DELETE FROM table WHERE id = 3774887;
Delete on table (cost=0.00..8.31 rows=1 width=6) (actual time=0.097..0.097 rows=0 loops=1)
-> Index Scan using pk_table on table (cost=0.00..8.31 rows=1 width=6) (actual time=0.055..0.058 rows=1 loops=1)
Index Cond: (id = 3774887)
Total runtime: 0.162 ms
EXPLAIN ANALYZE of DELETE FROM table WHERE id IN (3774887);
Delete on table (cost=0.00..8.31 rows=1 width=6) (actual time=0.279..0.279 rows=0 loops=1)
-> Index Scan using pk_table on table (cost=0.00..8.31 rows=1 width=6) (actual time=0.210..0.213 rows=1 loops=1)
Index Cond: (id = 3774887)
Total runtime: 0.452 ms
Would 0.162 ms vs 0.452 ms be considered a significant difference?
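One way to judge whether that gap matters is to scale it up to the full test set. This is only a back-of-the-envelope sketch with hypothetical names: it assumes the two single-run EXPLAIN ANALYZE figures are representative of every row, which one measurement each cannot really establish.

```java
public class RuntimeGap {
    // How many times slower the slower statement is.
    static double ratio(double slowerMs, double fasterMs) {
        return slowerMs / fasterMs;
    }

    // Extra total time, in milliseconds, if the per-statement gap applied to every row.
    static double extraMillis(double slowerMs, double fasterMs, long rows) {
        return (slowerMs - fasterMs) * rows;
    }

    public static void main(String[] args) {
        double eqMs = 0.162;   // total runtime reported for "id = 3774887"
        double inMs = 0.452;   // total runtime reported for "id IN (3774887)"
        long rows = 386_660L;  // size of the test data set

        System.out.printf("ratio: %.2f%n", ratio(inMs, eqMs));
        System.out.printf("extra over all rows: %.0f ms%n",
                extraMillis(inMs, eqMs, rows));
    }
}
```

Under that assumption the IN form is about 2.8 times slower per statement, and the 0.29 ms difference would add up to roughly 112 seconds across 386,660 rows, so it is not negligible at this scale if it holds up across repeated runs.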
Edit:
I set the batch size to 50,000, and Hibernate didn't like that idea:
java.lang.StackOverflowError
at org.hibernate.hql.ast.util.NodeTraverser.visitDepthFirst(NodeTraverser.java:40)
at org.hibernate.hql.ast.util.NodeTraverser.visitDepthFirst(NodeTraverser.java:41)
at org.hibernate.hql.ast.util.NodeTraverser.visitDepthFirst(NodeTraverser.java:42)
....
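The trace above shows the overflow happening inside Hibernate's HQL AST traversal, presumably because a very long IN list produces a very deep tree. A common way to stay under whatever limit applies is to split the IDs into fixed-size chunks and issue one delete per chunk. The helper below is a minimal sketch of that splitting step only; IdChunker and chunk are hypothetical names, and the chunk size that avoids the error would have to be found by experiment.

```java
import java.util.ArrayList;
import java.util.List;

public class IdChunker {
    // Splits ids into consecutive sublists of at most chunkSize elements.
    static List<List<Long>> chunk(List<Long> ids, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<Long>> chunks = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, ids.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(ids.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long i = 1; i <= 25; i++) {
            ids.add(i);
        }
        for (List<Long> part : chunk(ids, 10)) {
            System.out.println(part.size() + " ids, first = " + part.get(0));
        }
    }
}
```

Each resulting chunk could then be passed to a method like batchDeletion from earlier in the question.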