1

我有一个执行器服务,它应该每分钟将一些东西写入磁盘。

它是这样安排的:

    scheduledCacheDump = new ScheduledThreadPoolExecutor(1);
    scheduledCacheDump.scheduleAtFixedRate(this::saveCachedRecords,
                                           60,
                                           60,
                                           TimeUnit.SECONDS
    );

该任务使用由主线程填充的共享列表,因此它在该列表上同步:

   private void saveCachedRecords() {
        LOG.info(String.format("Scheduled record dump to disk. We have %d records to save.", recordCache.size()));
        synchronized (recordCache) {
            Iterator<Record> iterator = recordCache.iterator();
            while (iterator.hasNext()) {
               // save record to disk
               iterator.remove();
            }
        }
    }

我的清单是这样声明的:

private final List<Record> recordCache = new ArrayList<>();

主线程批量接收数据,因此每隔一秒左右,它会接收到缓存在列表中的 30 条记录。其余时间它在套接字上等待。

我不明白的是,从日志中,我的计划任务通常在间隔超过 1 分钟的时间内触发:

sept. 16 09:30:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 09:31:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 09:32:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 09:33:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 09:34:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 09:35:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 09:42:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 09:43:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 09:44:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 09:45:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 09:46:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 09:55:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 09:56:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 09:57:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 09:58:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 09:59:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 10:04:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 10:05:43 Scheduled record dump to disk. We have 27 records to save. sept. 16 10:06:43 Scheduled record dump to disk. We have 27 records to save.

看这个:

  • 九月 16 09:59:43 预定的记录转储到磁盘。我们有 27 条记录要保存。
  • 九月 16 10:04:43 预定的记录转储到磁盘。我们有 27 条记录要保存。

=> 5 分钟

甚至:

  • 九月 16 09:46:43预定的记录转储到磁盘。我们有 27 条记录要保存。
  • 九月 16 09:55:43预定的记录转储到磁盘。我们有 27 条记录要保存。

=> 9 分钟

我的日志在synchronized()范围内,所以我不知道任务是否真的按时安排并在锁上等待10分钟,或者它是否只是一个真正的调度问题。我会将它移出它,但通常我无法理解线程如何在大约每秒释放的锁上保持阻塞 10 分钟。

我该如何调查这个?

有关信息:它运行的机器是 KVM 机器,这可能是一个因素吗?

4

1 回答 1

1

天啊..

这根本不是 Java 的错。这是systemd的错。

我的进程作为 systemd 服务运行,所以我提取的日志来自 systemd-journald。猜猜看,systemd 中有一个速率限制。当调度程序被触发时,我的守护进程会命中它,所以我得到了很多这样的行:

Suppressed 570 messages from /system.slice/xxx.service Suppressed 769 messages from /system.slice/xxx.service Suppressed 745 messages from /system.slice/xxx.service Suppressed 729 messages from /system.slice/xxx.service Suppressed 717 messages from /system.slice/xxx.service Suppressed 95 messages from /system.slice/xxx.service Suppressed 543 messages from /system.slice/xxx.service

所以..是的,我删除了 journald 的限制,现在我每分钟都有我的踪迹。

解决方案是:

  • 编辑/etc/systemd/journald.conf
  • 添加行:RateLimitInterval=0在最后
  • 执行systemctl restart systemd-journald
  • 执行systemctl restart myservice

现在我的一切都很顺利。

我会更新标题以供将来参考:)

于 2016-09-16T09:30:01.437 回答