
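I run a database cluster that depends on accurate time. Unfortunately, my VM host sometimes migrates a VM running one of these DB nodes to another host, after which the clock is off by a second or more. The DB node then shuts down and is restarted by systemd.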

My systemd unit file contains the following:

ExecStartPre=-+/usr/bin/chronyc -a makestep
ExecStart=/usr/local/bin/.......

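I expected this to sync the clock immediately after the database shuts down because of such a time lag. According to my logs, however, it takes about 7 minutes until the difference is recognized and fixed. On every restart my database detects the gap and shuts down again. In the end I got this chronyd log: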

Nov 16 10:25:51 dc3-sirius chronyd[164166]: System clock was stepped by 0.000020 seconds
Nov 16 10:26:07 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:26:23 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:26:39 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:26:55 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:27:11 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:27:27 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:27:43 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:27:59 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:28:15 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:28:31 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:28:47 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:28:59 dc3-sirius chronyd[164166]: Source 81.169.199.94 replaced with 212.71.244.243
Nov 16 10:29:03 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:29:19 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:29:32 dc3-sirius chronyd[164166]: Selected source 109.230.227.90
Nov 16 10:29:35 dc3-sirius chronyd[164166]: System clock was stepped by 0.003850 seconds
Nov 16 10:29:51 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:30:07 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:30:23 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:30:39 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:30:55 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:31:11 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:31:27 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:31:43 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:31:59 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:32:13 dc3-sirius chronyd[164166]: Can't synchronise: no majority
Nov 16 10:32:15 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:32:31 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:32:33 dc3-sirius chronyd[164166]: Selected source 109.230.227.90
Nov 16 10:32:33 dc3-sirius chronyd[164166]: System clock wrong by 1.101260 seconds, adjustment started
Nov 16 10:32:48 dc3-sirius chronyd[164166]: System clock was stepped by 1.003151 seconds
Nov 16 10:33:04 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:33:21 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:33:37 dc3-sirius chronyd[164166]: System clock was stepped by -0.000000 seconds
Nov 16 10:33:51 dc3-sirius chronyd[164166]: Selected source 162.159.200.123
Nov 16 10:33:53 dc3-sirius chronyd[164166]: System clock was stepped by 0.409613 seconds

As you can see, it takes roughly 7 minutes before the clock is actually corrected.

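My database detected the problem at 10:25:51. From then on, the command above ran before every database restart to resync the clock, but it took until 10:32:33 and 10:33:53 for the clock to actually be fixed.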
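To put numbers on the delay, the log can be scanned for the first step that actually moved the clock. The following is a minimal Python sketch, not part of the original setup: the function name step_delays, the 0.1-second threshold, and the placeholder year (syslog timestamps carry none) are assumptions made for illustration.

```python
import re
from datetime import datetime

# Matches syslog-style chronyd lines of the form
# "<Mon> <day> <HH:MM:SS> <host> chronyd[<pid>]: System clock was stepped by <x> seconds"
STEP_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+chronyd\[\d+\]:\s+"
    r"System clock was stepped by (?P<offset>-?\d+\.\d+) seconds"
)


def step_delays(lines, threshold=0.1, year=2000):
    """Return (seconds_since_first_step_line, offset) for every step whose
    magnitude is at least `threshold` seconds.

    `year` is an arbitrary placeholder: it only makes the timestamps parseable
    and does not affect differences within the same day. Month names are
    parsed with %b, so an English/C locale is assumed.
    """
    first = None
    result = []
    for line in lines:
        match = STEP_RE.match(line.strip())
        if match is None:
            continue  # not a "stepped by" line (e.g. "Selected source ...")
        ts = datetime.strptime(f"{year} {match.group('ts')}", "%Y %b %d %H:%M:%S")
        if first is None:
            first = ts  # first makestep attempt, used as the reference point
        offset = float(match.group("offset"))
        if abs(offset) >= threshold:
            result.append((int((ts - first).total_seconds()), offset))
    return result
```

Fed the chronyd lines quoted above, this yields 417 seconds for the 1.003151-second step and 482 seconds for the 0.409613-second step, counted from the first makestep attempt at 10:25:51.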

Any idea how to force the clock to sync immediately instead of minutes later?


1 Answer


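I finally found a solution that keeps chrony and forces an immediate time sync when a time lag occurs (as detected by the DB node). The solution is to restart the chronyd service, which simulates a reboot of the system.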

I changed the database's systemd unit file to look like this:

ExecStartPre=-+systemctl restart chronyd
ExecStartPre=/bin/sleep 5
ExecStart=/usr/local/bin/cockroach start ...

In /etc/chrony.conf, I added the following lines:

initstepslew 0.5 pool.ntp.org
makestep 0.5 -1

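This forces chronyd to resync the time when it is restarted if the offset is larger than 0.5 seconds. The -1 in makestep removes the limit on how many clock updates are allowed to step the clock.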

With this, my system finally resyncs right away, and the database node restarts immediately afterwards.
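The fixed 5-second sleep assumes chronyd has finished its initial correction by then. As a rough alternative, a pre-start check could read the offset that chronyd itself reports through chronyc tracking. The sketch below is hypothetical and not part of the setup described here: the helper names parse_system_offset and clock_within are made up for illustration, and the reported value only reflects chronyd's most recent measurements, so it may still look small right after a migration until new samples arrive.

```python
import re
import subprocess

# "System time" line as printed by `chronyc tracking`, e.g.
# "System time     : 0.000006523 seconds slow of NTP time"
SYSTEM_TIME_RE = re.compile(
    r"^System time\s*:\s*(?P<value>\d+\.\d+) seconds (?P<direction>fast|slow) of NTP time",
    re.MULTILINE,
)


def parse_system_offset(tracking_output):
    """Return the signed offset in seconds (positive = local clock ahead)."""
    match = SYSTEM_TIME_RE.search(tracking_output)
    if match is None:
        raise ValueError("no 'System time' line found in chronyc tracking output")
    value = float(match.group("value"))
    return value if match.group("direction") == "fast" else -value


def clock_within(max_offset, chronyc="/usr/bin/chronyc"):
    """Run `chronyc tracking` and report whether |offset| <= max_offset seconds.

    The chronyc path is the one used in the question; adjust it if needed.
    """
    output = subprocess.run(
        [chronyc, "tracking"], check=True, capture_output=True, text=True
    ).stdout
    return abs(parse_system_offset(output)) <= max_offset
```

A wrapper script could call clock_within(0.5) in a short retry loop and exit non-zero if the clock never settles. chronyc also provides a waitsync command that serves a similar purpose.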

More information on these options is available in the chrony.conf documentation.

Answered 2022-03-01T13:12:03.217