44

I'm trying to restart the server and then wait for it to come back, using this:

- name: Restart server
  shell: reboot

- name: Wait for server to restart
  wait_for:
    port=22
    delay=1
    timeout=300

But I get this error:

TASK: [iptables | Wait for server to restart] ********************************* 
fatal: [example.com] => failed to transfer file to /root/.ansible/tmp/ansible-tmp-1401138291.69-222045017562709/wait_for:
sftp> put /tmp/tmpApPR8k /root/.ansible/tmp/ansible-tmp-1401138291.69-222045017562709/wait_for

Connected to example.com.
Connection closed

11 Answers

61

Ansible >= 2.7 (released in Oct 2018)

Use the built-in reboot module:

- name: Wait for server to restart
  reboot:
    reboot_timeout: 3600
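
As a rough sketch of how the reboot module's other documented options might be combined (the values below are illustrative assumptions, not settings taken from this answer):

```yaml
- name: Reboot and wait for the server to come back
  reboot:
    msg: "Reboot triggered by Ansible"   # message shown to logged-in users
    pre_reboot_delay: 0                  # seconds to wait before rebooting
    post_reboot_delay: 30                # seconds to wait after the reboot command succeeds before checking the host
    reboot_timeout: 600                  # maximum seconds to wait for the host to respond again
    test_command: whoami                 # command that must succeed on the rebooted host
  become: true
```

The module handles both the restart and the wait, so no separate wait_for task should be needed in this case.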

Ansible < 2.7

Restart as a task

- name: restart server
  shell: 'sleep 1 && shutdown -r now "Reboot triggered by Ansible" && sleep 1'
  async: 1
  poll: 0
  become: true

This runs the shell command as an asynchronous task, so Ansible does not wait for the command to finish. Normally the async parameter sets the maximum run time for the task, but because poll is set to 0, Ansible never polls to check whether the command has finished, which makes the command "fire and forget". The sleeps before and after shutdown prevent the SSH connection from breaking during the restart while Ansible is still connected to the remote host.

Wait as a task

You could just use:

- name: Wait for server to restart
  local_action:
    module: wait_for
      host={{ inventory_hostname }}
      port=22
      delay=10
  become: false

..but you may prefer to use the {{ ansible_ssh_host }} variable as the host and/or {{ ansible_ssh_port }} as the port if you use entries like:

hostname         ansible_ssh_host=some.other.name.com ansible_ssh_port=2222 

..in your inventory (Ansible hosts file).

This will run the wait_for task on the machine running Ansible. The task waits for port 22 to open on your remote host, starting after a 10-second delay.
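
A minimal sketch of the variant described above, assuming the inventory may or may not define ansible_ssh_host and ansible_ssh_port (newer Ansible versions name these ansible_host and ansible_port). The default filters and the timeout value are assumptions added for illustration:

```yaml
- name: Wait for server to restart
  wait_for:
    # Fall back to the inventory name and port 22 when no override is set
    host: "{{ ansible_ssh_host | default(inventory_hostname) }}"
    port: "{{ ansible_ssh_port | default(22) }}"
    delay: 10
    timeout: 300
  delegate_to: localhost   # run the check from the machine running Ansible
  become: false
```

Here delegate_to: localhost plays the same role as local_action, just written in the structured YAML form.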

Restart and wait as handlers

However, I suggest using both of these as handlers, not tasks.

There are two main reasons to do this:

  • code reuse - you can use a handler for many tasks. Example: trigger a server restart after changing the timezone and after changing the kernel,

  • trigger only once - if several tasks notify the same handler and more than one of them makes a change, the handler still runs only once. Example: if you have an httpd restart handler attached to an httpd config change and an SSL certificate update, and both the config and the certificate change, httpd is restarted only once.

Read more about handlers in the Ansible documentation.

Restarting and waiting for the restart as handlers:

  handlers:

    - name: Restart server
      shell: 'sleep 1 && shutdown -r now "Reboot triggered by Ansible" && sleep 1'
      async: 1
      poll: 0
      ignore_errors: true
      become: true

    - name: Wait for server to restart
      local_action:
        module: wait_for
          host={{ inventory_hostname }}
          port=22
          delay=10
      become: false

..and notify them from your task in sequence, like this, here paired with the server reboot handler:

  tasks:
    - name: Set hostname
      hostname: name=somename
      notify:
        - Restart server
        - Wait for server to restart

Note that handlers are run in the order they are defined, not the order they are listed in notify!
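
By default, notified handlers run at the end of the play. If later tasks need the server to be back first, one option is to flush the handlers explicitly. A hedged sketch, assuming the two handlers defined above; the final ping task is a hypothetical placeholder for whatever comes next:

```yaml
  tasks:
    - name: Set hostname
      hostname: name=somename
      notify:
        - Restart server
        - Wait for server to restart

    # Run any notified handlers now instead of at the end of the play
    - meta: flush_handlers

    # Hypothetical follow-up task that needs the rebooted server
    - name: Continue once the server is back
      ping:
```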

answered 2015-01-21T15:24:11.960
37

You should change the wait_for task to run as a local_action and specify the host you are waiting for. For example:

- name: Wait for server to restart
  local_action:
    module: wait_for
      host=192.168.50.4
      port=22
      delay=1
      timeout=300
answered 2014-05-27T11:35:28.847
10

The most reliable approach I've found with 1.9.4 is the following (this is the updated version; the original is at the bottom):

- name: Example ansible play that requires reboot
  sudo: yes
  gather_facts: no
  hosts:
    - myhosts
  tasks:
    - name: example task that requires reboot
      yum: name=* state=latest
      notify: reboot sequence
  handlers:
    - name: reboot sequence
      changed_when: "true"
      debug: msg='trigger machine reboot sequence'
      notify:
        - get current time
        - reboot system
        - waiting for server to come back
        - verify a reboot was actually initiated
    - name: get current time
      command: /bin/date +%s
      register: before_reboot
      sudo: false
    - name: reboot system
      shell: sleep 2 && shutdown -r now "Ansible package updates triggered"
      async: 1
      poll: 0
      ignore_errors: true
    - name: waiting for server to come back
      local_action: wait_for host={{ inventory_hostname }} state=started delay=30 timeout=220
      sudo: false
    - name: verify a reboot was actually initiated
      # machine should have started after it has been rebooted
      shell: (( `date +%s` - `awk -F . '{print $1}' /proc/uptime` > {{ before_reboot.stdout }} ))
      sudo: false

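Note the async option. 1.8 and 2.0 may accept 0, but 1.9 wants it to be 1. The above also checks whether the machine was actually rebooted. This is useful because I once had a bug that caused the reboot to fail with no indication of the failure.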
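
As an alternative way to verify the reboot (a different technique from the timestamp comparison above, and assuming a Linux host that exposes /proc/sys/kernel/random/boot_id), the kernel's boot ID can be compared before and after. The registered variable names are made up for this sketch:

```yaml
- name: Record boot ID before reboot
  command: cat /proc/sys/kernel/random/boot_id
  register: boot_id_before
  changed_when: false

# ... the reboot and wait tasks go here ...

- name: Record boot ID after reboot
  command: cat /proc/sys/kernel/random/boot_id
  register: boot_id_after
  changed_when: false

- name: Fail if the machine did not actually reboot
  assert:
    that:
      - boot_id_before.stdout != boot_id_after.stdout
```

This avoids shell arithmetic, so it does not depend on which shell the remote host uses.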

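The biggest problem is waiting for the machine to come up. This version just sits there for 330 seconds and never tries to access the host earlier. Some other answers suggest using port 22. That is fine if both of these are true:

  • you have direct access to the machines
  • your machine is accessible immediately after port 22 is open

These are not always true, so I decided to waste 5 minutes of compute time. I hope Ansible will extend the wait_for module to actually check host state to avoid wasting time.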

By the way, the answer suggesting handlers is nice. +1 for handlers from me (I updated my answer to use handlers).

Here is the original version, but it is not as good and not as reliable:

- name: Reboot
  sudo: yes
  gather_facts: no
  hosts:
    - OSEv3:children
  tasks:
    - name: get current uptime
      shell: cat /proc/uptime | awk -F . '{print $1}'
      register: uptime
      sudo: false
    - name: reboot system
      shell: sleep 2 && shutdown -r now "Ansible package updates triggered"
      async: 1
      poll: 0
      ignore_errors: true
    - name: waiting for server to come back
      local_action: wait_for host={{ inventory_hostname }} state=started delay=30 timeout=300
      sudo: false
    - name: verify a reboot was actually initiated
      # uptime after reboot should be smaller than before reboot
      shell: (( `cat /proc/uptime | awk -F . '{print $1}'` < {{ uptime.stdout }} ))
      sudo: false
answered 2016-01-25T19:35:52.813
8

2018 Update

As of 2.3, Ansible ships with the wait_for_connection module, which can be used for exactly this purpose.

#
## Reboot
#

- name: (reboot) Reboot triggered
  command: /sbin/shutdown -r +1 "Ansible-triggered Reboot"
  async: 0
  poll: 0

- name: (reboot) Wait for server to restart
  wait_for_connection:
    delay: 75

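shutdown -r +1 prevents a return code of 1 from being returned and failing the task. The shutdown runs as an async task, so we have to delay the wait_for_connection task by at least 60 seconds. 75 gives us a buffer for those snowflake cases.

wait_for_connection - Waits until remote system is reachable/usable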
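
For reference, a hedged sketch of the same wait task with the module's other documented options spelled out. The delay of 75 comes from the answer above; the remaining values and the final ping task are assumptions for illustration:

```yaml
- name: (reboot) Wait for server to restart
  wait_for_connection:
    delay: 75            # seconds to wait before the first check
    sleep: 5             # seconds between checks
    connect_timeout: 5   # seconds to wait for each connection attempt
    timeout: 600         # give up after this many seconds

# Hypothetical follow-up task confirming the host is usable again
- name: (reboot) Confirm the host responds
  ping:
```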

answered 2018-02-08T17:58:07.940
6

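I'd like to comment on Shahar's post: rather than using a hard-coded host address, it is better to use a variable that refers to the host Ansible is currently configuring, {{ inventory_hostname }}, so his code would look like this: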

- name: Wait for server to restart
  local_action:
    module: wait_for
     host={{ inventory_hostname }}
     port=22
     delay=1
     timeout=300
answered 2014-10-25T06:30:29.500
5

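For newer versions of Ansible (1.9.1 in my case), setting the poll and async parameters to 0 is sometimes not enough (this may depend on which distribution Ansible is set up on?). As explained in https://github.com/ansible/ansible/issues/10616, one workaround is: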

- name: Reboot
  shell: sleep 2 && shutdown -r now "Ansible updates triggered"
  async: 1
  poll: 0
  ignore_errors: true

Then wait for the reboot to complete, as explained in many answers on this page.

answered 2015-07-26T21:29:53.423
4

Through trial and error plus a lot of reading, this is what ultimately worked for me with Ansible 2.0:

$ ansible --version
ansible 2.0.0 (devel 974b69d236) last updated 2015/09/01 13:37:26 (GMT -400)
  lib/ansible/modules/core: (detached HEAD bbcfb1092a) last updated 2015/09/01 13:37:29 (GMT -400)
  lib/ansible/modules/extras: (detached HEAD b8803306d1) last updated 2015/09/01 13:37:29 (GMT -400)
  config file = /Users/sammingolelli/projects/git_repos/devops/ansible/playbooks/test-2/ansible.cfg
  configured module search path = None

My solution for disabling SELinux and rebooting a node when needed:

---
- name: disable SELinux
  selinux: state=disabled
  register: st

- name: reboot if SELinux changed
  shell: shutdown -r now "Ansible updates triggered"
  async: 0
  poll: 0
  ignore_errors: true
  when: st.changed

- name: waiting for server to reboot
  wait_for: host="{{ ansible_ssh_host | default(inventory_hostname) }}" port={{ ansible_ssh_port | default(22) }} search_regex=OpenSSH delay=30 timeout=120
  connection: local
  sudo: false
  when: st.changed

# vim:ft=ansible:
answered 2015-09-03T18:51:53.457
1
- wait_for:
    port: 22
    host: "{{ inventory_hostname }}"
  delegate_to: 127.0.0.1
answered 2015-04-10T04:53:32.573
0

I haven't seen this mentioned much, but a recent change (https://github.com/ansible/ansible/pull/43857) added the "ignore_unreachable" keyword. This allows you to do something like this:

- name: restart server
  shell: reboot
  ignore_unreachable: true

- name: wait for server to come back
  wait_for_connection: 
      timeout: 120

- name: the next action
  ...
answered 2019-01-14T17:19:53.503
0

I created a reboot_server Ansible role that can be called dynamically from other roles:

- name: Reboot server if needed
  include_role:
    name: reboot_server
  vars:
    reboot_force: false

The role content is:

- name: Check if server restart is necessary
  stat:
    path: /var/run/reboot-required
  register: reboot_required

- name: Debug reboot_required
  debug: var=reboot_required

- name: Restart if it is needed
  shell: |
    sleep 2 && /sbin/shutdown -r now "Reboot triggered by Ansible"
  async: 1
  poll: 0
  ignore_errors: true
  when: reboot_required.stat.exists == true
  register: reboot
  become: true

- name: Force Restart
  shell: |
    sleep 2 && /sbin/shutdown -r now "Reboot triggered by Ansible"
  async: 1
  poll: 0
  ignore_errors: true
  when: reboot_force|default(false)|bool
  register: forced_reboot
  become: true

# # Debug reboot execution
# - name: Debug reboot var
#   debug: var=reboot

# - name: Debug forced_reboot var
#   debug: var=forced_reboot

# Don't assume the inventory_hostname is resolvable and delay 10 seconds at start
- name: Wait 300 seconds for port 22 to become open and contain "OpenSSH"
  wait_for:
    port: 22
    host: '{{ (ansible_ssh_host|default(ansible_host))|default(inventory_hostname) }}'
    search_regex: OpenSSH
    delay: 10
  connection: local
  when: reboot.changed or forced_reboot.changed

This was originally designed to work with Ubuntu.

answered 2018-12-04T14:28:14.283
0

If you don't yet have DNS set up for the remote server, you can pass the IP address instead of a variable hostname:

- name: Restart server
  command: shutdown -r now

- name: Wait for server to restart successfully
  local_action:
    module: wait_for
      host={{ ansible_default_ipv4.address }}
      port=22
      delay=1
      timeout=120

These are the two tasks I added to the end of my ansible-swap playbook (which installs 4GB of swap on new Digital Ocean droplets).

answered 2016-05-21T17:15:05.943