How to wait for server restart using Ansible?

Posted 2019-01-10 08:08

I'm trying to restart the server and then wait, using this:

- name: Restart server
  shell: reboot

- name: Wait for server to restart
  wait_for:
    port=22
    delay=1
    timeout=300

But I get this error:

TASK: [iptables | Wait for server to restart] ********************************* 
fatal: [example.com] => failed to transfer file to /root/.ansible/tmp/ansible-tmp-1401138291.69-222045017562709/wait_for:
sftp> put /tmp/tmpApPR8k /root/.ansible/tmp/ansible-tmp-1401138291.69-222045017562709/wait_for

Connected to example.com.
Connection closed

10 answers
狗以群分
#2 · 2019-01-10 08:37

2018 Update

As of 2.3, Ansible now ships with the wait_for_connection module, which can be used for exactly this purpose.

#
## Reboot
#

- name: (reboot) Reboot triggered
  command: /sbin/shutdown -r +1 "Ansible-triggered Reboot"
  async: 0
  poll: 0

- name: (reboot) Wait for server to restart
  wait_for_connection:
    delay: 75

Scheduling the reboot with shutdown -r +1 (one minute from now) keeps the command from returning a return code of 1, which would make Ansible fail the task. Because the shutdown only fires a minute after the task returns, the wait_for_connection task has to be delayed by at least 60 seconds; 75 gives a buffer for slower hosts.

wait_for_connection - Waits until remote system is reachable/usable
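
A minimal sketch of the same wait with an explicit upper bound, followed by a trivial command to confirm the host is usable again. The sleep and timeout values and the uptime check are illustrative assumptions, not part of the answer above; delay, sleep and timeout are documented wait_for_connection parameters.

- name: (reboot) Wait for server to restart
  wait_for_connection:
    delay: 75      # do not probe before the scheduled shutdown has fired
    sleep: 5       # assumed polling interval between connection attempts
    timeout: 300   # assumed upper bound; the task fails if the host is not back by then

- name: (reboot) Confirm the host answers after the reboot
  command: uptime
  changed_when: false

If the host needs longer than the timeout to come back, the wait task fails instead of hanging, which makes a stuck reboot visible in the play output.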

放荡不羁爱自由
#3 · 2019-01-10 08:37

Through trial and error plus a lot of reading, this is what ultimately worked for me with Ansible 2.0:

$ ansible --version
ansible 2.0.0 (devel 974b69d236) last updated 2015/09/01 13:37:26 (GMT -400)
  lib/ansible/modules/core: (detached HEAD bbcfb1092a) last updated 2015/09/01 13:37:29 (GMT -400)
  lib/ansible/modules/extras: (detached HEAD b8803306d1) last updated 2015/09/01 13:37:29 (GMT -400)
  config file = /Users/sammingolelli/projects/git_repos/devops/ansible/playbooks/test-2/ansible.cfg
  configured module search path = None

My solution for disabling SELinux and rebooting a node when needed:

---
- name: disable SELinux
  selinux: state=disabled
  register: st

- name: reboot if SELinux changed
  shell: shutdown -r now "Ansible updates triggered"
  async: 0
  poll: 0
  ignore_errors: true
  when: st.changed

- name: waiting for server to reboot
  wait_for: host="{{ ansible_ssh_host | default(inventory_hostname) }}" port={{ ansible_ssh_port | default(22) }} search_regex=OpenSSH delay=30 timeout=120
  connection: local
  sudo: false
  when: st.changed

# vim:ft=ansible:
Root(大扎)
#4 · 2019-01-10 08:40

The most reliable approach I've found with 1.9.4 is this (updated; the original version is at the bottom):

- name: Example ansible play that requires reboot
  sudo: yes
  gather_facts: no
  hosts:
    - myhosts
  tasks:
    - name: example task that requires reboot
      yum: name=* state=latest
      notify: reboot sequence
  handlers:
    - name: reboot sequence
      changed_when: "true"
      debug: msg='trigger machine reboot sequence'
      notify:
        - get current time
        - reboot system
        - waiting for server to come back
        - verify a reboot was actually initiated
    - name: get current time
      command: /bin/date +%s
      register: before_reboot
      sudo: false
    - name: reboot system
      shell: sleep 2 && shutdown -r now "Ansible package updates triggered"
      async: 1
      poll: 0
      ignore_errors: true
    - name: waiting for server to come back
      local_action: wait_for host={{ inventory_hostname }} state=started delay=30 timeout=220
      sudo: false
    - name: verify a reboot was actually initiated
      # machine should have started after it has been rebooted
      shell: (( `date +%s` - `awk -F . '{print $1}' /proc/uptime` > {{ before_reboot.stdout }} ))
      sudo: false

Note the async option: 1.8 and 2.0 may live with 0, but 1.9 wants it to be 1. The handlers above also check that the machine has actually been rebooted. This is useful because I once had a typo that made the reboot fail, with no indication of the failure.
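
The reboot check works by computing the boot time (current time minus uptime) and requiring it to be later than the timestamp saved before the reboot. A sketch of the same comparison split into two steps, which may be easier to read; the boot_time variable name is made up for illustration, and before_reboot is the value registered by the "get current time" handler.

- name: compute boot time as seconds since the epoch
  # current time minus whole seconds of uptime
  shell: echo $(( $(date +%s) - $(cut -d. -f1 /proc/uptime) ))
  register: boot_time
  sudo: false

- name: verify a reboot was actually initiated
  # the machine must have booted after the timestamp taken before the reboot
  assert:
    that:
      - boot_time.stdout|int > before_reboot.stdout|int

For example, if before_reboot is 1000, the current time is 1300 and the uptime is 200 seconds, the boot time is 1100, which is greater than 1000, so the check passes. Without a reboot the boot time would predate 1000 and the assert would fail.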

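The big issue is waiting for the machine to come back up. This version passes no port to wait_for, so it just sleeps for the whole delay plus timeout (330 seconds in the original version below) and never tries to reach the host earlier. Some other answers suggest waiting on port 22. That is good if both of these are true: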

  • you have direct access to the machines
  • your machine is accessible immediately after port 22 is open

These are not always true, so I decided to waste about 5 minutes of compute time. I hope Ansible extends the wait_for module to actually check host state and avoid the wasted time.

By the way, the answer suggesting handlers is nice. +1 for handlers from me (and I updated this answer to use handlers).

Here's the original version, but it is not as good and not as reliable:

- name: Reboot
  sudo: yes
  gather_facts: no
  hosts:
    - OSEv3:children
  tasks:
    - name: get current uptime
      shell: cat /proc/uptime | awk -F . '{print $1}'
      register: uptime
      sudo: false
    - name: reboot system
      shell: sleep 2 && shutdown -r now "Ansible package updates triggered"
      async: 1
      poll: 0
      ignore_errors: true
    - name: waiting for server to come back
      local_action: wait_for host={{ inventory_hostname }} state=started delay=30 timeout=300
      sudo: false
    - name: verify a reboot was actually initiated
      # uptime after reboot should be smaller than before reboot
      shell: (( `cat /proc/uptime | awk -F . '{print $1}'` < {{ uptime.stdout }} ))
      sudo: false
男人必须洒脱
#5 · 2019-01-10 08:41

I've created a reboot_server Ansible role that can be called dynamically from other roles with:

- name: Reboot server if needed
  include_role:
    name: reboot_server
  vars:
    reboot_force: false

The role content is:

- name: Check if server restart is necessary
  stat:
    path: /var/run/reboot-required
  register: reboot_required

- name: Debug reboot_required
  debug: var=reboot_required

- name: Restart if it is needed
  shell: |
    sleep 2 && /sbin/shutdown -r now "Reboot triggered by Ansible"
  async: 1
  poll: 0
  ignore_errors: true
  when: reboot_required.stat.exists == true
  register: reboot
  become: true

- name: Force Restart
  shell: |
    sleep 2 && /sbin/shutdown -r now "Reboot triggered by Ansible"
  async: 1
  poll: 0
  ignore_errors: true
  when: reboot_force|default(false)|bool
  register: forced_reboot
  become: true

# # Debug reboot execution
# - name: Debug reboot var
#   debug: var=reboot

# - name: Debug forced_reboot var
#   debug: var=forced_reboot

# Don't assume the inventory_hostname is resolvable and delay 10 seconds at start
- name: Wait 300 seconds for port 22 to become open and contain "OpenSSH"
  wait_for:
    port: 22
    host: '{{ (ansible_ssh_host|default(ansible_host))|default(inventory_hostname) }}'
    search_regex: OpenSSH
    delay: 10
  connection: local
  when: reboot.changed or forced_reboot.changed

This was originally designed to work with Ubuntu OS.
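
The /var/run/reboot-required file is a Debian/Ubuntu convention. A rough sketch of how the first two tasks might be adapted for RHEL/CentOS, assuming the yum-utils package (which provides needs-restarting) is installed; this is an illustration, not part of the role above. needs-restarting -r exits with 1 when a full reboot is required and 0 when it is not.

- name: Check whether a reboot is needed (RHEL/CentOS; assumes yum-utils is installed)
  command: needs-restarting -r
  register: needs_restarting
  changed_when: false
  # rc 0 = no reboot needed, rc 1 = reboot needed; anything else is a real error
  failed_when: needs_restarting.rc > 1

- name: Restart if it is needed
  shell: |
    sleep 2 && /sbin/shutdown -r now "Reboot triggered by Ansible"
  async: 1
  poll: 0
  ignore_errors: true
  when: needs_restarting.rc == 1
  register: reboot
  become: true

Because the second task still registers reboot, the final wait_for task in the role would work unchanged.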

Lonely孤独者°
#6 · 2019-01-10 08:47

In case you don't have DNS set up for the remote server yet, you can pass the IP address instead of a hostname variable:

- name: Restart server
  command: shutdown -r now

- name: Wait for server to restart successfully
  local_action:
    module: wait_for
      host={{ ansible_default_ipv4.address }}
      port=22
      delay=1
      timeout=120

These are the two tasks I added to the end of my ansible-swap playbook (to install 4GB of swap on new Digital Ocean droplets).

The star"
#7 · 2019-01-10 08:48
- wait_for:
    port: 22
    host: "{{ inventory_hostname }}"
  delegate_to: 127.0.0.1