I get a strange error with Ansible. The first role works fine, but when Ansible tries to execute the second one, it fails because of an SSH error.
Environment:
- OS: CentOS 7
- Ansible version: 2.2.1.0
- Python version: 2.7.5
- OpenSSH version: OpenSSH_6.6.1p1, OpenSSL 1.0.1e-fips 11 Feb 2013
Ansible command which is executed:
ansible-playbook -vvvv -i inventory/dev playbook_update_system.yml --limit "db[0]"
Playbook:
- name: "HUB Playbook | Updating system packages on {{ ansible_hostname }}"
hosts: release_first_half
roles:
- upgrade_system_package
- reboot_server
Role: upgrade_system_package:
- name: "upgrading CentOS system packages on {{ ansible_hostname }}"
shell: sudo puppet apply -e 'exec{"upgrade-package":command => "/usr/bin/yum clean all; /usr/bin/yum -y update;"}'
when: ansible_distribution == 'CentOS' and 'cassandra' not in group_names
Role: reboot_server:
- name: "reboot CentOS [{{ ansible_hostname }}] server"
shell: sudo puppet apply -e 'exec{"reboot-os":command => "/usr/sbin/reboot"}'
when: ansible_distribution == 'CentOS' and 'cassandra' not in group_names
Current behavior:
- Connection to "db1" node and execute role "upgrade system packages" => OK
- Try to connect to "db1" and execute role "reboot_server" => failed due to ssh.
Error message returned by Ansible:
fatal: [db1]: UNREACHABLE! => { "changed": false, "msg": "Failed to connect to the host via ssh: OpenSSH_6.6.1, OpenSSL 1.0.1e-fips 11 Feb 2013\r\ndebug1: Reading configuration data /USR/newtprod/.ssh/config\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 56: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug2: fd 3 setting O_NONBLOCK\r\ndebug2: mux_client_hello_exchange: master version 4\r\ndebug3: mux_client_forwards: request forwardings: 0 local, 0 remote\r\ndebug3: mux_client_request_session: entering\r\ndebug3: mux_client_request_alive: entering\r\ndebug3: mux_client_request_alive: done pid = 64994\r\ndebug3: mux_client_request_session: session request sent\r\ndebug1: mux_client_request_session: master session id: 2\r\ndebug3: mux_client_read_packet: read header failed: Broken pipe\r\ndebug2: Control master terminated unexpectedly\r\nShared connection to db1 closed.\r\n", "unreachable": true }
I don't understand, because the previous role was executed successfully on this node. Moreover, we have many playbooks that use the same inventory file, and they work fine. I tried on another node too, but got the same result.
It's a simple and pretty well-known issue: the shutdown process causes the SSH daemon to quit, which breaks the current SSH session (hence the "broken pipe" error). The server reboots properly, but the Ansible flow gets interrupted.
You need to add a delay to your shell command and run it with the async option, so that Ansible's SSH session can finish before it gets killed.
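For example, here is a minimal sketch of how the reboot task could look, assuming you keep the puppet-based reboot command; the sleep duration, async/poll values, and the wait_for timeouts are illustrative placeholders you would tune for your environment:

- name: "reboot CentOS [{{ ansible_hostname }}] server"
  # the sleep gives the SSH session a moment to close cleanly before sshd goes down
  shell: sleep 5 && sudo puppet apply -e 'exec{"reboot-os":command => "/usr/sbin/reboot"}'
  async: 10   # run the command in the background...
  poll: 0     # ...and don't wait for it to finish
  when: ansible_distribution == 'CentOS' and 'cassandra' not in group_names

- name: "wait for [{{ ansible_hostname }}] to come back up"
  local_action: wait_for host={{ inventory_hostname }} port=22 delay=30 timeout=300 state=started
  become: false

With poll: 0, Ansible fires the command and immediately moves on, so the reboot no longer tears down the connection in the middle of a task. The optional wait_for task (run on the control machine via local_action, since wait_for_connection is not available in Ansible 2.2) then blocks until SSH is reachable again before any later tasks run against the host.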