I have a bunch of servers that will need frequent patching. I am planning on using Ansible to coordinate the patching process. The key point here is that it must be "all or nothing" patching: either all servers are patched or none are.
The tasks I was considering for my playbook would be something like:
1 - Go to all servers and take an LVM snapshot.
2 - If and only if task 1 works on all servers, apply the changes.
3 - If one of the hosts fails for any reason, roll back the snapshot on ALL NODES.
The problem is that I am new to Ansible and I can't express this in a playbook. I have written this simple test playbook:

    ---
    - hosts: all
      strategy: linear
      tasks:
        - block:
            - debug: msg='Testing on {{ inventory_hostname }}...'
            - command: /home/amirsamary/activity.sh
              changed_when: false
          rescue:
            - debug: msg='Rollback of {{ inventory_hostname }}...'

        - debug: msg='I continued running tasks on {{ inventory_hostname }}...'

I have two hosts in my inventory. On the first node, activity.sh returns true; on the second node, it returns false, so node2 will always fail. The problem is that the rescue tasks run only on the failed host and not on all of them (which is what one would expect by default), and the playbook then keeps running the remaining tasks on both hosts.
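One direction that might get closer, as an untested sketch rather than a known-good solution: record the failure in a fact inside `rescue`, then let every host check whether any host in the play recorded one. The fact name `patch_failed` is made up for illustration, and the debug tasks are still placeholders for the real snapshot, patch, and rollback steps. This relies on the linear strategy, so that every host finishes the block before the check runs.

```yaml
---
- hosts: all
  strategy: linear
  tasks:
    # Hypothetical fact: start every host in the "not failed" state.
    - set_fact:
        patch_failed: false

    - block:
        - debug: msg='Testing on {{ inventory_hostname }}...'
        - command: /home/amirsamary/activity.sh
          changed_when: false
      rescue:
        # Only the failing host runs this; it just records the failure.
        - set_fact:
            patch_failed: true

    # Runs on every host still in the play if at least one host recorded a failure.
    - debug: msg='Rollback of {{ inventory_hostname }}...'
      when: ansible_play_hosts | map('extract', hostvars, 'patch_failed') | select | list | length > 0

    # Runs only if no host recorded a failure.
    - debug: msg='I continued running tasks on {{ inventory_hostname }}...'
      when: ansible_play_hosts | map('extract', hostvars, 'patch_failed') | select | list | length == 0
```

Even if this behaves as intended, it would not cover a host that becomes unreachable, since such a host never reaches `rescue` and drops out of `ansible_play_hosts`. It also repeats the same condition on every later task, which does not feel like a safe way to guarantee "all or nothing".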
I have heard a lot about how good Ansible is at orchestrating complex tasks on thousands of servers. But I can't seem to find a way to safely implement an "all or nothing" strategy with it. What am I missing?