Sometimes Ansible doesn't do what you want, and increasing verbosity doesn't help. For example, I'm now trying to start the coturn server, which comes with an init script, on a systemd OS (Debian Jessie). Ansible considers it running, but it's not. How do I look into what's happening under the hood? Which commands are executed, and what are their output and exit codes?
Debugging Ansible tasks can be almost impossible if the tasks are not your own, contrary to what the Ansible website states.
Ansible requires highly specialized programming skills because it is neither YAML nor Python; it is a messy mix of both.
The idea of using markup languages for programming has been tried before. XML was very popular in the Java community at one time. XSLT is also a fine example.
As Ansible projects grow, the complexity grows exponentially as a result. Take for example the OpenShift Ansible project, which has the following task:
I think we can all agree that this is programming in YAML. Not a very good idea. This specific snippet could fail with a message like
If you hit a message like that, you are doomed. But we have the debugger, right? Okay, let's take a look at what is going on.
How does that help? It doesn't.
The point here is that it is an incredibly bad idea to use YAML as a programming language. It is a mess. And the symptoms of the mess we are creating are everywhere.
Some additional facts. The prerequisites provisioning phase of OpenShift Ansible on Azure takes more than 50 minutes. The deploy phase takes more than 70 minutes. Every time, whether it is the first run or a subsequent one. And there is no way to limit provisioning to a single node. This limit problem was part of Ansible in 2012 and it is still part of Ansible today. This fact tells us something.
The point here is that Ansible should be used as it was intended: for simple tasks, without the YAML programming. It is fine for lots of servers, but it should not be used for complex configuration management tasks.
Ansible is not an Infrastructure as Code (IaC) tool.
If you ask how to debug Ansible issues, you are using it in a way it was not intended to be used. Don't use it as an IaC tool.
Debugging roles/playbooks
Basically, debugging Ansible automation over a big inventory across large networks is none other than debugging a distributed network application. It can be very tedious and delicate, and there are not enough user-friendly tools.
Thus I believe the answer to your question is a union of all the answers before mine, plus a small addition. So here:
- absolutely mandatory: you have to want to know what's going on, i.e. what you're automating and what you are expecting to happen. E.g. Ansible failing to detect a service with a systemd unit as running or as stopped usually means a bug in the service unit file or in the service module, so you need to 1. identify the bug, 2. report the bug to the vendor/community, 3. provide your workaround with a TODO and a link to the bug, 4. delete your workaround when the bug is fixed
- to make your code easier to debug, use modules as much as you can
- give all tasks and variables meaningful names
- use static code analysis tools like ansible-lint. This saves you from really stupid small mistakes
- utilize verbosity flags and log path
- use the debug module wisely
- "Know thy facts" - sometimes it is useful to dump target machine facts into a file and pull it to the Ansible master
- use strategy: debug - in some cases you can fall into a task debugger at error. You can then eval all the params the task is using, and decide what to do next
- the last resort would be using the Python debugger, attaching it to the local Ansible run and/or to the remote Python executing the modules. This is usually tricky: you need to allow an additional port to be open on the machine, and what if the code opening the port is the one causing the problem?
Also, sometimes it is useful to "look aside": connect to your target hosts and increase their debuggability (more verbose logging).
Of course, log collection makes it easier to track changes happening as a result of Ansible operations.
As you can see, as with any other distributed application or framework, debuggability is still not what we'd wish for.
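Two of the points above, using the debug module and dumping facts to a file, might look roughly like the following play. This is only a sketch: the host group turn_servers, the registered variable name and the /tmp path are assumptions made for illustration, and writing the dump on the control machine via delegate_to is just one possible way to "pull" the facts.

```yaml
# Hypothetical play: "turn_servers", "coturn_service" and the /tmp path are
# illustrative assumptions, not taken from the original post.
- hosts: turn_servers
  tasks:
    - name: Start coturn
      service:
        name: coturn
        state: started
      register: coturn_service

    # Printed only when ansible-playbook is run with -v or higher.
    - name: Show what the service module returned
      debug:
        var: coturn_service
        verbosity: 1

    # Writes one JSON file per host on the control machine.
    - name: Dump host variables, including gathered facts, to a local file
      copy:
        content: "{{ hostvars[inventory_hostname] | to_nice_json }}"
        dest: "/tmp/facts-{{ inventory_hostname }}.json"
      delegate_to: localhost
```

Keeping the debug task behind a verbosity level means it can stay in the playbook without cluttering normal runs.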
Filters/plugins
This is basically Python development; debug it as you would any Python app.
Modules
Depending on the technology, and complicated by the fact that you need to see both what happens locally and remotely, you had better choose a language that is easy enough to debug remotely.
Debugging modules
The most basic way is to run ansible/ansible-playbook with an increased verbosity level by adding -vvv to the execution line.
The most thorough way for modules written in Python (Linux/Unix) is to run ansible/ansible-playbook with the environment variable ANSIBLE_KEEP_REMOTE_FILES set to 1 (on the control machine). It causes Ansible to leave an exact copy of the Python scripts it executed (whether successfully or not) on the target machine.
The path to the scripts is printed in the Ansible log, and for regular tasks they are stored under the SSH user's home directory: ~/.ansible/tmp/.
The exact logic is embedded in the scripts and depends on each module. Some use Python with standard or external libraries, some call external commands.
Debugging playbooks
Similarly to debugging modules, increasing the verbosity level with the -vvv parameter causes more data to be printed to the Ansible log.
Since Ansible 2.1, a Playbook Debugger allows failed tasks to be debugged interactively: check and modify the data, then re-run the task.
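As a rough sketch of how the Playbook Debugger can be switched on for a play, the fragment below sets the debug strategy and then fails on purpose. The host group turn_servers and the undefined variable are assumptions chosen only to trigger an error.

```yaml
# Hypothetical play; "turn_servers" and "not_defined_anywhere" are assumptions
# used only to force a failure and drop into the debugger.
- hosts: turn_servers
  strategy: debug
  gather_facts: no
  tasks:
    - name: Fail on purpose by referencing an undefined variable
      ping:
        data: "{{ not_defined_anywhere }}"
```

At the debugger prompt, p task.args prints the task's arguments, redo re-runs the task, continue moves on and quit aborts the run.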
Debugging connections
Adding the -vvvv parameter to the ansible/ansible-playbook call causes the log to include debugging information for the connections.

Here's what I came up with.
Ansible sends modules to the target system and executes them there. Therefore, if you change a module locally, your changes will take effect when you run the playbook. On my machine modules are at /usr/lib/python2.7/site-packages/ansible/modules (ansible-2.1.2.0), and the service module is at core/system/service.py. Ansible modules (instances of the AnsibleModule class declared at module_utils/basic.py) have a log method, which sends messages to the systemd journal if available, or falls back to syslog. So, run journalctl -f on the target system, add debug statements (module.log(msg='test')) to the module locally, and run your playbook. You'll see the debug statements under the ansible-basic.py unit name.
Additionally, when you run ansible-playbook with -vvv, you can see some debug output in the systemd journal: at least invocation messages, and error messages if any.
One more thing: if you try to debug code that's running locally with pdb (import pdb; pdb.set_trace()), you'll most likely run into a BdbQuit exception. That's because Python closes stdin when creating the child process (the Ansible worker). The solution here is to reopen stdin before running pdb.set_trace(), as suggested here:

You could use register together with the debug module to print the return values. For example, to find out the return code of a script called "somescript.sh", register the result of the task that runs it and print it in a following debug task.
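A minimal sketch of such tasks is shown below. The script name comes from the text above; its relative path, the variable name script_result and the use of the command module (which expects the script to already exist on the target host) are assumptions.

```yaml
# "somescript.sh" is from the text above; the path and the variable name
# "script_result" are illustrative assumptions.
- name: Run the script and capture its result
  command: ./somescript.sh
  register: script_result
  ignore_errors: yes   # keep going so the debug tasks run even on a non-zero exit

- name: Print the return code
  debug:
    var: script_result.rc

- name: Print stdout and stderr
  debug:
    msg: "stdout={{ script_result.stdout }} stderr={{ script_result.stderr }}"
```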
For the full list of return values you can access in Ansible, see this page: http://docs.ansible.com/ansible/latest/common_return_values.html
There are multiple levels of debugging that you might need, but the easiest one is to set the ANSIBLE_STRATEGY=debug environment variable, which will enable the debugger on the first error.