I have a simple fabfile named env_fabfile.py:
# env_fabfile.py
# makes use of fab env variables
from fabric.api import env, run

def login():
    env.hosts = ['user@host1:1234', 'user@host2:2345']
    env.passwords = {'user@host1:1234': 'pass1', 'user@host2:2345': 'pass2'}
    env.parallel=True

def run_lsb_release():
    run('lsb_release -a')
I run it with the fab command:
fab -f env_fabfile.py login run_lsb_release
It runs perfectly (in parallel) and gives the desired output.
Next, I wanted to measure the time difference between running the same script serially and running it in parallel. To do this, I wrote the following Python script, timecal.py:
# timecal.py
# runs the fabfile once in serial and calculates the time
# then runs the same file in parallel and calculates the time
from fabric.api import env, run
import time

def login():
    print "in login"
    env.hosts = ['user@host1:1234', 'user@host2:2345']
    env.passwords = {'user@host1:1234': 'pass1', 'user@host2:2345': 'pass2'}

def parallel(status):
    print "in parallel"
    env.parallel=status

def run_lsb_release():
    print "in run"
    run('lsb_release -a')

def do_serial():
    start_time = time.time()
    parallel(False)
    login()
    run_lsb_release()
    elapsed_time = time.time() - start_time
    return elapsed_time

def do_parallel():
    start_time = time.time()
    parallel(True)
    login()
    run_lsb_release()
    elapsed_time = time.time() - start_time
    return elapsed_time

if __name__ == '__main__':
    print "Running in serial mode "
    print "Total time taken ", do_serial()
    print "Running in parallel mode"
    print "Total time taken ", do_parallel()
But when I run timecal.py as
python timecal.py
I get the following on stdout (in addition to the output of the print statements in the code):
No hosts found. Please specify (single) host string for connection:
I don't understand why. Also, how can the script be fixed so that it does what I want (as described above)?
If I try a different version of timecal.py:
from fabric.api import run, settings, env
import time

def do_parallel():
    start_time = time.time()
    env.hosts = ['user@host1:1234', 'user@host2:2345']
    env.passwords = {'user@host1:1234': 'pass1', 'user@host2:2345': 'pass2'}
    env.parallel=True
    run('lsb_release -a')
    elapsed_time = time.time() - start_time
    return elapsed_time

def do_serial():
    start_time = time.time()
    with settings(host_string='host1', port=1234, user='user', password='pass1'):
        run('lsb_release -a')
    with settings(host_string='host2', port=2345, user='user', password='pass2'):
        run('lsb_release -a')
    elapsed_time = time.time() - start_time
    return elapsed_time

if __name__ == '__main__':
    try:
        print "Running in parallel mode"
        print "Total time taken ", do_parallel()
        print "Running in serial mode "
        print "Total time taken ", do_serial()
    except Exception as e:
        print e
I get the following error:
Fatal error: Needed to prompt for the target host connection string (host: None), but input would be ambiguous in parallel mode
I don't understand why the host is None here. What is wrong with the code?
The short answer is that you shouldn't set the env.hosts value the way you are currently doing, and env.passwords is super-sketchy (possibly broken). SSH key-based access is recommended instead, especially when it leverages native SSH config files.
Here's the modified version of your timecal.py script which works as expected, and I'll call out some of the differences below.
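A minimal sketch of what such a version could look like, not necessarily the exact script referred to above. It assumes Fabric 1.x, key-based SSH access, and that host1 and host2 are aliases in ~/.ssh/config carrying the user and port for each machine. The role name lsb_hosts and the helpers timed, run_on_role and compare are made up for illustration.

```python
from __future__ import print_function

import time


def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (elapsed_seconds, result)."""
    start = time.time()
    result = func(*args, **kwargs)
    return time.time() - start, result


def run_lsb_release():
    # Fabric is imported lazily so the timing helper stays usable without it.
    from fabric.api import run
    run('lsb_release -a')


def run_on_role(role, in_parallel):
    """Run the task on every host of `role`, serially or in parallel."""
    from fabric.api import env, execute, settings

    # Assumption: 'host1' and 'host2' are aliases defined in ~/.ssh/config.
    env.use_ssh_config = True
    env.roledefs = {'lsb_hosts': ['host1', 'host2']}  # hypothetical role name

    # execute() resolves the role to hosts and sets env.host_string for each
    # one; env.parallel decides whether the hosts are handled concurrently.
    with settings(parallel=in_parallel):
        return execute(run_lsb_release, roles=[role])


def compare(role='lsb_hosts'):
    """Return a dict with the serial and parallel wall-clock times."""
    from fabric.network import disconnect_all
    try:
        serial_time, _ = timed(run_on_role, role, False)
        parallel_time, _ = timed(run_on_role, role, True)
    finally:
        disconnect_all()  # close the SSH connections Fabric has cached
    return {'serial': serial_time, 'parallel': parallel_time}
```

To use it, call compare() from the script's entry point and print the returned dict. Going through execute() rather than calling run() directly is what lets Fabric pick a target host for each invocation.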
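The main difference is using env.roledefs and the SSH config file, rather than hosts and passwords. Those values will NOT work in the parallel execution mode, due to the fact that those tasks are executed in separate processes (Fabric 1.x runs parallel tasks through multiprocessing). Note also that run() only looks at env.host_string when connecting; env.hosts is interpreted by the fab tool (or by execute()), so when the script is started with plain python, nothing turns the host list into a connection target. The docs are a little thin, but that's basically why you're having this problem.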