I'm attempting to write a loop script where each iteration runs in parallel if possible.
#!/bin/bash
for ip in $(cat ./IPs); do
ping -n -c 2 -W 1 ${ip} >> Results/${ip}.log
done
Ultimately, I'd like to place whatever I need in the loop and have it multi-process. I've attempted to work through the other examples, but just can't seem to get it to work as expected. I have parallel installed as well if that's an option.
This seems well-suited to parallel:
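Roughly, the shape of the command would be something like this (a sketch, not the exact invocation; the IPs file and Results/ directory are taken from the question):

parallel 'ping -n -c 2 -W 1 {} >> Results/{}.log' < ./IPs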
The command run by parallel can use the received line for the redirection as well as in the arguments. I've quoted the use of {} within the command - this shouldn't be necessary with an input file under your own control containing just hostnames and/or addresses, but it's a good habit to get into for when the input might contain characters significant to the shell.
Note that by default, parallel will run only one job per core (which is a good choice for compute-heavy loads). For tasks such as this, where most of the time is spent waiting on network latency, I recommend using a -j argument (e.g. -j 1000%, to run ten jobs per core).
For a simple version of this -
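a minimal sketch of what I mean (adjust to taste):

#!/bin/bash
while read -r ip; do
    ping -n -c 2 -W 1 "${ip}" >> "Results/${ip}.log" 2>&1 &
done < ./IPs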
The & on the end puts it in the background and lets the loop run the next iteration while that one is processing. As an added point, I also redirected the stderr of each to the same log (2>&1) so they wouldn't get lost if something failed. I also switched to a while read to avoid needing the cat in the for, but that's mostly a stylistic preference.
For a more load-aware version, use wait.
I made a simplistic control file that just has a letter per line -
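for example (the ctl file name here is just a placeholder):

printf '%s\n' {a..j} > ctl    # ten one-letter "jobs", one per line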
Then I declared a couple of values - a max I want it to fire at once, and a counter.
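Something like this (the names are just what I'd pick):

max=5    # how many jobs to allow in flight at once
ctr=0    # how many have been backgrounded in the current batch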
Then I typed in a read loop to iterate over the values and run a set at a time. Until it accumulates the stated max, it keeps adding processes in the background and counting them. Once it gets enough, it waits for those to finish and resets the counter before continuing with another set.
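A sketch of that loop, with a random sleep standing in for the real work:

while read -r letter; do
    { echo "start $letter $(date +%T)"; sleep $(( RANDOM % 5 + 1 )); echo "done  $letter $(date +%T)"; } &
    (( ++ctr >= max )) && { wait; ctr=0; }   # batch is full - wait for it to drain
done < ctl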
Here's the output.
And when finished, the last few are still running because I didn't go to the trouble of writing all this to an actual script with meticulous checking.
This does cause burst loads that (for tasks that don't all finish at about the same time) will dwindle till the last is done, causing spikes and lulls. With a little more finesse we could write a waitpid-style trap that fires a new job each time one finishes, to keep the load steady, but that's an exercise for another day unless someone just really wants to see it. (I did it in Perl before, and have kind of always wanted to implement it in bash just because...)
Obviously, as presented in other posts, you could just use parallel... but as an exercise, here's one way you could set up a number of process chains that read from a queue. I opted for a simple callback rather than dealing with a SIGCHLD trap, because there are a lot of little subprocs flying around... Refinements welcome if anyone cares.
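A minimal sketch of the idea, assuming a cmds queue file and a simple cursor/flock scheme to hand out lines:

#!/bin/bash
# A few self-chaining readers pull lines from a queue file and eval them.
# Each one grabs the next line when its current job ends (the "callback"),
# so there's no SIGCHLD trap to manage.
qfile=cmds                          # one queued command per line
mkdir -p logs
echo 0 > cursor                     # shared count of lines already claimed

spool() {
    local n cmd
    {
        flock 9                     # serialize cursor updates between chains
        read -r n < cursor
        echo $(( n + 1 )) > cursor
    } 9> cursor.lock
    cmd=$(sed -n "$(( n + 1 ))p" "$qfile")
    [[ -z $cmd ]] && return         # queue drained; this chain exits
    eval "$cmd" > "logs/$(( n + 1 )).log" 2>&1
    spool                           # callback: go get the next queued command
}

for chain in 1 2 3; do spool & done # three concurrent chains
wait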
Yes, there are security concerns with reading possibly dirty data and sourcing it. I wanted to keep the framework simple as an exercise. Suggestions are still welcome.
I threw together a quick command file with some complex commands built of simple crap just as examples.
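A made-up stand-in with the same flavor (the first entry backgrounds itself and kicks off b on its own):

# cmds - one queued command per line
{ echo "a: $(date +%T)"; sleep 2; echo "b: $(date +%T)"; sleep 2; } &
echo "c: $(date +%T)"; sleep 1
echo "d: $(date +%T)"; sleep 3
echo "e: $(date +%T)"; sleep 1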
Note the first one even runs itself in background - the spooler doesn't care. Job a will start b before spool can, so it will skip to c.
Some of the logs -
(skipping ahead a bit...)
It works. It isn't perfect, but it could be handy. :)