There have been some similar questions, but my problem is not "run several programs in parallel" - that can be trivially done with parallel or xargs.
I need to parallelize Bash functions.
Let's imagine code like this:
for i in "${list[@]}"
do
    for j in "${other[@]}"
    do
        # some processing in here - 20-30 lines of almost pure bash
    done
done
Some of the processing requires calls to external programs.
I'd like to run a handful (4-10) of tasks at a time, each running for a different $i. The total number of elements in $list is over 500.
I know I can put the whole for j ... done loop in an external script and just call that script in parallel, but is it possible to do this without splitting the functionality between two separate programs?
Edit: Please consider Ole's answer instead.
Instead of a separate script, you can put your code in a separate bash function. You can then export it, and run it via xargs:
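A minimal sketch of that approach, assuming the per-$i work is wrapped in a function named dowork (the name, the placeholder body, and the parallelism of 4 are illustrative):

dowork() {
    i=$1
    # the inner "for j" loop goes here - 20-30 lines of almost pure bash
    # (note: bash cannot export arrays, so ${other[@]} has to be rebuilt
    #  or passed to the function in some other way)
    echo "processing $i"
}
export -f dowork    # make the function visible to the bash started by xargs

# one element of $list per invocation, at most 4 invocations running at once
# (use NUL delimiters instead if list elements can contain whitespace)
printf '%s\n' "${list[@]}" | xargs -n 1 -P 4 bash -c 'dowork "$1"' _

The trailing _ becomes $0 of the spawned bash, so each list element arrives in the function as $1.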
sem is part of GNU Parallel and is made for this kind of situation. If you like the function better, GNU Parallel can also do the dual for loop in one go:
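Two sketches of what that could look like; the job limit of 4, the do_long_task command, and the dowork body are placeholders, not code from the original answer:

# Option 1: keep the loops and let sem throttle the jobs
for i in "${list[@]}"
do
    for j in "${other[@]}"
    do
        sem -j4 do_long_task "$i" "$j"   # at most 4 jobs run at a time
    done
done
sem --wait                               # block until the last jobs finish

# Option 2: let parallel generate every i/j combination itself
dowork() {
    echo "Starting i=$1, j=$2"
    # 20-30 lines of almost pure bash
    echo "Done i=$1, j=$2"
}
export -f dowork
parallel -j4 dowork ::: "${list[@]}" ::: "${other[@]}"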
Solution to run multi-line commands in parallel:
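A sketch of the pattern, assuming a limit of 8 concurrent jobs (the limit used in the explanation below) and a placeholder loop:

for task in "${tasks[@]}"
do
    # if 8 jobs are already running in the background, block until one exits
    # (wait -n needs bash 4.3 or newer)
    if test "$(jobs | wc -l)" -ge 8; then
        wait -n
    fi

    {
        # any multi-line bash commands can go here
        echo "working on $task"
        sleep 1
    } &
done
wait   # wait for whatever is still running at the end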
In your case:
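Applied to the nested loops from the question, it could look roughly like this (the echo is only a stand-in for the real processing):

for i in "${list[@]}"
do
    for j in "${other[@]}"
    do
        # never start a new job while 8 are already running
        if test "$(jobs | wc -l)" -ge 8; then
            wait -n
        fi

        {
            # some processing in here - 20-30 lines of almost pure bash
            echo "processing $i / $j"
        } &
    done
done
wait   # let the last batch of jobs finish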
If there are 8 bash jobs already running, wait -n will wait for at least one job to complete. If/when there are fewer jobs, it starts new ones asynchronously.

Benefits of this approach:
1. It handles multi-line commands easily: everything that is in scope in the loop ($i, $j, and any other variables) is visible to the background block without being passed around as arguments.
2. It is relatively fast, since no external tool is started for every job; the GNU Parallel man page itself notes that parallel has a noticeable startup time.
3. It only needs bash to work.

Downsides:

1. There may have been 8 jobs when we counted them but fewer by the time wait -n runs (if a job finishes in between), which can make us wait with fewer jobs than required. However, it will resume when at least one job completes, or immediately if there are 0 jobs running (wait -n exits immediately in this case).
2. If you already have some commands running asynchronously (&) within the same bash script, you'll have fewer worker processes in the loop.