I need a bash script to run some jobs in the background, three jobs at a time.
I know I can do this in the following way. For illustration, I will assume the number of jobs is 6:
./j1 &
./j2 &
./j3 &
wait
./j4 &
./j5 &
./j6 &
wait
However, with this approach, if, for example, j2 takes a lot longer to run than j1 and j3, I will be stuck with only one background job running for a long time.
The alternative (which is what I want) is that whenever one job is completed, bash should start the next job in the queue so that a rate of 3 jobs at any given time is maintained. Is it possible to write a bash script to implement this alternative, possibly using a loop? Please note that I need to run far more jobs, and I expect this alternative method to save me a lot of time.
Here is my draft of the script. I hope you can help me verify its correctness and improve it, as I'm new to scripting in bash. The ideas in this script are taken and modified from here, here, and here:
for i in $(seq 6)
do
    # wait here if the number of jobs is 3 (or more)
    while (( (( $(jobs -p | wc -l) )) >= 3 ))
    do
        sleep 5 # check again after 5 seconds
    done
    jobs -x ./j$i &
done
wait
I think this script produces the required behavior. However, I would like to hear from bash experts whether I'm doing something wrong or whether there is a better way of implementing this idea.
Thank you very much.
With GNU xargs:
printf '%s\0' j{1..6} | xargs -0 -n1 -P3 sh -c './"$1"' _
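In that one-liner, -0 makes xargs read NUL-delimited names, -n1 passes one name per invocation, and -P3 keeps up to three invocations running at once. As a minimal sketch, the same pipeline can be wrapped in a function so the limit and the job list become arguments. The name run_jobs_xargs is hypothetical, and the sketch assumes an xargs that supports -0 and -P (GNU xargs does) and that at least one job path is given.

```shell
# Hypothetical wrapper around the xargs pipeline above.
# Usage: run_jobs_xargs MAX PROGRAM...
# Assumes xargs supports -0 and -P, and that at least one PROGRAM is given.
run_jobs_xargs() {
    max=$1
    shift
    # Emit each program path NUL-terminated so unusual file names survive,
    # then let xargs run one program per invocation, up to $max in parallel.
    printf '%s\0' "$@" | xargs -0 -n1 -P"$max" sh -c 'exec "$1"' _
}
```

For the six jobs in the question this would be called as run_jobs_xargs 3 ./j1 ./j2 ./j3 ./j4 ./j5 ./j6. xargs only returns once every job has finished, so no separate wait is needed.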
With bash builtins (wait -n requires bash 4.3 or later):
max_jobs=3; cur_jobs=0
for ((i=1; i<=6; i++)); do
  # If true, wait until the next background job finishes to continue.
  ((cur_jobs >= max_jobs)) && wait -n
  # Increment the count of jobs started so far.
  ./j"$i" & ((++cur_jobs))
done
wait
Note that the approach relying on builtins has some corner cases: if multiple jobs exit at exactly the same time, a single wait -n can reap several of them, effectively consuming multiple slots. If we wanted to be more robust, we might end up with something like the following:
max_jobs=3
declare -A cur_jobs=( ) # build an associative array w/ PIDs of jobs we started
for ((i=1; i<=6; i++)); do
  if (( ${#cur_jobs[@]} >= max_jobs )); then
    wait -n # wait for at least one job to exit
    # ...and then remove any jobs that aren't running from the table
    for pid in "${!cur_jobs[@]}"; do
      # kill -0 succeeds while the process still exists, so unset only when it fails
      kill -0 "$pid" 2>/dev/null || unset "cur_jobs[$pid]"
    done
  fi
  ./j"$i" & cur_jobs[$!]=1
done
wait
...which is obviously a lot of work, and still has a minor race. Consider using xargs -P instead. :)
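If you would rather stay with builtins, one way to avoid tracking PIDs by hand is to ask the shell how many of its background jobs are still running (jobs -pr) before each launch. This is only a sketch, assuming bash 4.3 or later for wait -n. The function name run_limited and the choice to run each command string through bash -c are illustrative and not part of the code above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run each command string, at most MAX at a time.
# Usage: run_limited MAX 'command 1' 'command 2' ...
# Assumes bash 4.3+ (for wait -n).
run_limited() {
    local max_jobs=$1; shift
    local cmd
    for cmd in "$@"; do
        # Re-count the running jobs every time, so several jobs exiting
        # at once simply free several slots.
        while (( $(jobs -pr | wc -l) >= max_jobs )); do
            wait -n   # block until any one background job exits
        done
        bash -c "$cmd" &
    done
    wait   # wait for the jobs that are still running
}
```

With the jobs from the question this would be run_limited 3 ./j1 ./j2 ./j3 ./j4 ./j5 ./j6. Because the count is recomputed before every launch, a wait -n that returns after several jobs have exited does not lose any slots.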
Using GNU Parallel:
parallel -j3 ::: ./j{1..6}
Or if your shell does not do {1..6} range expansion (e.g. csh):
seq 6 | parallel -j3 ./j'{}'
If you think you cannot install GNU Parallel, please read http://oletange.blogspot.dk/2013/04/why-not-install-gnu-parallel.html and leave a comment on why you cannot install it.
Maybe this could help.
Sample use case: run 'sleep 20' 30 times, just as an example. It could be any job or another script. The control logic, inside a while loop, keeps checking whether the number of processes already running is less than the configured maximum. If it is, start another one; if not, sleep 0.5 seconds and check again.
Script output: in the snippet below, we can see 30 'sleep 20' commands running in the background, since we configured max=30.
%_Host@User> ps -ef|grep 'sleep 20'|grep -v grep|wc -l
30
%_Host@User>
Changing the number of jobs at runtime: the script has a parameter "max", which it reads from the file "max.txt" (max=$(cat max.txt)) and re-applies on each iteration of the while loop. As seen below, we changed it to 45, and now 45 'sleep 20' commands are running in the background. You can put the main script in the background and just keep changing the value inside "max.txt" to control it.
%_Host@User> cat > max.txt
45
^C
%_Host@User> ps -ef|grep 'sleep 20'|grep -v grep|wc -l
45
%_Host@User>
Script:
#!/bin/bash
#---------------------------------------------------------------------#
proc='sleep 20' # Your process or script or anything..
max=$(cat max.txt) # configure how many jobs do you want
curr=0
#---------------------------------------------------------------------#
while true
do
    curr=$(ps -ef|grep "$proc"|grep -v grep|wc -l); max=$(cat max.txt)
    while [[ $curr -lt $max ]]
    do
        ${proc} & # Sending process to background.
        max=$(cat max.txt) # After sending one job, again calculate max and curr
        curr=$(ps -ef|grep "$proc"|grep -v grep|wc -l)
    done
    sleep .5 # sleep .5 seconds if reached max jobs.
done
#---------------------------------------------------------------------#
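The same runtime-adjustable limit can also be applied to a finite list of jobs, counting only the shell's own background jobs instead of grepping ps output (which can also match unrelated processes that happen to have the same command line). The following is a rough sketch, not a drop-in replacement: the function name run_with_file_limit is made up for illustration, each job is assumed to be a command string run through bash -c, and the limit file is assumed to hold a single integer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run each command string, re-reading the job limit
# from a file before every launch so it can be changed while running.
# Usage: run_with_file_limit LIMIT_FILE 'command 1' 'command 2' ...
run_with_file_limit() {
    local limit_file=$1; shift
    local cmd max
    for cmd in "$@"; do
        while :; do
            max=$(<"$limit_file")              # pick up edits made at runtime
            [[ $max =~ ^[0-9]+$ ]] || max=1    # assumption: fall back to 1 on bad input
            (( $(jobs -pr | wc -l) < max )) && break
            sleep 0.5                          # limit reached, check again shortly
        done
        bash -c "$cmd" &
    done
    wait   # wait for the jobs that are still running
}
```

For example, run_with_file_limit max.txt 'sleep 20' 'sleep 20' 'sleep 20' starts the three commands while honouring whatever number max.txt contains at the moment each one is launched.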
Let us know if this was useful.