Run a looped process in bash across multiple cores

Posted 2019-03-30 19:29

I have a shell script that contains the following loop.

i=0  
upperlimit=$verylargevariable  
while [ "$i" -lt "$upperlimit" ]  
do  
   complexstuff RunManager file "$i"  
   i=$(expr $i + 1)  
done

This script runs on a quad core machine, and according to top, uses about 15% of each core while executing one iteration of the loop. I'd like to distribute it across the four cores so that each iteration of the loop does complexstuff four times, one on each core, so the resources will be used more efficiently. We're talking about computation that currently takes several hours so efficiency is more than just good practice here. (The output of each iteration is obviously independent of the previous one.)

PS: Host is a server running CentOS, if that helps.

2 Answers
ら.Afraid
Answered 2019-03-30 19:40

With GNU Parallel you can do:

seq "$verylargevariable" | parallel -j150% complexstuff RunManager file

The 150% tells parallel to run 1.5 processes per core, so if each run currently uses about 15% of a core, this should drive all 4 cores toward 100%.
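If GNU Parallel is not installed, `xargs -P` (available in GNU and BSD findutils) gives a similar fan-out. This is a minimal sketch, not the original command: `echo job` stands in for `complexstuff RunManager file`, and `upperlimit=4` is just a small test value in place of `$verylargevariable`:

```shell
# Fan out one job per input number, at most 4 running at a time.
# 'echo job' stands in for: complexstuff RunManager file
upperlimit=4                     # stand-in for $verylargevariable
results=$(seq "$upperlimit" | xargs -P 4 -n 1 echo job | sort)
echo "$results"
```

As with parallel, xargs appends each number from seq as the final argument, matching the `complexstuff RunManager file $i` call shape.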

To learn more watch the intro videos: http://www.youtube.com/watch?v=OpaiGYxkSuQ

Juvenile、少年°
Answered 2019-03-30 19:48

Apart from Ole Tange's solution (which looks great), if your computations have fairly similar durations, you can try something like this:

i=0  
upperlimit=$verylargevariable  
while [ "$i" -lt "$upperlimit" ]  
do  
   complexstuff RunManager file "$i" &
   i=$(expr $i + 1)
   complexstuff RunManager file "$i" &
   i=$(expr $i + 1)
   complexstuff RunManager file "$i" &
   i=$(expr $i + 1)
   complexstuff RunManager file "$i" &
   i=$(expr $i + 1)
   wait
done

This way, on each pass of the loop, you create 4 bash subprocesses that launch your computations, and the kernel's scheduler will dispatch them across the different cores. If 4 processes are not enough to saturate all your CPUs, raise the number launched on each pass.
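The unrolled loop above can be folded back into a single counter that waits after every Nth background job, which makes "raise the number of processes" a one-variable change. A minimal sketch, where `echo job` stands in for `complexstuff RunManager file` and the small limits are test values only:

```shell
# Launch jobs in the background, pausing after each full batch.
# 'echo job' stands in for: complexstuff RunManager file
run_batched() {
    upperlimit=$1
    batch=$2
    i=0
    while [ "$i" -lt "$upperlimit" ]; do
        echo "job $i" &
        i=$((i + 1))
        if [ $((i % batch)) -eq 0 ]; then
            wait                 # batch is full: wait for all of it
        fi
    done
    wait                         # catch any final partial batch
}
results=$(run_batched 6 4 | sort)
echo "$results"
```

Note the drawback shared with the unrolled version: each batch waits for its slowest job, so this only uses the cores well when the computations have similar durations, as the answer assumes.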
