for loop bash scripts parallel [duplicate]

Posted 2019-08-23 09:19

I'm writing a loop script, and I'd like each iteration to run in parallel if possible.

#!/bin/bash

for ip in $(cat ./IPs); do
ping -n -c 2 -W 1 ${ip} >> Results/${ip}.log
done

Ultimately, I'd like to place whatever I need in the loop, and have it multi-process. I've attempted to work through the other examples, but just can't seem to get it to work as expected. I have parallel installed as well if that's an option.

2 Answers
Melony?
#2 · 2019-08-23 10:07

This seems well-suited to parallel:

parallel 'ping -n -c 2 -W 1 "{}" >>"Results/{}.log"' <IPs

The command run by parallel can use the received line for the redirection as well as in arguments.

I've quoted the use of {} within the command - this shouldn't be necessary with an input file under your own control containing just hostnames and/or addresses, but is a good habit to get into, for when the input might contain characters significant to the shell.

Note that by default, parallel will run only one job per core (which is a good choice for compute-heavy loads). For tasks such as this, where most time is spent waiting on network latency, I recommend you use a -j argument (e.g. -j 1000%, to run ten jobs per core).
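As a rough sketch, the same command with an explicit job count could be wrapped in a small function. The name ping_all and its parameters are assumptions made for this example; only -j and {} come from parallel itself. The function also creates Results first, since the >> redirection fails if the directory is missing.

```shell
#!/usr/bin/env bash
# ping_all HOSTS_FILE [JOBS]  (hypothetical helper, not part of GNU parallel)
# Pings every host listed in HOSTS_FILE, logging each to Results/<host>.log.
ping_all() {
  hosts_file=$1
  jobs=${2:-1000%}        # 1000% means ten jobs per CPU core
  mkdir -p Results        # the >> redirection needs the directory to exist
  parallel -j "$jobs" 'ping -n -c 2 -W 1 "{}" >>"Results/{}.log" 2>&1' <"$hosts_file"
}
```

Calling ping_all IPs uses the ten-per-core default; ping_all IPs 50 would cap it at a fixed 50 jobs instead.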

Luminary・发光体
#3 · 2019-08-23 10:15

For a simple version of this -

while IFS= read -r ip
do  ping -n -c 2 -W 1 "$ip" >> "Results/$ip.log" 2>&1 &
done < IPs

The & on the end puts it in background and lets the loop run the next iteration while that one is processing. As an added point, I also redirected the stderr of each to the same log (2>&1) so error messages wouldn't get lost if something failed.

$: ls x a # x exists, a doesn't
ls: cannot access 'a': No such file or directory
x
$: ls x a > log # send stdout to log, but error still goes to console
ls: cannot access 'a': No such file or directory
$: cat log # log only has success message
x
$: ls x a > log 2>&1 # send stderr where stdout is going - to same log
$: cat log # now both messages in the log
ls: cannot access 'a': No such file or directory
x
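The order of the two redirections matters, which the short sketch below tries to illustrate. The emit function and the file names are made up for the demonstration. 2>&1 copies whatever stdout points to at that moment, so it has to come after the > log.

```shell
#!/usr/bin/env bash
# emit is a hypothetical stand-in for a command that writes to both streams.
emit() { echo "to stdout"; echo "to stderr" >&2; }

# stdout goes to the file first, then stderr is pointed at the same place.
emit >both.log 2>&1

# Reversed order: stderr is pointed at the current stdout (here, the command
# substitution), and only afterwards is stdout sent to the file.
leaked=$(emit 2>&1 >stdout_only.log)
echo "escaped the log: $leaked"
```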

I also switched to a while read to avoid needing the cat in the for, but that's mostly stylistic preference.
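The same loop can be sketched as a reusable function that ends with wait, so the caller only continues once every background job has finished. run_all and its arguments are names invented for this sketch, not an existing tool.

```shell
#!/usr/bin/env bash
# run_all LIST_FILE CMD [ARGS...]  (hypothetical helper)
# Runs CMD ARGS <line> in background for every non-empty line of LIST_FILE,
# logging stdout and stderr of each job to Results/<line>.log.
run_all() {
  list=$1
  shift
  mkdir -p Results
  while IFS= read -r item; do
    [ -n "$item" ] || continue                    # skip blank lines
    "$@" "$item" >>"Results/$item.log" 2>&1 &     # fire and move on
  done <"$list"
  wait                                            # block until all jobs exit
}
```

For the question's case the call would look like run_all IPs ping -n -c 2 -W 1.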

For a more load-aware version, use wait.

I made a simplistic control file that just has a letter per line -

$: cat x
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z

Then declared a couple of values - a max I want it to fire at once, and a counter.

$: declare -i cnt=0 max=10

Then I typed in a read loop to iterate over the values, and run a set at a time. Until it accumulates the stated max, it keeps adding processes in background and counting them. Once it gets enough, it waits for those to finish and resets the counter before continuing with another set.

$: while read ctl             # these would be your IP's
> do if (( cnt++ < max ))     # this checks for max load
>    then echo starting $ctl  # report which we're doing
>         date                # throw a timestamp
>         sleep 10 &          # and fire the task in background
>    else echo letting that batch work... # when too many running
>         cnt=0               # reset the counter
>         wait                # and thumb-twiddle till they all finish
>         echo continuing     # log
>         date                # and timestamp
>    fi
> done < x                    # the whole loop reads from x until done

Here's the output.

starting a
Thu, Oct 25, 2018  8:13:34 AM
[1] 10436
starting b
Thu, Oct 25, 2018  8:13:34 AM
[2] 7544
starting c
Thu, Oct 25, 2018  8:13:34 AM
[3] 10296
starting d
Thu, Oct 25, 2018  8:13:34 AM
[4] 6244
starting e
Thu, Oct 25, 2018  8:13:34 AM
[5] 8560
starting f
Thu, Oct 25, 2018  8:13:35 AM
[6] 8824
starting g
Thu, Oct 25, 2018  8:13:35 AM
[7] 11640
starting h
Thu, Oct 25, 2018  8:13:35 AM
[8] 9856
starting i
Thu, Oct 25, 2018  8:13:35 AM
[9] 7612
starting j
Thu, Oct 25, 2018  8:13:35 AM
[10] 9100
letting that batch work...
[1]   Done                    sleep 10
[2]   Done                    sleep 10
[3]   Done                    sleep 10
[4]   Done                    sleep 10
[5]   Done                    sleep 10
[6]   Done                    sleep 10
[7]   Done                    sleep 10
[8]   Done                    sleep 10
[9]-  Done                    sleep 10
[10]+  Done                    sleep 10
continuing
Thu, Oct 25, 2018  8:13:45 AM
starting l
Thu, Oct 25, 2018  8:13:45 AM
[1] 8600
starting m
Thu, Oct 25, 2018  8:13:45 AM
[2] 516
starting n
Thu, Oct 25, 2018  8:13:45 AM
[3] 3296
starting o
Thu, Oct 25, 2018  8:13:45 AM
[4] 8608
starting p
Thu, Oct 25, 2018  8:13:46 AM
[5] 4040
starting q
Thu, Oct 25, 2018  8:13:46 AM
[6] 7476
starting r
Thu, Oct 25, 2018  8:13:46 AM
[7] 4468
starting s
Thu, Oct 25, 2018  8:13:46 AM
[8] 4144
starting t
Thu, Oct 25, 2018  8:13:46 AM
[9] 8956
starting u
Thu, Oct 25, 2018  8:13:46 AM
[10] 6864
letting that batch work...
[1]   Done                    sleep 10
[2]   Done                    sleep 10
[3]   Done                    sleep 10
[4]   Done                    sleep 10
[5]   Done                    sleep 10
[6]   Done                    sleep 10
[7]   Done                    sleep 10
[8]   Done                    sleep 10
[9]-  Done                    sleep 10
[10]+  Done                    sleep 10
continuing
Thu, Oct 25, 2018  8:13:56 AM
starting w
Thu, Oct 25, 2018  8:13:56 AM
[1] 5520
starting x
Thu, Oct 25, 2018  8:13:56 AM
[2] 6436
starting y
Thu, Oct 25, 2018  8:13:57 AM
[3] 12216
starting z
Thu, Oct 25, 2018  8:13:57 AM
[4] 8468

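Note that k and v never show up in the output: the iteration that finds the counter already at max takes the else branch, so the line it just read is consumed without starting a task. And when finished, the last few are still running, because the loop never waits for the final partial batch and I didn't go to the trouble of writing all this to an actual script with meticulous checking.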

$: ps
      PID    PPID    PGID     WINPID   TTY         UID    STIME COMMAND
    11012   10944   11012      11040  pty0     2136995 07:59:35 /usr/bin/bash
     6436   11012    6436       9188  pty0     2136995 08:13:56 /usr/bin/sleep
     5520   11012    5520      10064  pty0     2136995 08:13:56 /usr/bin/sleep
    12216   11012   12216      12064  pty0     2136995 08:13:57 /usr/bin/sleep
     8468   11012    8468      10100  pty0     2136995 08:13:57 /usr/bin/sleep
     9096   11012    9096      10356  pty0     2136995 08:14:03 /usr/bin/ps
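A sketch of the same batching idea as a function that starts a task for every line it reads and waits for the last, partial batch too. The name run_batched and its argument order are assumptions for this example.

```shell
#!/usr/bin/env bash
# run_batched MAX LIST_FILE CMD [ARGS...]  (hypothetical helper)
# Runs CMD ARGS <line> for each line of LIST_FILE, at most MAX at a time,
# waiting for each full batch to finish before starting the next one.
run_batched() {
  max=$1 list=$2
  shift 2
  cnt=0
  while IFS= read -r item; do
    "$@" "$item" &                  # start the task for this line first
    cnt=$((cnt + 1))
    if [ "$cnt" -ge "$max" ]; then  # batch is full
      wait                          # let the whole batch finish
      cnt=0
    fi
  done <"$list"
  wait                              # pick up the final partial batch
}
```

With the control file above, run_batched 10 x sleep 10 would start a through j, wait, then continue with k.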

This does cause burst loads that (for tasks that don't all finish at about the same time) will dwindle till the last is done, causing spikes and lulls. With a little more finesse we could write a waitpid trap that would fire a new job each time one finished to keep the load steady, but that's an exercise for another day unless someone just really wants to see it. (I did it in Perl before, and have kind of always wanted to implement it in bash just because...)
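For reference, newer bash versions offer a shortcut for the steady-load idea: wait -n (bash 4.3 or later) returns as soon as any one background job exits. The sketch below is not the trap-based approach described above. It is a simpler substitute, and run_steady with its arguments is a hypothetical name.

```shell
#!/usr/bin/env bash
# run_steady MAX LIST_FILE CMD [ARGS...]  (hypothetical helper, bash >= 4.3)
# Keeps up to MAX jobs running; a new one starts whenever a slot frees up.
run_steady() {
  local max=$1 list=$2 item running=0
  shift 2
  while IFS= read -r item; do
    if [ "$running" -ge "$max" ]; then
      wait -n || true               # one job ended; its exit status is ignored
      running=$((running - 1))
    fi
    "$@" "$item" &
    running=$((running + 1))
  done <"$list"
  wait                              # drain whatever is still running
}
```

The counter is only an upper bound on the number of live jobs, which is enough to keep the load at or below MAX.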

Because it was requested -

Obviously, as presented in other posts, you could just use parallel... but as an exercise, here's one way you could set a number of process chains that would read from a queue. I opted for simple callback rather than dealing with a SIGCHLD trap because there are a lot of little subprocs flying around...

Refinements welcome if anyone cares.

#!/usr/bin/env bash

trap 'echo abort $0@$LINENO; die; exit 1' ERR       # make sure any error is fatal
declare -i primer=0          # a countdown of how many processes to pre-spawn
use="
  $0 <#procs> <cmdfile>

  Pass the number of desired processes to prespawn as the 1st argument.
  Pass the command file with the list of tasks you need done.

  Command file format:
   KEYSTRING:cmdlist

  where KEYSTRING will be used as a unique logfile name
  and   cmdlist   is the base command string to be run

"

die() {
   echo "$use" >&2
   return 1
}

case $# in
2) primer=$1
   case "$primer" in
   *[^0-9]*) echo "INVALID #procs '$primer'"
             die;;
   esac
   cmdfile=$2
   [[ -r "$cmdfile" ]] || die
   declare -i lines=$( grep -c . "$cmdfile" )
   if (( lines < primer ))
   then echo "Note - command lines in $cmdfile ($lines) fewer than requested process chains ($primer)"
        die
   fi ;;
*) die ;;
esac >&2

trap ': no-op to ignore' HUP  # ignore hangups (built-in nohup without explicit i/o redirection)

spawn() {
  IFS="$IFS:" read key cmd || return
  echo "$(date) executing '$cmd'; c.f. $key.log" | tee $key.log
  echo "# autogenerated by $0 $(date)
   { $cmd
     spawn
   } >> $key.log 2>&1 &
  " >| $key.sh
  . $key.sh
  rm -f $key.sh
  return 0
}

while (( primer-- ))  # until we've filled the requested quota
do spawn              # create a child process
done < "$cmdfile"

Yes, there are security concerns with reading possibly dirty data and sourcing it. I wanted to keep the framework simple as an exercise. Suggestions are still welcome.
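One small way to narrow that risk, sketched under my own assumptions, is to check KEYSTRING before it is used as a file name. valid_key and split_entry are hypothetical helpers, not part of the script above, and they split on the first colon only, which differs slightly from the read in spawn. The cmdlist is still executed as shell code, so this does nothing about a hostile command file.

```shell
#!/usr/bin/env bash
# valid_key KEY: succeed only if KEY is non-empty, does not start with
# '.' or '-', and contains nothing but letters, digits, '_', '.' and '-'.
valid_key() {
  case $1 in
    ''|.*|-*)          return 1 ;;
    *[!A-Za-z0-9_.-]*) return 1 ;;
  esac
  return 0
}

# split_entry LINE: split "KEYSTRING:cmdlist" at the first colon into the
# globals entry_key and entry_cmd; fail if there is no colon or a bad key.
split_entry() {
  entry_key=${1%%:*}
  entry_cmd=${1#*:}
  [ "$entry_key" != "$1" ] && valid_key "$entry_key"
}
```

A line such as e:date;sleep 5;date passes, while a key like ../x is rejected before any $key.log or $key.sh is written.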

I threw together a quick command file with some complex commands built of simple crap just as examples.

a:for x in $( seq 1 10 );do echo "on $x";date;sleep 1;done &
b:true && echo ok || echo no
c:false && echo ok || echo no
d:date > /tmp/x; cat /tmp/x
e:date;sleep 5;date
f:date;sleep 13;date
g:date;sleep 1;date
h:date;sleep 5;date
i:date;sleep 17;date
j:date;sleep 1;date
k:date;sleep 9;date
l:date;sleep 19;date
m:date;sleep 7;date
n:date;sleep 19;date
o:date;sleep 11;date
p:date;sleep 17;date
q:date;sleep 6;date
r:date;sleep 7;date
s:date;sleep 18;date
t:date;sleep 6;date
u:date;sleep 9;date
v:date;sleep 9;date
w:date;sleep 2;date
x:date;sleep 0;date
y:date;sleep 3;date
z:date;sleep 10;date

Note the first one even runs itself in background - the spooler doesn't care. Job a will start b before the spooler's main loop can, so the main loop will skip to c.

Some of the logs -

a - original spawn; ran itself in background and immediately started b, then kept logging

Thu, Oct 25, 2018  2:33:57 PM executing 'for x in $( seq 1 10 );do echo "on $x";date;sleep 1;done &'; c.f. a.log
on 1
Thu, Oct 25, 2018  2:33:58 PM executing 'true && echo ok || echo no'; c.f. b.log
Thu, Oct 25, 2018  2:33:58 PM
on 2
Thu, Oct 25, 2018  2:33:59 PM
on 3
Thu, Oct 25, 2018  2:34:00 PM
on 4
Thu, Oct 25, 2018  2:34:01 PM
on 5
Thu, Oct 25, 2018  2:34:02 PM
on 6
Thu, Oct 25, 2018  2:34:04 PM
on 7
Thu, Oct 25, 2018  2:34:05 PM
on 8
Thu, Oct 25, 2018  2:34:06 PM
on 9
Thu, Oct 25, 2018  2:34:07 PM
on 10
Thu, Oct 25, 2018  2:34:08 PM

b - exited quickly and started f because c, d, & e had already been run

Thu, Oct 25, 2018  2:33:58 PM executing 'true && echo ok || echo no'; c.f. b.log
ok
Thu, Oct 25, 2018  2:33:58 PM executing 'date;sleep 13;date'; c.f. f.log

c - original spawn; finished before b, so it started d, which is why b started f

Thu, Oct 25, 2018  2:33:58 PM executing 'false && echo ok || echo no'; c.f. c.log
no
Thu, Oct 25, 2018  2:33:58 PM executing 'date > /tmp/x; cat /tmp/x'; c.f. d.log

d - started by c, finished and started h because g had already been run

Thu, Oct 25, 2018  2:33:58 PM executing 'date > /tmp/x; cat /tmp/x'; c.f. d.log
Thu, Oct 25, 2018  2:33:58 PM
Thu, Oct 25, 2018  2:33:59 PM executing 'date;sleep 5;date'; c.f. h.log

e - original spawn, started n because everything up to that had been run

Thu, Oct 25, 2018  2:33:58 PM executing 'date;sleep 5;date'; c.f. e.log
Thu, Oct 25, 2018  2:33:58 PM
Thu, Oct 25, 2018  2:34:04 PM
Thu, Oct 25, 2018  2:34:04 PM executing 'date;sleep 19;date'; c.f. n.log

(skipping ahead a bit...)

n - started by e, took long enough to finish that there were no more tasks to start

Thu, Oct 25, 2018  2:34:04 PM executing 'date;sleep 19;date'; c.f. n.log
Thu, Oct 25, 2018  2:34:04 PM
Thu, Oct 25, 2018  2:34:23 PM

It works. It isn't perfect, but it could be handy. :)
