I have a list of shell commands that I'd like to run. Up to four processes should run at the same time.
My basic idea would be to send commands to the shell until 4 commands are active.
The script would then constantly check the process count by looking for a common string, e.g. "nohup scrapy crawl urlMonitor".
As soon as the process count drops below 4, the next command is sent to the shell, until all commands have finished.
Is there a way to do this with a shell script?
I suppose it would involve some kind of endless loop with a break condition, as well as a method to check for the active processes. Unfortunately I am not that good at shell scripting, so perhaps someone can point me in the right direction? These are the commands I'd like to run:
nohup scrapy crawl urlMonitor -a slice=0 &
nohup scrapy crawl urlMonitor -a slice=1 &
nohup scrapy crawl urlMonitor -a slice=2 &
nohup scrapy crawl urlMonitor -a slice=3 &
nohup scrapy crawl urlMonitor -a slice=4 &
nohup scrapy crawl urlMonitor -a slice=5 &
nohup scrapy crawl urlMonitor -a slice=6 &
nohup scrapy crawl urlMonitor -a slice=7 &
nohup scrapy crawl urlMonitor -a slice=8 &
nohup scrapy crawl urlMonitor -a slice=9 &
nohup scrapy crawl urlMonitor -a slice=10 &
nohup scrapy crawl urlMonitor -a slice=11 &
nohup scrapy crawl urlMonitor -a slice=12 &
nohup scrapy crawl urlMonitor -a slice=13 &
nohup scrapy crawl urlMonitor -a slice=14 &
nohup scrapy crawl urlMonitor -a slice=15 &
nohup scrapy crawl urlMonitor -a slice=16 &
nohup scrapy crawl urlMonitor -a slice=17 &
nohup scrapy crawl urlMonitor -a slice=18 &
nohup scrapy crawl urlMonitor -a slice=19 &
nohup scrapy crawl urlMonitor -a slice=20 &
nohup scrapy crawl urlMonitor -a slice=21 &
nohup scrapy crawl urlMonitor -a slice=22 &
nohup scrapy crawl urlMonitor -a slice=23 &
nohup scrapy crawl urlMonitor -a slice=24 &
nohup scrapy crawl urlMonitor -a slice=25 &
nohup scrapy crawl urlMonitor -a slice=26 &
nohup scrapy crawl urlMonitor -a slice=27 &
nohup scrapy crawl urlMonitor -a slice=28 &
nohup scrapy crawl urlMonitor -a slice=29 &
nohup scrapy crawl urlMonitor -a slice=30 &
nohup scrapy crawl urlMonitor -a slice=31 &
nohup scrapy crawl urlMonitor -a slice=32 &
nohup scrapy crawl urlMonitor -a slice=33 &
nohup scrapy crawl urlMonitor -a slice=34 &
nohup scrapy crawl urlMonitor -a slice=35 &
nohup scrapy crawl urlMonitor -a slice=36 &
nohup scrapy crawl urlMonitor -a slice=37 &
nohup scrapy crawl urlMonitor -a slice=38 &
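To illustrate, this is roughly the polling loop I have in mind (an untested sketch; using pgrep to count the matching processes is just my guess at how to do the check):
#!/bin/bash
# Untested sketch of my idea: only start the next slice while fewer
# than 4 "scrapy crawl urlMonitor" processes are running.
for i in {0..38}; do
    while [ "$(pgrep -fc 'scrapy crawl urlMonitor')" -ge 4 ]; do
        sleep 5
    done
    nohup scrapy crawl urlMonitor -a slice=$i &
done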
Here's a general method that will always ensure that there are fewer than 4 jobs before launching any other jobs (though there may be more than 4 jobs running simultaneously if a single line launches several jobs at once):
#!/bin/bash

max_nb_jobs=4
commands_file=$1

while IFS= read -r line; do
    while :; do
        mapfile -t jobs < <(jobs -pr)
        ((${#jobs[@]}<max_nb_jobs)) && break
        wait -n
    done
    eval "$line"
done < "$commands_file"

wait
Use this script with your file as the first argument (note that wait -n requires Bash 4.3 or newer).
How does it work? For each line read, we first ensure that there are fewer than max_nb_jobs jobs running, by counting the running jobs (obtained from jobs -pr). If there are max_nb_jobs or more, we wait for the next job to terminate (wait -n) and count the running jobs again. Once fewer than max_nb_jobs are running, we eval the line.
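For example, you could generate the commands file from the slices in the question and run the script like this (limit_jobs.sh and commands.txt are just illustrative names):
# Write one command per line; keep the trailing "&" so each eval'd
# line starts a background job that the script can count with jobs -pr.
for i in {0..38}; do
    echo "nohup scrapy crawl urlMonitor -a slice=$i &"
done > commands.txt

# limit_jobs.sh is whatever you saved the script above as
./limit_jobs.sh commands.txt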
Update
Here's a similar script that doesn't use wait -n. It seems to do the job all right (tested on Debian with Bash 4.2):
#!/bin/bash

set -m

max_nb_jobs=4
file_list=$1

sleep_jobs() {
    # This function sleeps until there are less than $1 jobs running
    # Make sure that you have set -m before using this function!
    local n=$1 jobs
    while mapfile -t jobs < <(jobs -pr) && ((${#jobs[@]}>=n)); do
        coproc read
        trap "echo >&${COPROC[1]}; trap '' SIGCHLD" SIGCHLD
        wait $COPROC_PID
    done
}

while IFS= read -r line; do
    sleep_jobs $max_nb_jobs
    eval "$line"
done < "$file_list"

wait
If you want 4 at a time continuously running, try something like:
max_procs=4
active_procs=0

for proc_num in {0..38}; do
    nohup your_cmd_here &

    # If we have more than max procs running, wait for one to finish
    if ((active_procs++ >= max_procs)); then
        wait -n
        ((active_procs--))
    fi
done

# Wait for all remaining procs to finish
wait
This is a variation on sputnick's answer that keeps up to max_procs running at the same time. As soon as one finishes, it kicks off the next one. The wait -n command waits for the next process to finish instead of waiting for all of them to finish.
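Substituting the crawl command from the question for the your_cmd_here placeholder, that sketch would become something like:
max_procs=4
active_procs=0

for proc_num in {0..38}; do
    nohup scrapy crawl urlMonitor -a slice=$proc_num &

    # After max_procs crawls have been started, wait for one to exit
    # before launching the next (wait -n needs Bash 4.3 or newer)
    if ((active_procs++ >= max_procs)); then
        wait -n
        ((active_procs--))
    fi
done

wait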
You could do this easily with GNU parallel or even just xargs. To wit:
declare -i i=0
while sleep 1; do
    printf 'slice=%d\n' $((i++))
done | xargs -n1 -P4 nohup scrapy crawl urlMonitor -a
The while loop will run forever; if there's an actual hard limit you know of, you can just do a for loop like:
for i in {0..100}…
Also, the sleep 1 is helpful because it lets the shell handle signals more effectively.
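For the 39 slices in the question, the bounded version of that pipeline could look like this (same idea, with -P4 for four parallel crawls):
for i in {0..38}; do
    printf 'slice=%d\n' "$i"
done | xargs -n1 -P4 nohup scrapy crawl urlMonitor -a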
Try doing this:
for i in {0..38}; do
    nohup scrapy crawl urlMonitor -a slice=$i & _pid=$!
    ((++i%4==0)) && wait $_pid
done
help wait:
wait: wait [-n] [id ...]
    Wait for job completion and return exit status.

    Waits for each process identified by an ID, which may be a process ID or a
    job specification, and reports its termination status. If ID is not
    given, waits for all currently active child processes, and the return
    status is zero. If ID is a job specification, waits for all processes
    in that job's pipeline.

    If the -n option is supplied, waits for the next job to terminate and
    returns its exit status.

    Exit Status:
    Returns the status of the last ID; fails if ID is invalid or an invalid
    option is given.
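If your Bash has wait -n (4.3 or newer), the batch-style wait above can be replaced with a sliding window that keeps four crawls running continuously; a rough sketch along the same lines:
running=0
for i in {0..38}; do
    nohup scrapy crawl urlMonitor -a slice=$i &
    # Once four crawls are in flight, wait for any one to finish
    # before starting the next
    ((++running >= 4)) && { wait -n; ((running--)); }
done
wait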