Run Serial inside Parallel Bash

Posted 2019-09-19 11:15

I have added a bit to my explanation. Conceptually, I am running a script that loops over an input file, calling shell scripts that use the line content as an input parameter. (FYI: "a" kicks off an execution and "b" monitors that execution.)

  1. I need 1a and 1b to run first, in parallel, for the first two $param values.
  2. Next, 2a and 2b need to run in serial for those $param values once step 1 is complete.
  3. 3a and 3b kick off once 2a and 2b are complete (it doesn't matter whether serial or parallel).
  4. The loop continues with the next two lines from the input .txt.

I can't get it to process the second step in serial; everything runs in parallel. What I need is something like the following:

cat filename | while read line
do
export param="$line"
./script1a.sh "$param" > process.log && ./script2b.sh > monitor.log &&
## wait for processes to finish, running 2 in parallel in script1.sh
./script2a.sh "$param" > process2.log && ./script2b.sh > monitor2.log &&
## run each of the 2 in serial for script2.sh
./script3a.sh && ./script3b.sh

I tried adding in a wait, and tried wrapping script2a.sh and script2b.sh in an if statement so they would run in serial, but to no avail.

if (( ++i % 2 == 0 )); then wait; fi
done
# only run two lines at a time, then cycle back through the loop

How on earth can I get script2 to run in serial after running script1 in parallel??

5 Answers
地球回转人心会变
#2 · 2019-09-19 11:35

@tripleee I put together the following, if you're interested. (Note: I changed some variables for the post, so apologies if there are inconsistencies anywhere... also, the exports have their reasons. I think there is a better way than exporting, but for now it works.)

cat input.txt | while read first; do
    export step=${first//\"/}
    export stepem=EM_${step//,/_}
    export steptd=TD_${step//,/_}
    export stepeg=EG_${step//,/_}
    echo "$step" | "$directory/ws_client.sh" processOptions "$appName" "$step" "$layers" "$stages" "" "$stages" "$stages" FALSE > "$Folder/${stepem}_ProcessID.log" &&
        "$dir_model/check_status.sh" "$Folder" "$stepem" > "$Folder/${stepem}_Monitor.log" &
    read second
    export step2=${second//\"/}
    export stepem2=ExecuteModel_${step2//,/_}
    export steptd2=TransferData_${step2//,/_}
    export stepeg2=ExecuteGeneology_${step2//,/_}
    echo "$step2" | "$directory/ws_client.sh" processOptions "$appName" "$step2" "$layers" "$stages" "" "$stages" "$stages" FALSE > "$Folder/${stepem2}_ProcessID.log" &&
        "$dir_model/check_status.sh" "$Folder" "$stepem2" > "$Folder/${stepem2}_Monitor.log" &
    wait
    "$directory/ws_client.sh" processOptions "$appName" "$step" "$layers" "" "" "$stage_final" "" TRUE > "$appLogFolder/${steptd}_ProcessID.log" &&
        "$dir_model/check_status.sh" "$Folder" "$steptd" > "$Folder/${steptd}_Monitor.log" &&
        "$directory/ws_client.sh" processOptions "$appName" "$step2" "$layers" "" "" "$stage_final" "" TRUE > "$appLogFolder/${steptd2}_ProcessID.log" &&
        "$dir_model/check_status.sh" "$Folder" "$steptd2" > "$Folder/${steptd2}_Monitor.log" &
    wait
    "$directory/ws_client.sh" processPaths "$appName" "$step" "$layers" "$genPath_01" > "$appLogFolder/${stepeg}_ProcessID.log" &&
        "$dir_model/check_status.sh" "$Folder" "$stepeg" > "$Folder/${stepeg}_Monitor.log" &&
        "$directory/ws_client.sh" processPaths "$appName" "$step2" "$layers" "$genPath_01" > "$appLogFolder/${stepeg2}_ProcessID.log" &&
        "$dir_model/check_status.sh" "$Folder" "$stepeg2" > "$Folder/${stepeg2}_Monitor.log" &
    wait
    if (( ++i % 2 == 0 )); then
        echo "Waiting..."
        wait
    fi
done
\"骚年 ilove
3楼-- · 2019-09-19 11:46

Locking!

If you want to parallelize script1 and script3, but need all invocations of script2 to be serialized, continue to use:

./script1.sh && ./script2.sh && ./script3.sh &

...but modify script2 to grab a lock before it does anything else:

#!/bin/bash
exec 3>.lock2   # open (creating if necessary) the lock file on file descriptor 3
flock -x 3      # block until this process holds an exclusive lock on FD 3
# ... continue with script2's business here.

Note that you must not delete the .lock2 file used here; otherwise you risk allowing multiple processes to think they hold the lock at the same time.
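For context, a minimal sketch of what the calling side might look like under this scheme (the loop shape, the filename input file, and passing the line as a parameter are assumptions on my part; script2.sh is the version modified above to take the lock):

# minimal sketch, assuming one parameter per line in filename
while read -r param; do
    # the whole chain runs in the background; script2.sh serializes itself via flock
    ./script1.sh "$param" && ./script2.sh "$param" && ./script3.sh "$param" &
done < filename
wait    # let every backgrounded chain finish before exiting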

Root(大扎)
#4 · 2019-09-19 11:50

I understand your question like this:

You have a list of models. These models need to be run. After they are run, they have to be transferred. The simple solution is:

run_model model1
transfer_result model1
run_model model2
transfer_result model2

But to make this go faster, we want to parallelize parts. Unfortunately transfer_result cannot be parallelized.

run_model model1
run_model model2
transfer_result model1
transfer_result model2

model1 and model2 are read from a text file. run_model can be run in parallel, and you would like 2 of those running in parallel. transfer_result can only be run one at a time, and you can only transfer a result when it has been computed.

This can be done like this:

cat models.txt | parallel -j2 'run_model {} && sem --id transfer transfer_model {}'

run_model {} && sem --id transfer transfer_model {} will run one model and, if that succeeds, transfer it. A transfer will only start when no other transfer is running.

parallel -j2 will run two of these jobs in parallel.

If a transfer takes less time than computing a model, you should see no surprises: a transfer will at most be swapped with the next one. If a transfer takes longer than running a model, you might see the models transferred completely out of order (e.g. the transfer of job 10 before the transfer of job 2). But they will all be transferred eventually.

You can see the execution sequence exemplified with this:

seq 10 | parallel -uj2 'echo ran model {} && sem --id transfer "sleep .{};echo transferred {}"'

This solution is better than the wait-based solution because you can run model3 while model1 and model2 are being transferred.
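Mapped onto the scripts from the question, it could look roughly like this (a sketch only: the script and log names come from the question, I assume each "b" script monitors the matching "a" script, and {#} is GNU Parallel's job-number placeholder, used here just to keep the log files apart):

# run two 1a/1b pairs at a time; serialize the 2a/2b pairs via sem
cat filename | parallel -j2 '
    ./script1a.sh {} > process_{#}.log && ./script1b.sh > monitor_{#}.log &&
    sem --fg --id step2 "./script2a.sh {} > process2_{#}.log && ./script2b.sh > monitor2_{#}.log"
'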

够拽才男人
#5 · 2019-09-19 11:54

You are not showing us how the lines you read from the file are being consumed.
If I understand your question correctly, you want to run script1 on two lines of filename, each in parallel, and then serially run script2 when both are done?

while read first; do
    echo "$first" | ./script1.sh &
    read second
    echo "$second" | ./script1.sh &
    wait
    script2.sh &    # optionally don't background here?
    script3.sh
done <filename &

The while loop contains two read statements, so each iteration reads two lines from filename and feeds each to a separate instance of script1. Then we wait until both are done before we run script2. I background script2 so that script3 can start while it runs, and I background the whole while loop; but you probably don't need to background the entire job by default. Development will be much easier if you write it as a regular foreground job; once it works, background the whole thing when you start it, if you need to.

I can think of a number of variations on this depending on how you actually want your data to flow; here is an update in response to your recently updated question.

export param  # is this really necessary?
while read param; do
    # First instance
    ./script1a.sh "$param" > process.log &&
    ./script2b.sh > monitor.log &

    # Second instance
    read param
    ./script2a.sh "$param" > process2.log && ./script2b.sh > monitor2.log &

    # Wait for both to finish
    wait

    ./script3a.sh && ./script3b.sh
done <filename

If this still doesn't help, maybe you should post a third question where you really explain what you want...

虎瘦雄心在
#6 · 2019-09-19 11:55

I am not 100% sure what you mean by your question, but now I think you mean something like this in your inner loop:

(
   # run script1 and script2 in parallel
   script1 &
   s1pid=$!

   # start no more than one script2 using GNU Parallel as a mutex
   sem --fg script2

   # when they are both done...
   wait $s1pid

   # run script3
   script3

) &    # and do that lot in parallel with previous/next loop iteration
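For completeness, one way the enclosing loop might look (the outer loop, the filename input, and passing the line to the scripts are my assumptions; the block above only shows the inner part):

# hypothetical outer loop: one line per iteration, each iteration backgrounded
while read -r line; do
    (
        script1 "$line" &
        s1pid=$!

        # blocks until no other script2 is running
        sem --fg script2 "$line"

        wait "$s1pid"

        script3 "$line"
    ) &
done < filename
wait    # wait for all iterations before the script exits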