Question:
I have expanded my explanation a bit. Conceptually, I am running a script that loops over the lines of an input file and calls shell scripts that take the line content as an input parameter. (FYI: each "a" script kicks off an execution and the matching "b" script monitors that execution.)
- I need 1a and 1b to run first, in parallel, for the first two $param values
- Next, 2a and 2b need to run in serial for those $param values once step 1 is complete
- 3a and 3b kick off once 2a and 2b are complete (it does not matter whether serial or parallel)
- The loop continues with the next 2 lines from the input .txt
I can't get the second step to run in serial; everything runs in parallel. What I need is the following:
cat filename | while readline
export param=$line
do
./script1a.sh "param" > process.lg && ./script2b.sh > monitor.log &&
##wait for processes to finish, running 2 in parallel in script1.sh
./script2a.sh "param" > process2.log && ./script2b.sh > minitor2.log &&
##run each of the 2 in serial for script2.sh
./script3a.sh && ./script3b.sh
I tried adding in wait, and tried an if statement containing script2a.sh and script2b.sh that would run in serial, but to no avail.
if ((++i % 2 ==0)) then wait fi
done
#only run two lines at a time, then cycle back through loop
How on earth can I get script2.sh to run in serial after script1 has run in parallel?
Answer 1:
You are not showing us how the lines you read from the file are being consumed.
If I understand your question correctly, you want to run script1 on two lines of filename, each in parallel, and then serially run script2 when both are done?
while read first; do
echo "$first" | ./script1.sh &
read second
echo "$second" | ./script1.sh &
wait
script2.sh & # optionally don't background here?
script3.sh
done <filename &
The while loop contains two read statements, so each iteration reads two lines from filename and feeds each to a separate instance of script1. Then we wait until they both are done before we run script2. I background it so that script3 can start while it runs, and background the whole while loop; but you probably don't actually need to background the entire job by default (development will be much easier if you write it as a regular foreground job, then when it works, background the whole thing when you start it if you need to).
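The two-reads-per-iteration pattern described above can be tried out in isolation. The following is only a minimal sketch: script1 and script2 are hypothetical stand-in functions that log what they were called with, and process_pairs is a name made up for this illustration. It assumes bash and a sleep that accepts fractional seconds.

```bash
#!/usr/bin/env bash
# Hypothetical stand-ins for the real scripts; they only record what ran.
log=$(mktemp)
script1() { sleep 0.2; echo "script1 $1" >> "$log"; }
script2() { echo "script2 $1 $2" >> "$log"; }

process_pairs() {
    local first second
    while read -r first; do
        script1 "$first" &
        # The second read consumes the next line of the same input stream.
        if read -r second; then
            script1 "$second" &
        fi
        wait                          # block until both script1 jobs have exited
        script2 "$first" "$second"    # runs only after the whole pair is done
    done
}

printf '%s\n' a b c d | process_pairs
cat "$log"
```

Within each pair the two script1 entries may appear in either order, but the script2 entry for a pair is always written after both of them, and the second pair does not start until script2 has returned.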
I can think of a number of variations on this depending on how you actually want your data to flow; here is an update in response to your recently updated question.
export param # is this really necessary?
while read param; do
# First instance
./script1a.sh "$param" > process.lg &&
./script2b.sh > monitor.log &
# Second instance
read param
./script2a.sh "$param" > process2.log && ./script2b.sh > minitor2.log &
# Wait for both to finish
wait
./script3a.sh && ./script3b.sh
done <filename
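One detail the snippet above relies on: in bash, a trailing & applies to the whole && list, not just to its last command. The sketch below illustrates this with hypothetical stand-in functions (kickoff and monitor are made-up names, not the asker's scripts) and assumes bash and a sleep that accepts fractional seconds.

```bash
#!/usr/bin/env bash
out=$(mktemp)
# Hypothetical stand-ins: kickoff is slow, monitor is instant.
kickoff() { sleep 0.2; echo "kickoff $1" >> "$out"; }
monitor() { echo "monitor $1" >> "$out"; }

# The entire "kickoff && monitor" list becomes one background job,
# so monitor still waits for kickoff, but the loop body does not.
kickoff A && monitor A &
echo "loop continues" >> "$out"

wait    # without this, nothing would hold the script back
cat "$out"
```

The foreground line is logged first, which is why an explicit wait is needed wherever a later step must not start until the backgrounded list has finished.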
If this still doesn't help, maybe you should post a third question where you really actually explain what you want...
Answer 2:
Locking!
If you want to parallelize script1 and script3, but need all invocations of script2 to be serialized, continue to use:
./script1.sh && ./script2.sh && ./script3.sh &
...but modify script2 to grab a lock before it does anything else:
#!/bin/bash
exec 3>.lock2
flock -x 3
# ... continue with script2's business here.
Note that you must not delete the .lock2 file used here, at risk of allowing multiple processes to think they hold the lock concurrently.
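To see the serialization in action, here is a minimal sketch under a few assumptions: flock(1) from util-linux is installed, script2 is replaced by a hypothetical stand-in function, and the lock file lives in a temporary directory rather than the working directory. Three copies are launched at once, yet their critical sections run one after another.

```bash
#!/usr/bin/env bash
# Assumes flock(1) from util-linux is available.
workdir=$(mktemp -d)
lockfile="$workdir/.lock2"
log="$workdir/order.log"

# Hypothetical stand-in for script2: everything inside the subshell
# runs while holding an exclusive lock on fd 3.
script2() {
    (
        flock -x 3                 # blocks until no other process holds the lock
        echo "start $1" >> "$log"
        sleep 0.2
        echo "end $1" >> "$log"
    ) 3>"$lockfile"                # the lock is released when fd 3 is closed
}

for n in 1 2 3; do
    script2 "$n" &                 # launched in parallel...
done
wait
cat "$log"                         # ...but start/end pairs never interleave
```

The order in which the three instances acquire the lock is not guaranteed; only the mutual exclusion is.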
Answer 3:
I am not 100% sure what you mean by your question, but I now think you mean something like this in your inner loop:
(
# run script1 and script2 in parallel
script1 &
s1pid=$!
# start no more than one script2 using GNU Parallel as a mutex
sem --fg script2
# when they are both done...
wait $s1pid
# run script3
script3
) & # and do that lot in parallel with previous/next loop iteration
Answer 4:
@tripleee I put together the following, if you are interested. (Note: I changed some variables for the post, so sorry if there are inconsistencies anywhere. Also, the exports have their reasons; I think there is a better way than exporting, but for now it works.)
cat input.txt | while read first; do
export step=${first//\"/}
export stepem=EM_${step//,/_}
export steptd=TD_${step//,/_}
export stepeg=EG_${step//,/_}
echo "$step" | $directory"/ws_client.sh" processOptions "$appName" "$step" "$layers" "$stages" "" "$stages" "$stages" FALSE > "$Folder""/""$stepem""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$stepem" > "$Folder""/""$stepem""_Monitor.log" &
read second
export step2=${second//\"/}
export stepem2=ExecuteModel_${step2//,/_}
export steptd2=TransferData_${step2//,/_}
export stepeg2=ExecuteGeneology_${step2//,/_}
echo "$step2" | $directory"/ws_client.sh" processOptions "$appName" "$step2" "$layers" "$stages" "" "$stages" "$stages" FALSE > "$Folder""/""$stepem2""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$stepem2" > "$Folder""/""$stepem2""_Monitor.log" &
wait
$directory"/ws_client.sh" processOptions "$appName" "$step" "$layers" "" "" "$stage_final" "" TRUE > "$appLogFolder""/""$steptd""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$steptd" > "$Folder""/""$steptd""_Monitor.log" &&
$directory"/ws_client.sh" processOptions "$appName" "$step2" "$layers" "" "" "$stage_final" "" TRUE > "$appLogFolder""/""$steptd2""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$steptd2" > "$Folder""/""$steptd2""_Monitor.log" &
wait
$directory"/ws_client.sh" processPaths "$appName" "$step" "$layers" "$genPath_01" > "$appLogFolder""/""$stepeg""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$stepeg" > "$Folder""/""$stepeg""_Monitor.log" &&
$directory"/ws_client.sh" processPaths "$appName" "$step2" "$layers" "$genPath_01" > "$appLogFolder""/""$stepeg2""_ProcessID.log" &&
$dir_model"/check_status.sh" "$Folder" "$stepeg2" > "$Folder""/""$stepeg2""_Monitor.log" &
wait
if (( ++i % 2 == 0))
then
echo "Waiting..."
wait
fi
done
Answer 5:
I understand your question like this:
You have a list of models. These models need to be run. After they are run, they have to be transferred. The simple solution is:
run_model model1
transfer_result model1
run_model model2
transfer_result model2
But to make this go faster, we want to parallelize parts. Unfortunately, transfer_result cannot be parallelized.
run_model model1
run_model model2
transfer_result model1
transfer_result model2
model1 and model2 are read from a text file. run_model can be run in parallel, and you would like 2 of those running in parallel. transfer_result can only be run one at a time, and you can only transfer a result when it has been computed.
This can be done like this:
cat models.txt | parallel -j2 'run_model {} && sem --id transfer transfer_model {}'
run_model {} && sem --id transfer transfer_model {} will run one model and, if it succeeds, transfer it. Transferring will only start if no other transfer is running.
parallel -j2 will run two of these jobs in parallel.
If a transfer takes less time than computing a model, then you should get no surprises: the transfers will at most be swapped with the next transfer. If a transfer takes longer than running a model, you might see that the models are transferred completely out of order (e.g. you might see the transfer of job 10 before the transfer of job 2). But they will all be transferred eventually.
You can see the execution sequence exemplified with this:
seq 10 | parallel -uj2 'echo ran model {} && sem --id transfer "sleep .{};echo transferred {}"'
This solution is better than the wait-based solution because you can run model3 while model1+2 are being transferred.
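If GNU Parallel is not available, a similar shape can be approximated with other tools. The sketch below is not the method from this answer: it swaps in xargs -P for the job slots and flock(1) for the mutex, both of which are assumptions about the environment, and run_model and transfer_result are hypothetical stand-in functions that only write to a log.

```bash
#!/usr/bin/env bash
# Assumes xargs with -P support and flock(1) from util-linux.
workdir=$(mktemp -d)
export log="$workdir/events.log" lock="$workdir/transfer.lock"

# Hypothetical stand-ins for the real commands.
run_model() { sleep 0.1; echo "ran $1" >> "$log"; }
transfer_result() {
    # flock runs the given command only while holding an exclusive lock on $lock.
    flock -x "$lock" bash -c '
        echo "transfer-start $1" >> "$log"
        sleep 0.1
        echo "transfer-end $1" >> "$log"
    ' _ "$1"
}
export -f run_model transfer_result

# Two pipelines at a time; each runs its model, then queues for the transfer lock.
printf '%s\n' model1 model2 model3 model4 |
    xargs -P2 -I{} bash -c 'run_model "$1" && transfer_result "$1"' _ {}
cat "$log"
```

One difference from the sem version: here a job slot stays occupied while its transfer waits for the lock and runs, so the next model can only start once one of the two pipelines has finished its transfer. The transfers themselves still never overlap.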