$SGE_TASK_ID not getting set with qsub array grid

Posted 2019-07-18 00:22

With a very simple zsh script:

#!/bin/zsh

nums=(1 2 3)
num=$nums[$SGE_TASK_ID]

$SGE_TASK_ID is the Sun Grid Engine task ID. I'm using qsub to submit an array of jobs.

I am following what is advised in the qsub manpage (http://www.clusterresources.com/torquedocs/commands/qsub.shtml#t) and submitting my array job as

#script name: job_script.sh
qsub job_script.sh -t 1-3

$SGE_TASK_ID is not being set for this array job... does anyone have any ideas why?

Thanks!

4 answers
Evening l夕情丶
#2 · 2019-07-18 00:48

In bash, you need to surround the array reference with curly braces (zsh also accepts the brace-less $nums[...] form):

SGE_TASK_ID=2
nums=(1 2 3)
num=${nums[$SGE_TASK_ID]}
echo "num: $num"
# prints "num: 3" in bash (0-indexed); zsh, which is 1-indexed by default, prints "num: 2"

The Linux Documentation Project has the best shell scripting docs

聊天终结者
#3 · 2019-07-18 00:51

Try submitting the job like this:

qsub -t 1-3 job_script.sh

and see what happens.

Observe:

qsub -sync y job_script.sh -t 1-3
Your job 74578 ("job_script.sh") has been submitted
Job 74578 exited with exit code 0.

vs.

qsub -sync y -t 1-3 job_script.sh
Your job-array 74579.1-3:1 ("job_script.sh") has been submitted
Job 74579.3 exited with exit code 0.
Job 74579.1 exited with exit code 0.
Job 74579.2 exited with exit code 0.

Note that Torque (the source of the man page referenced in your question) is a little different from SGE. My SGE man page definitely suggests putting all options before the command. Also, SGE doesn't like the "%" syntax for limiting the maximum number of simultaneous jobs, but mine at least lets me say -tc NNN to specify the limit (not mentioned in the man page, but listed in qsub -help).
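To make the "options first" ordering harder to get wrong, the command line can be assembled by a small wrapper. This is only a sketch: array_qsub_cmd is a made-up helper name, it assumes bash, and it uses nothing beyond the -t and -tc options mentioned above. It prints the command rather than running it, so it can be checked before submitting.

```bash
#!/bin/bash
# Hypothetical helper (not part of SGE): print a qsub command line
# with every option placed ahead of the script name.
array_qsub_cmd() {
    local range=$1 script=$2 max_running=${3:-}
    local cmd=(qsub -t "$range")
    # -tc is the concurrency limit mentioned above; add it only when requested.
    if [ -n "$max_running" ]; then
        cmd+=(-tc "$max_running")
    fi
    # The script name always goes last, after all options.
    cmd+=("$script")
    echo "${cmd[*]}"
}

# Usage:
#   array_qsub_cmd 1-3 job_script.sh      prints: qsub -t 1-3 job_script.sh
#   array_qsub_cmd 1-3 job_script.sh 2    prints: qsub -t 1-3 -tc 2 job_script.sh
```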

Fickle 薄情
#4 · 2019-07-18 00:52

Thanks, everyone, for the answers. I found a solution that works:

Depending on how the cluster is set up, the scheduler might use a different variable name for array IDs. This was the case for me. I found out which variable to use by doing the following:

#script name: job_script.sh

#!/bin/zsh
env >> ~/job_env
set >> ~/job_env

This dumps every environment and shell variable visible to the job into a file called ~/job_env. Look through the file for a variable whose value changes with each task in the array; that is the array ID. It should not be hard to find.
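If the dump is long, a filter on the variable names can narrow it down. A rough sketch, assuming bash and grep, and assuming the array ID variable has TASK or ARRAY somewhere in its name (true for $SGE_TASK_ID and $PBS_ARRAYID, but not guaranteed on every scheduler); find_array_vars is a made-up name:

```bash
#!/bin/bash
# Hypothetical helper: list variables in an env dump whose names
# suggest an array task ID.
find_array_vars() {
    local dump_file=$1
    # Keep NAME=value lines whose NAME contains TASK or ARRAY (case-insensitive).
    # The pattern stops at the first "=", so values are never matched.
    grep -iE '^[A-Za-z0-9_]*(TASK|ARRAY)[A-Za-z0-9_]*=' "$dump_file"
}

# Usage: find_array_vars ~/job_env
```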

Remember to submit the job_script.sh with qsub as follows:

qsub -t 1-3 job_script.sh

In my case, the variable that was set was $PBS_ARRAYID, which is the name Torque uses, so my cluster's qsub was probably Torque rather than SGE. $SGE_TASK_ID should work on standard SGE setups.
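A job script that has to run on either kind of cluster could check both names. A minimal sketch, assuming bash; resolve_task_id is a made-up helper, and the check for the literal value "undefined" assumes (as far as I know from the SGE man page) that SGE sets $SGE_TASK_ID to that string for non-array jobs:

```bash
#!/bin/bash
# Hypothetical helper: print the array task ID from whichever
# scheduler variable is set, or fail if neither is usable.
resolve_task_id() {
    local id
    # SGE sets SGE_TASK_ID; the cluster described above set PBS_ARRAYID instead.
    for id in "${SGE_TASK_ID:-}" "${PBS_ARRAYID:-}"; do
        # Skip empty values and the "undefined" placeholder (assumed SGE behaviour).
        if [ -n "$id" ] && [ "$id" != "undefined" ]; then
            echo "$id"
            return 0
        fi
    done
    echo "no array task ID found; was the job submitted with -t?" >&2
    return 1
}

# Usage inside a job script:
#   task_id=$(resolve_task_id) || exit 1
```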

Cheers!

Evening l夕情丶
#5 · 2019-07-18 01:00

To access a position in your array, you have to do it like this: ${the_array[$the_position]}.

So in your case,

num=${nums[$SGE_TASK_ID]}

Test:

$ nums=(1 2 3)
$ SGE_TASK_ID=1
$ echo ${nums[$SGE_TASK_ID]}
2

Note that in bash the first position is index 0 (zsh arrays start at 1 by default).
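Because task IDs from -t 1-3 start at 1 while bash arrays start at 0, a bash job script would need to subtract one before indexing. A minimal sketch, assuming bash (in zsh with default options the task ID can be used as the index directly); pick_for_task is a made-up name:

```bash
#!/bin/bash
# Hypothetical helper: map a 1-based array task ID onto a 0-based bash array.
nums=(1 2 3)

pick_for_task() {
    local task_id=$1
    # Task IDs from "-t 1-3" are 1, 2, 3; bash indices are 0, 1, 2.
    echo "${nums[$((task_id - 1))]}"
}

# Usage inside a job script:
#   num=$(pick_for_task "$SGE_TASK_ID")
```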
