How to properly escape a qsub command with long in

Posted 2019-07-20 21:53

Question:

I have a complex qsub command to run remotely.

PROJECT_NAME_TEXT="TEST PROJECT"
PACK_ORGANIZATION="--source-organization \'MY, ORGANIZATION\'"
CONTACT_NAME="--contact-name \'Tom Riddle\'"
PROJECT_NAME_PACK="--project-name \"${PROJECT_NAME_TEXT}\""


INPUTARGS="${PACK_ORGANIZATION} ${CONTACT_NAME} ${PROJECT_NAME_PACK}"

ssh mycluster "qsub -v argv="$INPUTARGS" -l walltime=10:00:00 -l vmem=8GB -l nodes=1:ppn=4 /myscript_path/run.script" 

The problem is that the remote cluster doesn't recognise the qsub command: it either reports an incorrect qsub command, or the job simply stays queued on the cluster because the input args are wrong.

It must be an escaping problem. My question is: how do I escape the command above properly?

Answer 1:

Try using a here-doc. You have a quote conflict (nested double quotes, which is an error):

#!/bin/bash

PROJECT_NAME_TEXT="TEST PROJECT"
PACK_ORGANIZATION="--source-organization \'MY, ORGANIZATION\'"
CONTACT_NAME="--contact-name \'Tom Riddle\'"
PROJECT_NAME_PACK="--project-name \"${PROJECT_NAME_TEXT}\""


INPUTARGS="${PACK_ORGANIZATION} ${CONTACT_NAME} ${PROJECT_NAME_PACK}"

ssh mycluster <<EOF
qsub -v argv="$INPUTARGS" -l walltime=10:00:00 -l vmem=8GB -l nodes=1:ppn=4 /myscript_path/run.script
EOF

As you can see, here-docs are really helpful for inputs with quotes.

See man bash | less +/'Here Documents'
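
One detail worth checking with here-docs is whether the delimiter is quoted. The sketch below uses cat as a local stand-in for ssh (an assumption made so it runs without a cluster) and a hypothetical variable GREETING. It shows that an unquoted EOF lets the local shell expand variables before the text is sent, while a quoted 'EOF' sends the text untouched.

```bash
#!/bin/bash
# GREETING is a hypothetical local variable used only for this demo.
GREETING="hello from local"

# Unquoted delimiter: the LOCAL shell expands $GREETING before sending.
expanded=$(cat <<EOF
echo "$GREETING"
EOF
)

# Quoted delimiter: the text is sent literally; the receiver would expand it.
literal=$(cat <<'EOF'
echo "$GREETING"
EOF
)

printf 'unquoted delimiter sends: %s\n' "$expanded"
printf 'quoted delimiter sends:   %s\n' "$literal"
```

With ssh in place of cat, the first form is the one that carries local values such as $INPUTARGS to the remote side.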

Edit

From your comments:


I used this method but it gives me "Pseudo-terminal will not be allocated because stdin is not a terminal."

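You can suppress this warning with

ssh mycluster <<EOF 2>/dev/null

(or pass the -T switch to ssh, which disables pseudo-terminal allocation so the warning is not printed; note that 2>/dev/null also hides genuine error messages)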


If you have

-bash: line 2: EOF: command not found

I think you have a copy-paste problem. Try removing any trailing spaces at the end of each line; the closing EOF must be alone on its line.
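
A quick way to look for such stray whitespace is sketched below. The helper name find_trailing_ws and the throwaway demo file are illustrative assumptions, not part of the original script.

```bash
#!/bin/bash
# Print (with line numbers) every line of a file that ends in whitespace.
# find_trailing_ws is a hypothetical helper name.
find_trailing_ws() {
    grep -n '[[:space:]]$' "$1"
}

# Demo on a throwaway file whose closing delimiter has a trailing space.
demo=$(mktemp)
printf '%s\n' 'ssh mycluster <<EOF' 'echo hello' 'EOF ' > "$demo"
report=$(find_trailing_ws "$demo")
rm -f "$demo"

printf 'lines with trailing whitespace: %s\n' "$report"
```

Any line it reports that is meant to be the closing delimiter will not terminate the here-doc, so the remote shell receives EOF as a command.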


And it seems this method cannot pass local variable $INPUTARGS to the remote cluster

This seems related to your EOF problem.


$argv returns nothing on remote cluster

What does this mean? $argv is not a predefined variable in bash. If you need to list the command-line arguments, use the predefined variable $@.
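
For context, qsub -v hands the listed variables to the job through its environment, so a job script only sees $argv if the submission actually carried it. The sketch below imitates that locally with bash -c standing in for the job (an assumption; no scheduler is involved), and the value "--processes 4" is only an example.

```bash
#!/bin/bash
# Stand-in for a job script: prints the environment variable "argv".
job_script='printf "argv=[%s]\n" "$argv"'

# Variable placed in the child's environment, roughly as qsub -v argv=... would do.
with_var=$(argv="--processes 4" bash -c "$job_script")

# Variable absent: the job sees an empty value.
without_var=$(unset argv; bash -c "$job_script")

echo "$with_var"
echo "$without_var"
```

An empty $argv in the job therefore points at the submission side (the quoting of the qsub command) rather than at the job script itself.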


One last thing: make sure you are using bash.



Answer 2:

Your problem is not the length, but the nesting of your quotes - in this line, you are trying to use " inside ", which won't work:

ssh mycluster "qsub -v argv="$INPUTARGS" -l walltime=10:00:00 -l vmem=8GB -l nodes=1:ppn=4 /myscript_path/run.script"

Bash will see this as "qsub -v argv=" followed by $INPUTARGS (not quoted), followed by " -l walltime=10:00:00 -l vmem=8GB -l nodes=1:ppn=4 /myscript_path/run.script".
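
The splitting described above can be checked locally. In this sketch, count_args and join_like_ssh are hypothetical helpers, and INPUTARGS is shortened to one option for readability. join_like_ssh imitates the fact that ssh joins its command arguments with single spaces before handing them to the remote shell.

```bash
#!/bin/bash
# Hypothetical helpers for inspecting how the local shell splits the command.
count_args()    { echo "$#"; }
join_like_ssh() { printf '%s\n' "$*"; }   # ssh joins its arguments with spaces

INPUTARGS="--contact-name 'Tom Riddle'"

# The inner quotes close and reopen the string, so $INPUTARGS is unquoted
# and gets word-split by the local shell.
local_words=$(count_args "qsub -v argv="$INPUTARGS" -l walltime=10:00:00")
remote_cmd=$(join_like_ssh "qsub -v argv="$INPUTARGS" -l walltime=10:00:00")

echo "words produced locally: $local_words"
echo "string the remote shell parses: $remote_cmd"
```

The remote shell then parses argv=--contact-name as the whole value of -v and treats 'Tom Riddle' as a separate qsub argument.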

It's possible that backslash-escaping those inner quotes will have the desired effect, but nesting quotes in bash can get rather confusing. What I often try to do is add an echo at the beginning of the command, to show how the various stages of expansion pan out. e.g.

echo 'As expanded locally:'
echo ssh mycluster "qsub -v argv=\"$INPUTARGS\" -l walltime=10:00:00 -l vmem=8GB -l nodes=1:ppn=4 /myscript_path/run.script"
echo 'As expanded remotely:'
ssh mycluster "echo qsub -v argv=\"$INPUTARGS\" -l walltime=10:00:00 -l vmem=8GB -l nodes=1:ppn=4 /myscript_path/run.script"
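
A different technique from the echo approach above, offered only as a sketch: bash's printf %q can produce a shell-quoted form of each argument, which survives one round of re-parsing. Here eval stands in for the remote shell, show_args is a hypothetical helper, and the argument values are taken from the question.

```bash
#!/bin/bash
# Keep each argument as its own array element instead of one long string.
args=(--source-organization "MY, ORGANIZATION"
      --contact-name "Tom Riddle"
      --project-name "TEST PROJECT")

# %q quotes every argument so that a shell re-parsing it gets the same words.
quoted=$(printf '%q ' "${args[@]}")
echo "quoted form: $quoted"

# Hypothetical helper: show each received argument in angle brackets.
show_args() { printf '<%s>' "$@"; echo; }

# eval re-parses the string once, as a remote shell would.
roundtrip=$(eval "show_args $quoted")
echo "$roundtrip"
```

%q output is meant for bash; if the remote login shell is not bash, unusual characters (control characters, for example) may not round-trip. Passing the arguments through qsub -v adds another layer of parsing that this sketch does not cover.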


Answer 3:

Thanks for all the answers; however, those methods do not work in my case. I am answering this myself because the problem is fairly complex. I got the clue from existing solutions on Stack Overflow.

There are two problems that must be solved in my case.

  1. Pass the local program's parameters to the remote cluster. The here-doc solution doesn't work in this case.

  2. Run qsub on the remote cluster with a long variable as its argument, where the variable contains quote characters.

Problem 1.

First, here is how my script, which runs on the local machine, takes its parameters:

scripttoberunoncluster.py --source-organisation "My_organization_my_department" --project-name "MyProjectName"  --processes 4 /targetoutputfolder/

The real parameter list is much longer than the one above, and all the parameters must be sent to the remote machine. They are sent in a file like this:

PROJECT_NAME="MyProjectName"
PACK_ORGANIZATION="--source-organization '\\\"My_organization_my_department\\\"'" # multiple layers of escaping, remove all the spaces
PROJECT_NAME_PACK="--project-name '\\\"${PROJECT_NAME}\\\"'"
PROCESSES="--processes 4"
TARGET_FOLDER_PACK="/targetoutputfolder/"

INPUTARGS="${PACK_ORGANIZATION} ${PROJECT_NAME_PACK} ${PROCESSES} ${TARGET_FOLDER_PACK}"

echo $INPUTARGS > "TempPath/temp.par"
scp "TempPath/temp.par" "remotecluster:/remotepath/"

My solution is something of a compromise, but this way the remote cluster can run the script with arguments that contain quote characters. If you don't put all your variables (the parameters) in a file and transfer it to the remote cluster, the quote characters will be removed, no matter how you pass them in a variable.
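
A variation on the file idea, sketched here as an alternative rather than as the method used above: write one argument per line and read the lines back into an array, so that no escaping layers are needed inside the file. The array names and the temporary file are illustrative assumptions, and the approach assumes no argument contains a newline.

```bash
#!/bin/bash
# Arguments as separate array elements (values are examples).
args=(--source-organization "My organization, my department"
      --project-name "MyProjectName"
      --processes 4
      /targetoutputfolder/)

# Write one argument per line; this file is what would be copied with scp.
parfile=$(mktemp)
printf '%s\n' "${args[@]}" > "$parfile"

# On the receiving side: read the lines back into an array.
restored=()
while IFS= read -r line; do
    restored+=("$line")
done < "$parfile"
rm -f "$parfile"

echo "restored ${#restored[@]} arguments; second one is: ${restored[1]}"
```

The receiving script could then call the program with "${restored[@]}", which keeps spaces and commas inside each argument intact.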

Problem 2.

This is how qsub is run on the remote cluster:

ssh remotecluster "qsub -v argv=\"`cat /remotepath/temp.par`\" -l walltime=10:00:00 /remotepath/my.script"

And in my.script:

INPUT_ARGS=`echo $argv`

python "/pythonprogramlocation/scripttoberunoncluster.py" $INPUT_ARGS  # note: $INPUT_ARGS is deliberately left unquoted


Answer 4:

The escaping problem described here comes down to preserving the final quotes around the arguments through two rounds of evaluation, i.e. after two evaluations we should see something like:

--source-organization "My_organization_my_department" --project-name "MyProjectName" --processes 4 /targetoutputfolder/

This can be achieved in code by first putting each argument in a separate variable and then enclosing the argument in single quotes, replacing any single quote inside the argument string with '\''. (Strictly speaking, this splits the argument into several adjacent quoted strings, but POSIX-style shells concatenate adjacent strings into one word when they evaluate them.) This quoting step has to be repeated three times.

{
escsquote="'\''"

PROJECT_NAME="MyProjectName"

myorg="My_organization_my_department"
myorg="'${myorg//\'/${escsquote}}'" # bash
myorg="'${myorg//\'/${escsquote}}'"
myorg="'${myorg//\'/${escsquote}}'"
PACK_ORGANIZATION="--source-organization ${myorg}"

pnp="${PROJECT_NAME}"
pnp="'${pnp//\'/${escsquote}}'"
pnp="'${pnp//\'/${escsquote}}'"
pnp="'${pnp//\'/${escsquote}}'"
PROJECT_NAME_PACK="--project-name ${pnp}"

PROCESSES="--processes 4"
TARGET_FOLDER_PACK="/targetoutputfolder/"

INPUTARGS="${PACK_ORGANIZATION} ${PROJECT_NAME_PACK} ${PROCESSES} ${TARGET_FOLDER_PACK}"

echo "$INPUTARGS"
eval echo "$INPUTARGS" 
eval eval echo "$INPUTARGS"
echo
ssh -T localhost <<EOF
echo qsub -v argv="$INPUTARGS" -l walltime=10:00:00 -l vmem=8GB -l nodes=1:ppn=4 /myscript_path/run.script
EOF

}
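
The layering idea can be seen in isolation with a small sketch. The helper name squote and the sample value are assumptions made for illustration; each eval below plays the role of one round of shell evaluation and removes exactly one layer of quoting.

```bash
#!/bin/bash
# Wrap a string in single quotes, turning each embedded ' into '\''.
# squote is a hypothetical helper name.
squote() {
    local s=$1 q="'" esc="'\\''"
    printf "'%s'" "${s//$q/$esc}"
}

value="Tom's project"
once=$(squote "$value")    # survives one round of evaluation
twice=$(squote "$once")    # survives two rounds of evaluation

# Each eval strips one layer of quoting.
strip1=$(eval "printf '%s' $twice")
strip2=$(eval "printf '%s' $strip1")

echo "one layer:  $once"
echo "two layers: $twice"
echo "after two evaluations: $strip2"
```

Adding one call to squote per expected round of evaluation is the same bookkeeping the block above does with three repeated substitutions.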

For further information please see:

  • Quotes exercise - how to do ssh inside ssh whilst running sql inside second ssh?
  • Quoting in ssh $host $FOO and ssh $host "sudo su user -c $FOO" type constructs.