How to iterate over double-quoted strings in POSIX

Posted 2019-04-16 00:08

Question:

I am trying to check that all the non-POSIX commands my script depends on are present before the script proceeds with its main job. This helps ensure that the script does not fail later because of a missing command.

I want to keep the list of all such non-POSIX commands in a variable called DEPS so that, as the script evolves and depends on more commands, I only have to edit this variable.

I want the script to support commands with spaces in them, e.g. my program.

This is my script.

#!/bin/sh
DEPS='ssh scp "my program" sftp'

for i in $DEPS
do
    echo "Checking $i ..."
    if ! command -v "$i"
    then
        echo "Error: $i not found"
    else
        echo "Success: $i found"
    fi
    echo
done

However, this doesn't work: when the for loop iterates, "my program" is split into two words, "my and program", with the quote characters kept as literal text, as you can see in the output below.

# sh foo.sh
Checking ssh ...
/usr/bin/ssh
Success: ssh found

Checking scp ...
/usr/bin/scp
Success: scp found

Checking "my ...
Error: "my not found

Checking program" ...
Error: program" not found

Checking sftp ...
/usr/bin/sftp
Success: sftp found

The output I expected is:

# sh foo.sh
Checking ssh ...
/usr/bin/ssh
Success: ssh found

Checking scp ...
/usr/bin/scp
Success: scp found

Checking my program ...
Error: my program not found

Checking sftp ...
/usr/bin/sftp
Success: sftp found

How can I solve this problem while keeping the script POSIX compliant?

Answer 1:

I'll repeat the answer I gave to your previous question: use a while loop with a here document rather than a for loop. You can embed newlines in a string, which is all you need to separate command names in a string if those command names might contain whitespace. (If your command names contain newlines, strongly consider renaming them.)

For maximum POSIX compatibility, use printf, since the POSIX specification of echo is remarkably lax due to differences in how echo was implemented in various shells prior to the definition of the standard.

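# One command name per line; no trailing newline, so no empty entry is read.
deps="ssh
scp
my program
sftp"
while read -r cmd; do
    # Pass the name as an argument, not inside the printf format string.
    printf 'Checking %s ...\n' "$cmd"
    if ! command -v "$cmd"; then
        printf 'Error: %s not found\n' "$cmd"
    else
        printf 'Success: %s found\n' "$cmd"
    fi
    printf '\n'
done <<EOF
$deps
EOF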


Answer 2:

This happens because the only steps applied to the result of an unquoted parameter expansion are field splitting (word splitting) and pathname expansion (globbing) -- not syntax-level parsing such as quote handling. To go all the way back to the beginning of the parsing process, you need to use eval.
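
To make the difference concrete, here is a minimal sketch (the variable names and the dependency list are made up for illustration) that counts the words produced by plain unquoted expansion and by eval on the same string:

```shell
#!/bin/sh
# Illustrative list; the quotes are part of the string's data.
deps='ssh "my program" sftp'

# Unquoted expansion: only field splitting and globbing are applied,
# so the double quotes stay attached to the words as literal characters.
set -- $deps
split_count=$#
split_second=$2

# eval re-parses the string as shell source, so the quotes group words again.
eval "set -- $deps"
eval_count=$#
eval_second=$2

printf 'field splitting: %s words, second is <%s>\n' "$split_count" "$split_second"
printf 'eval:            %s words, second is <%s>\n' "$eval_count" "$eval_second"
```

The first line reports 4 words with `"my` as the second one; the second line reports 3 words with `my program` as the second one.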


Frankly, the best approaches are to either:

  1. Target a shell that supports arrays (ksh, bash, zsh, etc.) rather than trying to support POSIX, or
  2. Avoid retrieving the list from a variable in the first place.

...there's a reason proper array support is ubiquitous in modern shells; writing unambiguously correct code, particularly when handling untrusted data, is much harder without it.
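
As a rough sketch of the second option, assuming the list can live directly in the script source rather than in a string variable, the names can be written straight into set --, where ordinary quoting applies and no eval is needed (the command names are placeholders taken from the question):

```shell
#!/bin/sh
# The list is shell source here, so "my program" is quoted by the parser itself.
set -- ssh scp "my program" sftp

dep_count=$#
third=$3

for program; do
    printf 'Checking %s ...\n' "$program"
done
```

The trade-off is that the list replaces the script's own positional parameters unless it is set inside a function.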


That said, you have the option of using $@ to store your contents, which can be set, albeit dangerously, using eval:

deps='goodbye "cruel world"'
eval "set -- $deps"
for program; do
  echo "processing $program"
done

If you do this inside of a function, you'll override only the function's argument list, leaving the global list unmodified.

Alternately, eval "yourfunction $deps" will have the same effect, setting the argument list within the function to the results of running all the usual parsing and expansion phases on the contents of $deps.
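
A minimal sketch of that function-based variant follows. The name check_deps and the sample list are hypothetical, and the string passed to eval is assumed to be fully controlled by the script:

```shell
#!/bin/sh
# check_deps is a hypothetical helper; each argument is one command name.
check_deps() {
    missing=0
    for program; do
        if command -v "$program" >/dev/null 2>&1; then
            printf 'Success: %s found\n' "$program"
        else
            printf 'Error: %s not found\n' "$program"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]    # status 0 only when nothing is missing
}

deps='sh "my program"'      # trusted, script-controlled string

set -- one two              # stand-in for the script's own arguments
if eval "check_deps $deps"; then
    result=ok
else
    result=missing
fi
outer_count=$#              # still 2: only the function saw the parsed list
```

Because the parsed list only ever exists as the function's arguments, the caller's "$@" is left alone.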



Answer 3:

Because the script is under your control, you can use eval with reasonable safety, so @Charles Duffy's answer is a simple and good solution. Use it. :)

Also, consider using autoconf to generate the usual configure script, which does a good job of what you need, e.g. checking for commands and much more. At the very least, look at some configure scripts for ideas on how to solve common problems.

If you want to play with your own implementation:

  • divide the dependencies into two groups
    • core_deps - Unix tools that are commonly needed by the script itself, like sed, cat, cp and such. Those programs don't contain spaces in their names, nor in the $PATH.
    • runtime_deps - programs that are needed by your application, but not by the script itself.
  • do the checks in two steps (or more, for example if you also need to check for libraries)
  • never use a for loop over space-delimited elements unless you are getting them as function arguments, in which case you can use "$@"

A starting script could look something like the following:

_check_core_deps() {
    for _cmd
    do
        _cpath=$(command -v "$_cmd")
        case "$_cpath" in
        /*) continue;;
        *) echo "Missing install dependency [$_cmd] - can't continue" ; exit 1 ;;
        esac
    done
    return 0
}

core_deps="grep sed hooloovoo cp"   # list of "core" commands; their names contain no spaces
_check_core_deps $core_deps || exit 1

The above will blow up on the nonexistent "hooloovoo" command. :)

Once the check passes, you can safely continue: all core commands needed for the install script are available. In the next step, you can check the other, stranger dependencies.

Some ideas:

# function that returns your dependencies as lines from a here-document
# (so an entry can contain any character except "\n")
# you can decorate the dependencies with comments,
# because sed was checked in the 1st step and can be used here
# if you want, you can add "fields" too, separated by a chosen delimiter, for extended functionality
list_deps() {
    _sptab=$(printf " \t")  # $' \t' was approved by POSIX only for a later revision
    # sed removes comments and empty lines
    # the UUOC (useless use of cat) is intentional here,
    # for example if you want to add "tr" before the "sed"
    # of course, you can remove it...
    cat - <<DEPS |sed "s/[$_sptab]*#.*//;/^[$_sptab]*$/d"
########## DEPENDENCIES ############
#some comment
ssh
scp
sftp
        #comment
#bla bla
my program  #some comment
/Applications/Some Long And Spaced OSX Application.app
DEPS
########## END of DEPENDENCIES #####
}

_check_deps() {
#in the "while" loop you can set IFS=: or similar and add another variable to read
#to get more fields for extended functionality
list_deps | while read -r line
do
    #do any checks with the line
    #implement additional functionality as functions
    #etc...
    #remember - you're in a subshell here
    printf "command:%s\n" "$line"
done
} 

_check_deps
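
The subshell remark matters as soon as the loop has to report something back. The sketch below is only an illustration: list_deps is replaced by a hypothetical three-line stand-in, and the two missing names are assumed not to exist on the system. It counts missing commands once through a pipeline and once through a here-document fed by command substitution:

```shell
#!/bin/sh
# Hypothetical stand-in for list_deps: prints one dependency per line.
list_deps() {
    printf '%s\n' 'sh' 'my program' 'no-such-command-xyz'
}

# Pipeline form: many shells run the loop in a subshell,
# so the count is typically lost when the loop ends.
piped_missing=0
list_deps | while read -r line; do
    command -v "$line" >/dev/null 2>&1 || piped_missing=$((piped_missing + 1))
done

# Here-document form: the loop runs in the current shell, so the count survives.
missing=0
while read -r line; do
    command -v "$line" >/dev/null 2>&1 || missing=$((missing + 1))
done <<EOF
$(list_deps)
EOF

printf 'after pipeline: %s, after here-document: %s\n' "$piped_missing" "$missing"
```

In shells such as dash or bash the pipeline count stays at 0, while some shells (ksh, zsh) run the last pipeline stage in the current shell, so only the here-document form is dependable.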

One more thing :) (or two)

  • if you are unsure about the contents of a variable, don't use echo. POSIX doesn't define how it should behave when its argument contains escape sequences (e.g. echo "some\nwed"). Use:
printf '%s' "$variable"
  • never use uppercase-only variable names like "DEPS"; by convention they're for environment variables.
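
A small sketch of the first point, using a made-up value that contains both a backslash sequence and something that looks like an option:

```shell
#!/bin/sh
# Illustrative value; lowercase name, since it is not an environment variable.
variable='some\nword -n'

# With '%s' the value is only data: backslashes and dashes are printed as-is.
printed=$(printf '%s\n' "$variable")

printf '%s\n' "$printed"
```

This prints some\nword -n literally in every POSIX shell, whereas echo "$variable" would print a real line break in dash and the literal backslash sequence in bash's default mode.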