Bash shell test if all characters in one string ar

Posted 2019-08-01 08:01

Question:

I have two strings and want to test whether every character of the first also occurs in the second. The reference string, mychars, may contain extra characters that the tested string does not use.

mychars="abcdefg"
testone="abcdefgh"        # false: h is not in mychars
testtwo="abcddabc"        # true: all chars in testtwo are in mychars

function test() {
    if each char in $1 is in $2  # PSEUDO CODE
    then
      return 0   # 0 is success, i.e. "true" for if
    else
      return 1   # non-zero is failure
    fi
}

if test "$testone" "$mychars"; then
   echo "All in the string"
else
   echo "Not all in the string"
fi

# should echo "Not all in the string" because the h is not in the string mychars

if test "$testtwo" "$mychars"; then
   echo "All in the string"
else
   echo "Not all in the string"
fi

# should echo 'All in the string'

What is the best way to do this? My guess is to loop over all the chars in the first parameter.

Answer 1:

You can use tr to replace every character that occurs in mychars with a single symbol, then check whether the result consists of nothing but that symbol, for example:

tr -s "[$mychars]" "." <<< "ggaaabbbcdefg"

Outputs:

.

But:

tr -s "[$mychars]" "." <<< "xxxggaaabbbcdefgxxx"

Prints:

xxx.xxx

So, your function could be like the following:

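function test() {
    local dictionary="$1"   # reference characters come first, the string to test second
    local res=$(tr -s "[$dictionary]" "." <<< "$2")
    if [ "$res" == "." ]; then
        return 1   # note: 1 means failure in the shell, so a match is reported as false
    else
        return 0
    fi
}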

Update: As suggested by @mklement0, the whole function can be shortened, and its logic fixed, as follows. The [[ ... ]] test is the last command, so its exit status (0 on a match) becomes the function's:

function test() {
    local dictionary="$1"
    [[ '.' == $(tr -s "[$dictionary]" "." <<< "$2") ]] 
}
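
The tr approach has a few edge cases worth knowing about: the square brackets in "[$dictionary]" are literal characters for tr, so [ and ] in the input are accepted; an input made only of dots passes even when . is not in the dictionary; and an empty input fails. A minimal sketch of the loop guessed at in the question, which avoids those cases, might look like this. The name all_chars_in and the argument order (string first, allowed characters second, as in the question's calls) are assumptions for illustration, not part of the thread.

```bash
#!/usr/bin/env bash

# all_chars_in STRING ALLOWED   (hypothetical name, not from the thread)
# Succeeds (exit 0) if every character of STRING also occurs in ALLOWED.
all_chars_in() {
  local str=$1 allowed=$2 i char
  for (( i = 0; i < ${#str}; i++ )); do
    char=${str:i:1}
    # Quoting "$char" makes the match literal, so *, ?, [ and ] are safe.
    [[ $allowed == *"$char"* ]] || return 1
  done
  return 0
}

mychars="abcdefg"

if all_chars_in "abcdefgh" "$mychars"; then
  echo "All in the string"
else
  echo "Not all in the string"   # this branch runs: h is not in mychars
fi

if all_chars_in "abcddabc" "$mychars"; then
  echo "All in the string"       # this branch runs
else
  echo "Not all in the string"
fi
```

With this sketch an empty string counts as a match, since it contains no character outside the allowed set; whether that is the desired behavior depends on the use case.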


Answer 2:

The accepted answer's solution is short, clever, and efficient.

Here's a less efficient alternative, which may be of interest if you want to know which characters are unique to the 1st string, returned as a sorted, distinct list:

charTest() {
  local charsUniqueToStr1
  # Determine which chars. in $1 aren't in $2.
  # This returns a sorted, distinct list of chars., each on its own line.
  charsUniqueToStr1=$(comm -23 \
    <(sed 's/\(.\)/\1\'$'\n''/g' <<<"$1" | sort -u) \
    <(sed 's/\(.\)/\1\'$'\n''/g' <<<"$2" | sort -u))
  # The test succeeds if there are no chars. in $1 that aren't also in $2.
  [[ -z $charsUniqueToStr1 ]]
}

mychars="abcdefg" # define reference string

charTest "abcdefgh" "$mychars" 
echo $? # print exit code: 1 - 'h' is not in reference string

charTest "abcddabc" "$mychars"
echo $? # print exit code: 0 - all chars. are in reference string

Note that I've renamed test() to charTest() to avoid a name collision with the test builtin/utility.
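
To see why the rename matters, here is a small sketch of the collision in bash. The one-line function body is a placeholder chosen for illustration only.

```bash
#!/usr/bin/env bash

before=$(type -t test)   # what "test" resolves to before any function exists
test() { return 1; }     # placeholder function that shadows the builtin
after=$(type -t test)    # functions take precedence over regular builtins

echo "$before -> $after" # prints: builtin -> function

unset -f test            # remove the function so the builtin is reachable again
```

While such a function is defined, any later code in the same shell that calls test gets the function rather than the builtin.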

  • sed 's/\(.\)/\1\'$'\n''/g' splits the input into individual characters by placing each on a separate line.
    • Note that the command creates an extra empty line at the end, but that doesn't matter in this case; to eliminate it, append ; ${s/\n$//;} to the sed script.
    • The command is written in a POSIX-compliant manner, which complicates it, due to having to splice in an \-escaped actual newline (via an ANSI C-quoted string, $'\n'); if you have GNU sed, you can simplify to sed -r 's/(.)/\1\n/g'
  • sort -u then sorts the resulting list of characters and weeds out duplicates (-u).
  • comm -23 compares the two sorted, distinct sets of characters and prints those unique to the 1st string. comm uses a 3-column layout: the 1st column contains lines unique to the 1st file, the 2nd column lines unique to the 2nd file, and the 3rd column lines the two input files have in common. -23 suppresses the 2nd and 3rd columns, so only the lines unique to the 1st input are printed.
  • [[ -z $charsUniqueToStr1 ]] then tests whether $charsUniqueToStr1 is empty (-z);
    in other words, the function succeeds (exit code 0) if the 1st string contains no chars. that aren't also in the 2nd string, and fails (exit code 1) otherwise. Because the conditional ([[ .. ]]) is the last statement in the function, its exit code also becomes the function's exit code.
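
The pipeline described in the list above can also print the offending characters instead of only testing for them. The following is a rough sketch under two assumptions: the name missing_chars is made up for illustration, and grep -o (supported by GNU and BSD grep, but not required by POSIX) replaces the sed command for splitting a string into one character per line.

```bash
#!/usr/bin/env bash

# missing_chars STRING ALLOWED   (hypothetical name, not from the thread)
# Prints each distinct character of STRING that does not occur in ALLOWED,
# one per line, in sorted order; prints nothing if there is none.
missing_chars() {
  # grep -o . emits every character on its own line (assumes a grep with -o);
  # sort -u sorts and de-duplicates; comm -23 keeps lines unique to input 1.
  comm -23 \
    <(grep -o . <<<"$1" | sort -u) \
    <(grep -o . <<<"$2" | sort -u)
}

mychars="abcdefg"

missing_chars "abcdefgh" "$mychars"   # prints: h
missing_chars "abcddabc" "$mychars"   # prints nothing
missing_chars "hexagon"  "$mychars"   # prints h, n, o, x on separate lines
```

Unlike the sed version, grep -o does not produce a trailing empty line, so no cleanup step is needed here.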