bash regex with quotes?

Posted 2018-12-31 20:42

Question:

The following code

number=1
if [[ $number =~ [0-9] ]]
then
  echo matched
fi

works. If I try to use quotes in the regex, however, it stops:

number=1
if [[ $number =~ "[0-9]" ]]
then
  echo matched
fi

I tried "\[0-9\]", too. What am I missing?

Funnily enough, the Advanced Bash-Scripting Guide suggests this should work.

Bash version 3.2.39.

Answer 1:

It was changed between 3.1 and 3.2. Guess the advanced guide needs an update.

This is a terse description of the new features added to bash-3.2 since the release of bash-3.1. As always, the manual page (doc/bash.1) is the place to look for complete descriptions.

  1. New Features in Bash

snip

f. Quoting the string argument to the [[ command's =~ operator now forces string matching, as with the other pattern-matching operators.

Sadly, this will break existing scripts that quote the regex, unless you had the foresight to store patterns in variables and use those instead of writing the regexes inline. Example below.

$ bash --version
GNU bash, version 3.2.39(1)-release (i486-pc-linux-gnu)
Copyright (C) 2007 Free Software Foundation, Inc.
$ number=2
$ if [[ $number =~ "[0-9]" ]]; then echo match; fi
$ if [[ $number =~ [0-9] ]]; then echo match; fi
match
$ re="[0-9]"
$ if [[ $number =~ $re ]]; then echo MATCH; fi
MATCH

$ bash --version
GNU bash, version 3.00.0(1)-release (i586-suse-linux)
Copyright (C) 2004 Free Software Foundation, Inc.
$ number=2
$ if [[ $number =~ "[0-9]" ]]; then echo match; fi
match
$ if [[ "$number" =~ [0-9] ]]; then echo match; fi
match
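
A minimal sketch of the variable approach as a reusable function, in case it helps. The name `has_digit` is made up for illustration; the only important part is that the variable is expanded without quotes on the right-hand side of `=~`, which should behave the same way before and after the 3.2 change.

```bash
#!/usr/bin/env bash
# has_digit is a hypothetical helper name used only for this sketch.
has_digit() {
  local re='[0-9]'   # the quotes here only protect the assignment
  [[ $1 =~ $re ]]    # unquoted expansion: the value is used as a regex
}

for value in 2 abc a1b; do
  if has_digit "$value"; then
    echo "$value: match"
  else
    echo "$value: no match"
  fi
done
```

Quoting the expansion (`"$re"`) would turn this back into a literal string comparison on 3.2 and later, so the missing quotes are deliberate.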


Answer 2:

Bash 3.2 introduced a compatibility option, compat31, which reverts bash's regular expression quoting behavior back to that of 3.1.

Without compat31:

$ shopt -u compat31
$ shopt compat31
compat31        off
$ set -x
$ if [[ "9" =~ "[0-9]" ]]; then echo match; else echo no match; fi
+ [[ 9 =~ \[0-9] ]]
+ echo no match
no match

With compat31:

$ shopt -s compat31
+ shopt -s compat31
$ if [[ "9" =~ "[0-9]" ]]; then echo match; else echo no match; fi
+ [[ 9 =~ [0-9] ]]
+ echo match
match

Link to patch: http://ftp.gnu.org/gnu/bash/bash-3.2-patches/bash32-039
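
If you only want the old behavior for one test, a sketch like the following keeps the option from leaking into the rest of the script by running the match in a subshell. `quoted_match` is a hypothetical helper name, and the fallback message is my own addition for shells where `shopt -s compat31` is not accepted.

```bash
#!/usr/bin/env bash
# quoted_match is a hypothetical helper; $1 is "on" or "off" and says
# whether compat31 should be enabled before the quoted-regex test.
quoted_match() {
  (
    # The parentheses start a subshell, so the shopt change is discarded
    # when the function returns.
    if [[ $1 == on ]]; then
      shopt -s compat31 2>/dev/null || { echo "compat31 unavailable"; exit 0; }
    fi
    if [[ 9 =~ "[0-9]" ]]; then echo match; else echo "no match"; fi
  )
}

quoted_match off
quoted_match on
```

On bash 3.2 and later the `off` call should print `no match`; the `on` call should print `match` wherever compat31 is supported.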



Answer 3:

GNU bash, version 4.2.25(1)-release (x86_64-pc-linux-gnu)

Some examples of string match and regex match

    $ if [[ 234 =~ "[0-9]" ]]; then echo matches;  fi # string match
    $ 

    $ if [[ 234 =~ [0-9] ]]; then echo matches;  fi # regex match
    matches


    $ var="[0-9]"

    $ if [[ 234 =~ $var ]]; then echo matches;  fi # regex match
    matches


    $ if [[ 234 =~ "$var" ]]; then echo matches;  fi # string match after substituting $var as [0-9]

    $ if [[ 'rss$var919' =~ "$var" ]]; then echo matches;  fi   # string match after substituting $var as [0-9]

    $ if [[ 'rss$var919' =~ $var ]]; then echo matches;  fi # regex match after substituting $var as [0-9]
    matches


    $ if [[ "rss\$var919" =~ "$var" ]]; then echo matches;  fi # string match won't work

    $ if [[ "rss\\$var919" =~ "$var" ]]; then echo matches;  fi # string match won't work


    $ if [[ "rss'$var'""919" =~ "$var" ]]; then echo matches;  fi # $var is substituted on LHS & RHS and then string match happens
    matches

    $ if [[ 'rss$var919' =~ "\$var" ]]; then echo matches;  fi # string match !
    matches



    $ if [[ 'rss$var919' =~ "$var" ]]; then echo matches;  fi # string match failed
    $ 

    $ if [[ 'rss$var919' =~ '$var' ]]; then echo matches;  fi # string match
    matches



    $ echo $var
    [0-9]

    $ 

    $ if [[ abc123def =~ "[0-9]" ]]; then echo matches;  fi

    $ if [[ abc123def =~ [0-9] ]]; then echo matches;  fi
    matches

    $ if [[ 'rss$var919' =~ '$var' ]]; then echo matches;  fi # string match due to single quotes on RHS $var matches $var
    matches


    $ if [[ 'rss$var919' =~ $var ]]; then echo matches;  fi # Regex match
    matches
    $ if [[ 'rss$var' =~ $var ]]; then echo matches;  fi # Above e.g. really is regex match and not string match
    $


    $ if [[ 'rss$var919[0-9]' =~ "$var" ]]; then echo matches;  fi # string match RHS substituted and then matched
    matches

    $ if [[ 'rss$var919' =~ "'$var'" ]]; then echo matches;  fi # trying to string match '$var' fails


    $ if [[ '$var' =~ "'$var'" ]]; then echo matches;  fi # string match still fails as single quotes are omitted on RHS

    $ if [[ \'$var\' =~ "'$var'" ]]; then echo matches;  fi # this string match works as single quotes are included now on RHS
    matches
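
The same rule can be put to use rather than worked around: on 3.2 and later, only the quoted part of the right-hand side is taken literally, so quoted and unquoted pieces can be mixed in one pattern. A small sketch, where `is_version` is a made-up function name and the `vN.N` format is just an example:

```bash
#!/usr/bin/env bash
# is_version is a hypothetical helper for this sketch.
# The quoted "." is matched as a literal dot; the unquoted parts
# (^, [0-9]+, $) keep their regex meaning.
is_version() {
  [[ $1 =~ ^v[0-9]+"."[0-9]+$ ]]
}

for s in v1.2 v1x2 v10.42; do
  if is_version "$s"; then
    echo "$s: matches"
  else
    echo "$s: no match"
  fi
done
```

With an unquoted `.` the second string would match too, since `.` would then stand for any character.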


Answer 4:

As mentioned in other answers, putting the regular expression in a variable is a general way to achieve compatibility over different bash versions. You may also use this workaround to achieve the same thing, while keeping your regular expression within the conditional expression:

$ number=1
$ if [[ $number =~ $(echo "[0-9]") ]]; then echo matched; fi
matched
$