How to parse $QUERY_STRING from a bash CGI script

Posted 2019-01-06 19:20

Question:

I have a bash script that is being used in a CGI. The CGI sets the $QUERY_STRING environment variable by reading everything after the ? in the URL. For example, http://example.com?a=123&b=456&c=ok sets QUERY_STRING=a=123&b=456&c=ok.

Somewhere I found the following ugliness:

b=$(echo "$QUERY_STRING" | sed -n 's/^.*b=\([^&]*\).*$/\1/p' | sed "s/%20/ /g")

which will set $b to whatever was found in $QUERY_STRING for b. However, my script has grown to have over ten input parameters. Is there an easier way to automatically convert the parameters in $QUERY_STRING into environment variables usable by bash?

Maybe I'll just use a for loop of some sort, but it'd be even better if the script was smart enough to automatically detect each parameter and maybe build an array that looks something like this:

${parm[a]}=123
${parm[b]}=456
${parm[c]}=ok

How could I write code to do that?

Answer 1:

Try this:

saveIFS=$IFS
IFS='=&'
parm=($QUERY_STRING)
IFS=$saveIFS

Now you have this:

parm[0]=a
parm[1]=123
parm[2]=b
parm[3]=456
parm[4]=c
parm[5]=ok

In Bash 4, which has associative arrays, you can do this (using the array created above):

declare -A array
for ((i=0; i<${#parm[@]}; i+=2))
do
    array[${parm[i]}]=${parm[i+1]}
done

which will give you this:

array[a]=123
array[b]=456
array[c]=ok
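
The split and the pairing loop above can be folded into one function. This is a minimal sketch, assuming Bash 4+ and that no name or value contains a literal `=` or `&`; `parse_pairs` is a hypothetical name, and the `set -f` guard is an addition so that a value such as `*` is not expanded as a filename pattern.

```bash
#!/usr/bin/env bash
# Sketch: split QUERY_STRING on '=' and '&', then pair the fields up.
# Assumes Bash 4+ and that no name or value contains a literal '=' or '&'.
QUERY_STRING='a=123&b=456&c=ok'   # sample input; a CGI server would set this

parse_pairs() {
    local IFS='=&'          # local, so the caller's IFS is left untouched
    local -a parm
    local i
    set -f                  # keep characters like * from being glob-expanded
    parm=($1)
    set +f                  # note: this re-enables globbing unconditionally
    for ((i = 0; i < ${#parm[@]}; i += 2)); do
        array[${parm[i]}]=${parm[i+1]}
    done
}

declare -A array
parse_pairs "$QUERY_STRING"
echo "${array[b]}"    # prints 456
```

Making `IFS` local to the function removes the need for the save and restore steps.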

Edit:

To use indirection in Bash 2 and later (using the parm array created above):

for ((i=0; i<${#parm[@]}; i+=2))
do
    declare var_${parm[i]}=${parm[i+1]}
done

Then you will have:

var_a=123
var_b=456
var_c=ok

You can access these directly:

echo $var_a

or indirectly:

for p in a b c
do
    name="var_$p"
    echo ${!name}
done

If possible, it's better to avoid indirection since it can make code messy and be a source of bugs.
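
One source of such bugs is a parameter name that is not a valid identifier. The sketch below is one possible guard, not part of the original answer: `set_param` is a hypothetical helper that checks the name first and then assigns with `printf -v` (available in Bash 3.1 and later, as far as I know).

```bash
#!/usr/bin/env bash
# Sketch: create var_<name> variables only for names that are plain identifiers.
# set_param is a hypothetical helper, not part of the answer above.
set_param() {
    local name=$1 value=$2
    # Reject anything that is not letters, digits or underscores.
    [[ $name =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]] || return 1
    printf -v "var_$name" '%s' "$value"
}

set_param a 123
set_param 'b;rm' 456 || echo "rejected: b;rm"

name=var_a
echo "${!name}"    # prints 123
```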



Answer 2:

You can break $QUERY down using IFS, for example by setting it to &:

$ QUERY="a=123&b=456&c=ok"
$ echo $QUERY
a=123&b=456&c=ok
$ IFS="&"
$ set -- $QUERY
$ echo $1
a=123
$ echo $2
b=456
$ echo $3
c=ok

$ array=($@)

$ for i in "${array[@]}"; do IFS="=" ; set -- $i; echo $1 $2; done
a 123
b 456
c ok

And you can save to a hash/dictionary in Bash 4+:

$ declare -A hash
$ for i in "${array[@]}"; do IFS="=" ; set -- $i; hash[$1]=$2; done
$ echo ${hash["b"]}
456
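
The session above leaves IFS changed for the rest of the shell. A variant that limits the change to a single command might look like the following sketch. It assumes Bash 4+ and a single-line query string; the names `pairs`, `pair`, `key` and `value` are only illustrative.

```bash
#!/usr/bin/env bash
# Sketch: same split as above, but IFS is changed only for the read command.
# Assumes Bash 4+ (associative arrays) and a single-line query string.
QUERY='a=123&b=456&c=ok'

declare -A hash
IFS='&' read -r -a pairs <<< "$QUERY"
for pair in "${pairs[@]}"; do
    key=${pair%%=*}      # text before the first '='
    value=${pair#*=}     # text after the first '=' (the whole pair if there is no '=')
    hash[$key]=$value
done

echo "${hash[b]}"    # prints 456
```

Splitting each pair with parameter expansion also keeps any further `=` characters inside the value.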


Answer 3:

Please don't use the evil eval junk.

Here's how you can reliably parse the string and get an associative array:

declare -A param   
while IFS='=' read -r -d '&' key value && [[ -n "$key" ]]; do
    param["$key"]=$value
done <<<"${QUERY_STRING}&"

If you don't like the key check, you could do this instead:

declare -A param   
while IFS='=' read -r -d '&' key value; do
    param["$key"]=$value
done <<<"${QUERY_STRING:+"${QUERY_STRING}&"}"

Listing all the keys and values from the array:

for key in "${!param[@]}"; do
    echo "$key: ${param[$key]}"
done
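
The loop above stores keys and values exactly as they arrive, so `%20` and `+` are not decoded. A rough sketch of adding percent-decoding follows. `urldecode` is a hypothetical helper built on a common idiom; it assumes well-formed `%XX` escapes and input without backslashes, since `printf '%b'` would interpret those as well.

```bash
#!/usr/bin/env bash
# Sketch: percent-decode keys and values while filling the array.
# urldecode is a hypothetical helper; it assumes well-formed %XX escapes
# and no backslashes in the input.
urldecode() {
    local s=${1//+/ }            # '+' encodes a space in form data
    printf '%b' "${s//%/\\x}"    # turn %21 into \x21 and let printf expand it
}

QUERY_STRING='name=John+Doe&msg=hi%21'   # sample input

declare -A param
while IFS='=' read -r -d '&' key value && [[ -n $key ]]; do
    param["$(urldecode "$key")"]=$(urldecode "$value")
done <<< "${QUERY_STRING}&"

echo "${param[name]}"    # prints John Doe
echo "${param[msg]}"     # prints hi!
```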


Answer 4:

To convert the contents of QUERY_STRING into bash variables, use the following command:

eval $(echo ${QUERY_STRING//&/;})

The inner step, echo ${QUERY_STRING//&/;}, substitutes all ampersands with semicolons producing a=123;b=456;c=ok which the eval then evaluates into the current shell.

The result can then be used as bash variables.

echo $a
echo $b
echo $c

The assumptions are:

  • values will never contain '&'
  • values will never contain ';'
  • QUERY_STRING will never contain malicious code
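
These assumptions can be checked before the eval runs. The sketch below is one way to do that, not part of the original answer: `query_is_tame` is a hypothetical helper, and the set of characters allowed in values (letters, digits, `_`, `.`, `-`) is an assumption to adjust for your inputs.

```bash
#!/usr/bin/env bash
# Sketch: refuse to eval unless the whole string is a list of tame name=value pairs.
# The allowed value characters (letters, digits, _ . -) are an assumption.
query_is_tame() {
    local pair='[A-Za-z_][A-Za-z0-9_]*=[A-Za-z0-9_.-]*'
    local re="^${pair}(&${pair})*\$"
    [[ $1 =~ $re ]]
}

QUERY_STRING='a=123&b=456&c=ok'   # sample input

if query_is_tame "$QUERY_STRING"; then
    eval "${QUERY_STRING//&/;}"
    echo "$a $b $c"    # prints 123 456 ok
else
    echo "refusing to eval query string" >&2
fi
```

Even with this check, a request can still overwrite any shell variable whose name it supplies, such as PATH or IFS, so a list of accepted names is probably still worth adding.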


Answer 5:

I packaged the sed command up into another script:

$ cat getvar.sh

s='s/^.*'${1}'=\([^&]*\).*$/\1/p'
echo "$QUERY_STRING" | sed -n "$s" | sed "s/%20/ /g"

and I call it from my main cgi as:

id=`./getvar.sh id`
ds=`./getvar.sh ds`
dt=`./getvar.sh dt`

...and so on; you get the idea.

This works for me even on a very basic BusyBox appliance (my PVR in this case).
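
The same extraction can also live in a shell function, which avoids starting a second script for every parameter. This is a sketch under the assumption that parameter names are plain words with no regex metacharacters. It uses only POSIX sh features, and it prepends `&` to the input so that a name such as `id` cannot match the tail of `pid`.

```bash
#!/bin/sh
# Sketch: the sed extraction as a function, so no second script is needed.
# A leading '&' is prepended so that "id" cannot match inside "pid".
# Assumes the parameter name contains no regex metacharacters.
getvar() {
    printf '&%s\n' "$QUERY_STRING" |
        sed -n "s/^.*&$1=\([^&]*\).*\$/\1/p" |
        sed 's/%20/ /g'
}

QUERY_STRING='id=42&pid=7&ds=hello%20world'   # sample input
id=$(getvar id)
ds=$(getvar ds)
echo "$id"    # prints 42
echo "$ds"    # prints hello world
```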



Answer 6:

A nice way to handle CGI query strings is to use Haserl, which acts as a wrapper around your Bash CGI script and offers convenient and secure query string parsing.



Answer 7:

I would simply replace each & with ;. The query string then becomes something like:

a=123;b=456;c=ok

Now you just need to evaluate it and read your variables:

eval `echo "${QUERY_STRING}"|tr '&' ';'`
echo $a
echo $b
echo $c


Answer 8:

Building on the accepted answer, I made some changes to support array variables, as in this other question. I also added a decode function; I could not find its author to give credit.

The code looks somewhat messy, but it works. Changes and other recommendations would be greatly appreciated.

function cgi_decodevar() {
    [ $# -ne 1 ] && return
    local v t h
    # replace all + with whitespace and append %%
    t="${1//+/ }%%"
    while [ ${#t} -gt 0 -a "${t}" != "%" ]; do
        v="${v}${t%%\%*}" # digest up to the first %
        t="${t#*%}"       # remove digested part
        # decode if there is anything to decode and if not at end of string
        if [ ${#t} -gt 0 -a "${t}" != "%" ]; then
            h=${t:0:2} # save first two chars
            t="${t:2}" # remove these
            v="${v}"`echo -e \\\\x${h}` # convert hex to special char
        fi
    done
    # return decoded string
    echo "${v}"
    return
}

saveIFS=$IFS
IFS='=&'
VARS=($QUERY_STRING)
IFS=$saveIFS

for ((i=0; i<${#VARS[@]}; i+=2))
do
  curr="$(cgi_decodevar ${VARS[i]})"
  next="$(cgi_decodevar ${VARS[i+2]})"
  prev="$(cgi_decodevar ${VARS[i-2]})"
  value="$(cgi_decodevar ${VARS[i+1]})"

  array=${curr%"[]"}

  if  [ "$curr" == "$next" ] && [ "$curr" != "$prev" ] ;then
      j=0
      declare var_${array}[$j]="$value"
  elif [ $i -gt 1 ] && [ "$curr" == "$prev" ]; then
    j=$((j + 1))
    declare var_${array}[$j]="$value"
  else
    declare var_$curr="$value"
  fi
done


Answer 9:

To bring this up to date, if you have a recent Bash version then you can achieve this with regular expressions:

q="$QUERY_STRING"
re1='^(\w+=\w+)&?'
re2='^(\w+)=(\w+)$'
declare -A params
while [[ $q =~ $re1 ]]; do
  q=${q##*${BASH_REMATCH[0]}}       
  [[ ${BASH_REMATCH[1]} =~ $re2 ]] && params+=([${BASH_REMATCH[1]}]=${BASH_REMATCH[2]})
done

If you don't want to use associative arrays then just change the penultimate line to do what you want. For each iteration of the loop the parameter is in ${BASH_REMATCH[1]} and its value is in ${BASH_REMATCH[2]}.

Here is the same thing as a function, in a short test script that iterates over the array and outputs the query string's parameters and their values:

#!/bin/bash
QUERY_STRING='foo=hello&bar=there&baz=freddy'

get_query_string() {
  local q="$QUERY_STRING"
  local re1='^(\w+=\w+)&?'
  local re2='^(\w+)=(\w+)$'
  while [[ $q =~ $re1 ]]; do
    q=${q##*${BASH_REMATCH[0]}}
    [[ ${BASH_REMATCH[1]} =~ $re2 ]] && eval "$1+=([${BASH_REMATCH[1]}]=${BASH_REMATCH[2]})"
  done
}

declare -A params
get_query_string params

for k in "${!params[@]}"
do
  v="${params[$k]}"
  echo "$k : $v"
done          

Note that the parameters may come out in a different order from the query string (in my test, reversed). The array is associative, so its iteration order is not defined and should not matter.
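
If the eval in the function is a concern, a nameref can fill the caller's array instead. The following sketch assumes Bash 4.3+ (for `local -n`). It swaps `\w` for the POSIX class `[[:alnum:]_]`, and the local name `out` is arbitrary, so the caller's array must not also be called `out`.

```bash
#!/usr/bin/env bash
# Sketch: the same regex loop, writing through a nameref instead of eval.
# Assumes Bash 4.3+ (local -n); the caller's array must not be named "out".
QUERY_STRING='foo=hello&bar=there&baz=freddy'

get_query_string() {
    local -n out=$1                 # out becomes an alias for the caller's array
    local q=$QUERY_STRING
    local re='^([[:alnum:]_]+)=([[:alnum:]_]+)&?'
    while [[ $q =~ $re ]]; do
        out[${BASH_REMATCH[1]}]=${BASH_REMATCH[2]}
        q=${q:${#BASH_REMATCH[0]}}  # drop exactly the matched prefix
    done
}

declare -A params
get_query_string params
echo "${params[bar]}"    # prints there
```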



Answer 10:

Why not this:

    $ echo "${QUERY_STRING}"
    name=carlo&last=lanza&city=pfungen-CH
    $ saveIFS=$IFS
    $ IFS='&'
    $ eval $QUERY_STRING
    $ IFS=$saveIFS

Now you have this:

    name = carlo
    last = lanza
    city = pfungen-CH

    $ echo "name is ${name}"
    name is carlo
    $ echo "last is ${last}"
    last is lanza
    $ echo "city is ${city}"
    city is pfungen-CH


Answer 11:

@giacecco

To allow a hyphen in the values, you could change two lines in the answer from @starfry as follows.

Change these two lines:

  local re1='^(\w+=\w+)&?'
  local re2='^(\w+)=(\w+)$'

To these two lines:

  local re1='^(\w+=(\w+|-|)+)&?'
  local re2='^(\w+)=((\w+|-|)+)$'
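
A bracket expression is another way to accept hyphens. The sketch below is an alternative to the patterns above rather than the poster's method. It assumes hyphens should be allowed only in values, and it uses the POSIX class `[[:alnum:]]` instead of `\w`.

```bash
#!/usr/bin/env bash
# Sketch: a bracket expression that accepts hyphens in values.
# The '-' is placed last inside the brackets so it is taken literally.
re='^([[:alnum:]_]+)=([[:alnum:]_-]+)$'

pair='city=pfungen-CH'
if [[ $pair =~ $re ]]; then
    echo "${BASH_REMATCH[1]} : ${BASH_REMATCH[2]}"    # prints city : pfungen-CH
fi
```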


Answer 12:

For all those who couldn't get it working with the posted answers (like me), this guy figured it out.

Can't upvote his post unfortunately...

Let me repost the code here real quick:

#!/bin/bash
# read -n is a bash extension, so this script needs bash rather than plain sh

if [ "$REQUEST_METHOD" = "POST" ]; then
  if [ "$CONTENT_LENGTH" -gt 0 ]; then
      read -r -n "$CONTENT_LENGTH" POST_DATA <&0
  fi
fi

#echo "$POST_DATA" > data.bin
IFS='=&'
set -- $POST_DATA

#2- Value1
#4- Value2
#6- Value3
#8- Value4

# The header must be the first thing a CGI script outputs
echo "Content-type: text/html"
echo ""
echo "<html><head><title>Saved</title></head><body>"
echo "$2 $4 $6 $8"
echo "Data received: $POST_DATA"
echo "</body></html>"

Hope this is of help for anybody.
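
A script that has to serve both GET and POST requests could pick its input as in the following sketch. `read_request_data` and `REQUEST_DATA` are hypothetical names. The sketch assumes Bash 4.1+ (for `read -N`) and a URL-encoded ASCII body, so that CONTENT_LENGTH in bytes equals the number of characters.

```bash
#!/usr/bin/env bash
# Sketch: take the request data from stdin for POST, from QUERY_STRING otherwise.
# read_request_data and REQUEST_DATA are hypothetical names.
# Assumes Bash 4.1+ (read -N) and a URL-encoded ASCII body.
read_request_data() {
    REQUEST_DATA=
    if [[ $REQUEST_METHOD == POST && ${CONTENT_LENGTH:-0} -gt 0 ]]; then
        # -N reads exactly CONTENT_LENGTH characters and does not stop at newlines
        IFS= read -r -N "$CONTENT_LENGTH" REQUEST_DATA
    else
        REQUEST_DATA=$QUERY_STRING
    fi
}

# Simulated POST request for demonstration
REQUEST_METHOD=POST CONTENT_LENGTH=7
read_request_data <<< 'a=1&b=2'
echo "$REQUEST_DATA"    # prints a=1&b=2
```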

Cheers



Answer 13:

While the accepted answer is probably the most elegant one, there may be cases where security is critical and also needs to be clearly visible in your script.

In such a case I wouldn't use bash for the task at all, but if it has to be done for some reason, it may be better to avoid the newer array and dictionary features, because you can't be sure exactly how they are escaped.

In this case, the good old primitive solutions might work:

QS="${QUERY_STRING}"
while [ "${QS}" != "" ]
do
  nameval="${QS%%&*}"
  QS="${QS#"$nameval"}"
  QS="${QS#&}"
  name="${nameval%%=*}"
  val="${nameval#"$name"}"
  val="${val#=}"

  # and here we have $name and $val as names and values

  # ...

done

This iterates over the name-value pairs of QUERY_STRING, and there is no way to subvert it with a tricky escape sequence: double quotes are a very strong thing in bash, and apart from a single variable substitution, which is fully under our control, nothing can be tampered with.

Furthermore, you can put your own processing code in place of "# ...". This lets you accept only your own well-defined (and, ideally, short) list of allowed variable names. Needless to say, LD_PRELOAD shouldn't be one of them. ;-)

Finally, no variable is exported, and only QS, nameval, name and val are used.
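
As an illustration of the whitelist idea, the loop could be filled in as in the sketch below. The accepted names (`id`, `ds`, `dt`) are examples only, and the sample QUERY_STRING is made up for the demonstration.

```bash
#!/usr/bin/env bash
# Sketch: the loop above with a whitelist in place of "# ...".
# The accepted names (id, ds, dt) are examples only.
QUERY_STRING='id=42&LD_PRELOAD=evil.so&ds=2019-01-06'   # sample input

QS=$QUERY_STRING
while [ -n "$QS" ]; do
    nameval=${QS%%&*}
    QS=${QS#"$nameval"}      # quoted, so the pair is removed literally
    QS=${QS#&}
    name=${nameval%%=*}
    val=${nameval#"$name"}
    val=${val#=}

    case $name in
        id) id=$val ;;
        ds) ds=$val ;;
        dt) dt=$val ;;
        *)  ;;               # anything else, LD_PRELOAD included, is ignored
    esac
done

echo "$id $ds"    # prints 42 2019-01-06
```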



Tags: bash cgi