This is my problem. In bash 3:
$ test='One "This is two" Three'
$ set -- $test
$ echo $2
"This
How do I get bash to understand the quotes and return $2 as This is two rather than "This? Unfortunately, I cannot alter the construction of the variable called test in this example.
This happens because of the order in which the shell parses the command line: it parses (and removes) quotes and escapes first, then substitutes variable values. By the time $test is replaced with One "This is two" Three, it is too late for the quotes to have their intended effect. The expanded text only undergoes word splitting and filename expansion, so the quote characters are treated as ordinary data.
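To see what the shell actually does with the unquoted expansion, a small sketch like the following may help. It only counts and prints the resulting words; the array name words is an illustrative choice, not something from the question.

```shell
#!/usr/bin/env bash
# Quotes stored inside a variable are plain data: an unquoted expansion
# gets word splitting (and globbing), not quote parsing.
test='One "This is two" Three'

set -- $test          # deliberately unquoted, as in the question
words=("$@")          # 'words' is an illustrative name

echo "${#words[@]}"               # number of words after splitting
printf '<%s>\n' "${words[@]}"     # each word, quote characters included
```

With the default IFS this prints 5, followed by <One>, <"This>, <is>, <two"> and <Three>, one per line.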
The simple (but dangerous) way to do this is to add another level of parsing with eval:
$ test='One "This is two" Three'
$ eval "set -- $test"
$ echo "$2"
This is two
(Note that the quotes in the echo command are not necessary here, but they are good general practice.)
The reason I say this is dangerous is that eval doesn't just go back and reparse for quoted strings; it reparses everything, including things you may not want interpreted, such as command substitutions. Suppose you had set
$ test='One `rm /some/important/file` Three'
...eval will actually run the rm command. So if you can't count on the contents of $test to be "safe", do not use this construct.
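The risk can be demonstrated without touching any files. The sketch below swaps the rm for a harmless echo inside a command substitution (the string injected and the variable second are made up for illustration) and shows that eval executes it:

```shell
#!/usr/bin/env bash
# Harmless stand-in for the rm example: the embedded command substitution
# only echoes a marker string.
test='One "$(echo injected)" Three'

eval "set -- $test"   # eval reparses everything, so $(...) is executed
second=$2             # 'second' is an illustrative name

printf '%s\n' "$second"
```

This prints injected rather than the literal text $(echo injected), which shows that whatever command is embedded in $test gets run.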
BTW, the right way to do this sort of thing is with an array:
$ test=(One "This is two" Three)
$ set -- "${test[@]}"
$ echo "$2"
This is two
Unfortunately, this requires control of how the variable is created.
With bash 4 and later, it's possible to do something like this using readarray and xargs:
#!/bin/bash
function qs_parse() {
    readarray -t "$1" < <( printf "%s" "$2" | xargs -n 1 printf "%s\n" )
}
tab=$'\t' # a literal tab character
qs_parse test "One 'This is two' Three -n 'foo${tab}bar'"
printf "%s\n" "${test[0]}"
printf "%s\n" "${test[1]}"
printf "%s\n" "${test[2]}"
printf "%s\n" "${test[3]}"
printf "%s\n" "${test[4]}"
Outputs, as expected:
One
This is two
Three
-n
foo bar # tabulation saved
I have not verified it, but the same thing is probably possible in older bash versions, like this:
function qs_parse() {
    local i=0
    while IFS='' read -r line || [[ -n "$line" ]]; do
        parsed_str[i]="${line}"
        let i++
    done < <( printf "%s\n" "$1" | xargs -n 1 printf "%s\n" )
}
tab=$'\t' # a literal tab character
qs_parse "One 'This is two' Three -n 'foo${tab}bar'"
printf "%s\n" "${parsed_str[0]}"
printf "%s\n" "${parsed_str[1]}"
printf "%s\n" "${parsed_str[2]}"
printf "%s\n" "${parsed_str[3]}"
printf "%s\n" "${parsed_str[4]}"
One solution to this problem is to use xargs, which avoids eval.
It keeps double-quoted strings together:
$ test='One "This is two" Three'
$ IFS=$'\n' arr=( $(xargs -n1 <<<"$test") )
$ printf '<%s>\n' "${arr[@]}"
<One>
<This is two>
<Three>
Of course, you can set the positional arguments with that array:
$ set -- "${arr[@]}"
$ echo "$2"
This is two
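One caveat worth illustrating: xargs has its own quoting rules, so input containing an unmatched quote (for example an apostrophe) makes it fail instead of producing fields, and the one-liner above also leaves IFS changed for the rest of the shell session. A minimal sketch that checks the xargs exit status and leaves the global IFS alone might look like this. The function name split_quoted and the array name fields are hypothetical, and fields are assumed not to contain newlines.

```shell
#!/usr/bin/env bash
# split_quoted STRING
# Splits STRING using xargs quoting rules and stores the result in the
# global array 'fields'. Returns non-zero if xargs rejects the input.
# Assumes no field contains a newline.
split_quoted() {
    local out line
    fields=()
    # Capture the output first so that a failing xargs can be detected.
    out=$(printf '%s' "$1" | xargs -n 1 printf '%s\n' 2>/dev/null) || return 1
    [[ -n $out ]] || return 0
    while IFS= read -r line; do
        fields+=("$line")
    done <<< "$out"
}

if split_quoted 'One "This is two" Three'; then
    printf '<%s>\n' "${fields[@]}"
fi

if ! split_quoted "it's broken"; then
    echo "could not parse input" >&2
fi
```

On success the first call prints <One>, <This is two> and <Three>; the second call returns non-zero because of the unmatched apostrophe, so the caller can decide how to handle such input.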
I wrote a couple of native bash functions to do this: https://github.com/mblais/bash_ParseFields
You can use the ParseFields function like this:
$ str='field1 field\ 2 "field 3"'
$ ParseFields -d "$str" a b c d
$ printf "|%s|\n|%s|\n|%s|\n|%s|\n" "$a" "$b" "$c" "$d"
|field1|
|field 2|
|field 3|
||
The -d option to ParseFields removes any surrounding quotes and interprets backslashes from the parsed fields.
There is also a simpler ParseField function (used by ParseFields) that parses a single field at a specific offset within a string.
Note that these functions cannot parse a stream, only a string. The IFS variable can also be used to specify field delimiters besides whitespace.
If you need unescaped apostrophes to be allowed in unquoted fields, that would require a minor change; let me know.