why is a double-quoted awk command substitution fa

Posted 2020-05-01 07:32

Question:

Using the C shell, the following command line

set pf = "`awk -v var=$pd '{if($1<0) print var, $2, $3}' test.txt`"

returns an error in awk:

awk: {if( <0) print var, , } syntax error. 

This is especially puzzling as the command itself works without any problem:

awk -v var=$pd '{if($1<0) print var, $2, $3}' test.txt

Is there a way that we can store all output of the single Awk command line into a single variable? What is the reason the above is failing?

Answer 1:

After some tinkering, I can only conclude that this is one of those C shell quirks. The C shell (csh or tcsh) is notorious for its peculiarities, and I believe that is exactly what is going on here. Here are some examples based on the OP's puzzle.

unquoted:

$ set a = `echo a_b c | awk '{print $1}'` ; echo $a
a_b
$ set a = `echo a_b c | awk '{print $1,2}'` ; echo $a
a_b 2
$ set a = `echo a_b c | awk '{print $1 OFS 2}'` ; echo $a
a_b 2

quoted:

$ set a = "`echo a_b c | awk '{print $1}'`" ; echo $a
a_b c
$ set a = "`echo a_b c | awk '{print $1,2}'`" ; echo $a
awk: cmd. line:1: {print ,2}
awk: cmd. line:1:        ^ syntax error
$ set a = "`echo a_b c | awk '{print $1 OFS 2}'`" ; echo $a
2

So, in the double-quoted examples, it looks like $1 is replaced by an empty string. This explains why the first case prints the full line a_b c and the third prints just the number 2. The second fails because the Awk statement print ,2 is invalid, while the first works because a bare print is equivalent to print $0 in Awk.

If you play around a bit more, you notice that the C shell is trying to do variable substitution. You do not even need set for any of the above; a plain double-quoted command substitution is enough. The following example shows just how bananas this is:
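To make this concrete, here is a minimal sketch of the three programs as Awk would see them once $1 has been removed. It is written for a POSIX sh rather than csh, with the substituted programs typed out literally; the variable names first, second and third are only for illustration, and the exact wording of the Awk error varies between implementations, so it is discarded here.

```shell
# What awk receives once the shell has replaced $1 with an empty string.
# Run under a POSIX sh; the awk programs are written out literally.
first=`echo a_b c | awk '{print}'`          # '{print $1}' without the $1
third=`echo a_b c | awk '{print  OFS 2}'`   # '{print $1 OFS 2}' without the $1
echo "first: [$first]"
echo "third: [$third]"

# '{print $1,2}' without the $1 is not a valid awk program at all.
if echo a_b c | awk '{print ,2}' 2>/dev/null; then
    second=ok
else
    second=error
fi
echo "second: $second"
```

The third result is really a space followed by 2, because OFS is still printed; the unquoted echo $a in the csh session hides the leading blank.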

$ echo $0
csh
$ echo "`echo a_b c | awk '{print $0}'`"

$ echo "`echo a_b c | awk -v csh=foo '{print $0}'`"
foo
$ echo `echo a_b c | awk -v csh=foo '{print $0}'`
a_b c

So from this, you can see that the C shell is performing variable substitution and that $0 is being replaced with the string csh. But only in the double-quoted version!

So, why is this the case? The reason is the double quotes. A double-quoted string undergoes variable substitution regardless of any quotes nested inside it. So even though the Awk program is single-quoted inside a backquoted string, the enclosing double quotes still cause variable substitution on $n. This is in contrast to Bash:

$ echo $0
bash
$ echo "`echo a_b c | awk -v csh=foo '{print $0}'`"
a_b c
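For comparison, the OP's original command shape can be tried in a POSIX shell such as Bash. This is only a sketch: the value given to pd and the two lines written to test.txt are made-up sample data, not taken from the question.

```shell
# POSIX sh / bash sketch: the text inside `...` is parsed as its own command,
# so the inner single quotes keep $1, $2 and $3 away from the shell.
pd=neg                                      # hypothetical value for the OP's $pd
printf '%s\n' '-1 x y' '2 p q' > test.txt   # hypothetical sample data
pf="`awk -v var=$pd '{if($1<0) print var, $2, $3}' test.txt`"
echo "$pf"
```

Here only the first sample line has a negative first field, so pf ends up holding neg x y, with the Awk program reaching Awk untouched.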

Lexical structure

Furthermore, all Substitutions (see below) except History substitution can be prevented by enclosing the strings (or parts of strings) in which they appear with single quotes or by quoting the crucial character(s) (e.g., $ or ` for Variable substitution or Command substitution respectively) with \. (Alias substitution is no exception: quoting in any way any character of a word for which an alias has been defined prevents substitution of the alias. The usual way of quoting an alias is to precede it with a backslash.) History substitution is prevented by backslashes but not by single quotes. Strings quoted with double or backward quotes undergo Variable substitution and Command substitution, but other substitutions are prevented.

Quoting complex strings, particularly strings which themselves contain quoting characters, can be confusing. Remember that quotes need not be used as they are in human writing! It may be easier to quote not an entire string, but only those parts of the string which need quoting, using different types of quoting to do so if appropriate.

source: man csh

So, how can this be solved? While the C shell is completely non-intuitive and a quoting nightmare, the problem can be fixed by terminating the double quotes early and switching to single quotes for a short stretch.

$ echo "`echo a_b c | awk -v csh=foo '{print "'$0,$1'"}'`"
a_b c a_b
$ echo "`echo a_b c | awk -v csh=foo '"'{print $0,$1}'"'`"
a_b c a_b

Solution: So after all this, I think we can conclude that

$ set pf = "`awk -v var=$pd '"'{if($1<0) print var, $2, $3}'"' test.txt`"

could potentially solve the problem.
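A different way around the quoting, which is not part of the answer above, is to keep the Awk program in a file and run it with awk -f, so that the command line no longer contains any $1 for the shell to expand. The sketch below is written for a POSIX sh; the file name neg.awk, the value of pd and the sample data are assumptions for illustration.

```shell
# Alternative sketch (an assumption, not the answer's method):
# store the awk program in a file so no shell ever parses its $1, $2, $3.
cat > neg.awk <<'EOF'
$1 < 0 { print var, $2, $3 }
EOF
printf '%s\n' '-1 x y' '2 p q' > test.txt   # hypothetical sample data
pd=neg                                      # hypothetical value for the OP's $pd
pf="`awk -v var=$pd -f neg.awk test.txt`"
echo "$pf"
```

Under csh the assignment would be written with set pf = "..." instead, but the backquoted part stays the same, and the only $ left in it is the intended $pd.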



Tags: awk csh tcsh