Question:
I'd like to execute a gawk script with --re-interval using a shebang. The "naive" approach of
#!/usr/bin/gawk --re-interval -f
... awk script goes here
does not work, since gawk is called with the first argument "--re-interval -f" (not split at the whitespace), which it does not understand. Is there a workaround for that?
Of course, you can wrap gawk in a shell script that splits the first argument, or write a shell script that calls gawk and keep the awk program in another file, but I was wondering whether there is a way to do this within one file.
The behaviour of shebang lines differs from system to system; at least in Cygwin, the arguments are not split at whitespace. I only care about how to do it on a system that behaves like that; the script is not meant to be portable.
Answer 1:
This seems to work for me with (g)awk.
#!/bin/sh
arbitrary_long_name==0 "exec" "/usr/bin/gawk" "--re-interval" "-f" "$0" "$@"
# The real awk program starts here
{ print $0 }
Note that the #! line runs /bin/sh, so this script is first interpreted as a shell script.
At first, I simply tried "exec" "/usr/bin/gawk" "--re-interval" "-f" "$0" "$@", but awk treated that as an always-true pattern and printed every line of input unconditionally. That is why I put in the arbitrary_long_name==0: it is supposed to be false all the time. You could replace it with some gibberish string. Basically, I was looking for a false condition in awk that would not adversely affect the shell script.
In the shell script, arbitrary_long_name==0 defines a variable called arbitrary_long_name and sets it to =0.
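The two readings of that line can be checked separately. The following is a minimal sketch, assuming a POSIX sh and any awk on the PATH; the variable names shell_view, awk_guarded and awk_bare are made up for the illustration.

```shell
#!/bin/sh
# Shell view: NAME==0 in front of a command is a temporary assignment,
# so the child process sees arbitrary_long_name set to "=0".
shell_view=$(arbitrary_long_name==0 sh -c 'printf "%s" "$arbitrary_long_name"')
echo "shell sees: arbitrary_long_name='$shell_view'"

# awk view: the same text is a pattern that compares an unset variable
# with a concatenated, non-empty string. It is never true, so nothing prints.
awk_guarded=$(printf 'a\nb\n' | awk 'arbitrary_long_name==0 "exec" "/usr/bin/gawk"')
echo "guarded pattern printed: '$awk_guarded'"

# Without the comparison, the concatenated string is a non-empty (true)
# pattern, so awk prints every input line.
awk_bare=$(printf 'a\nb\n' | awk '"exec" "/usr/bin/gawk"')
echo "bare pattern printed: '$awk_bare'"
```

Only the last case echoes the input lines back, which matches the behaviour described for the first attempt.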
Answer 2:
The shebang line has never been specified as part of POSIX, SUS, LSB or any other specification. AFAIK, it hasn't even been properly documented.
There is a rough consensus about what it does: take everything between the ! and the \n and exec it. The assumption is that everything between the ! and the \n is a full absolute path to the interpreter. There is no consensus about what happens if it contains whitespace.
1. Some operating systems simply treat the entire thing as the path. After all, in most operating systems, whitespace or dashes are legal in a path.
2. Some operating systems split at whitespace and treat the first part as the path to the interpreter and the rest as individual arguments.
3. Some operating systems split at the first whitespace and treat the front part as the path to the interpreter and the rest as a single argument (which is what you are seeing).
4. Some don't support shebang lines at all.
Thankfully, 1. and 4. seem to have died out, but 3. is pretty widespread, so you simply cannot rely on being able to pass more than one argument.
And since the location of commands is also not specified in POSIX or SUS, you generally use up that single argument by passing the executable's name to env so that it can determine the executable's location; e.g.:
#!/usr/bin/env gawk
[Obviously, this still assumes a particular path for env, but there are only very few systems where it lives in /bin, so this is generally safe. The location of env is a lot more standardized than the location of gawk or, even worse, something like python or ruby or spidermonkey.]
Which means that you cannot actually use any arguments at all.
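To see which of the behaviours listed above a particular system implements, a small probe can help. This is only a sketch: the file names probe and script are hypothetical, and it assumes the kernel accepts a script as a shebang interpreter (Linux and macOS do) and that the temporary directory allows executing files.

```shell
#!/bin/sh
# Sketch: report how this system hands shebang arguments to an interpreter.
dir=$(mktemp -d)

# A tiny stand-in interpreter that prints every argument it receives.
cat > "$dir/probe" <<'EOF'
#!/bin/sh
printf 'argc=%s\n' "$#"
for arg in "$@"; do printf '[%s]\n' "$arg"; done
EOF

# A script whose shebang passes two options to that interpreter.
printf '#!%s --re-interval -f\n' "$dir/probe" > "$dir/script"
chmod +x "$dir/probe" "$dir/script"

report=$("$dir/script") || report="could not execute the probe script"
echo "$report"
rm -rf "$dir"
```

If the options arrive as one bracketed item, the system follows behaviour 3; if they arrive as two items, it follows behaviour 2. The last item is always the path of the script itself.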
Answer 3:
I came across the same issue, with no apparent solution because of the way whitespace is handled in a shebang (at least on Linux).
However, you can pass several options in a shebang, as long as they are short options and they can be concatenated (the GNU way).
For example, you cannot have
#!/usr/bin/foo -i -f
but you can have
#!/usr/bin/foo -if
Obviously, that only works when the options have short equivalents and take no arguments.
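As an illustration with a real interpreter, the sketch below clusters two short options of /bin/sh (-e and -u) into a single shebang argument. The file name strict and the variable undefined_variable are made up for the example; it assumes /bin/sh exists and the temporary directory allows executing files.

```shell
#!/bin/sh
# Sketch: "-eu" is one shebang argument that carries two short options.
dir=$(mktemp -d)
cat > "$dir/strict" <<'EOF'
#!/bin/sh -eu
echo "before"
echo "$undefined_variable"
echo "after"
EOF
chmod +x "$dir/strict"

# With -u in effect, expanding the unset variable aborts the script,
# so "after" is never printed and the exit status is non-zero.
output=$("$dir/strict" 2>/dev/null) && status=0 || status=$?
echo "output=$output status=$status"
rm -rf "$dir"
```

Because the two options travel as one word, the result is the same whether or not the system splits shebang arguments at whitespace.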
Answer 4:
Under Cygwin and Linux, everything after the path in the shebang gets passed to the program as one argument.
It's possible to hack around this by using another awk script inside the shebang:
#!/usr/bin/gawk {system("/usr/bin/gawk --re-interval -f " FILENAME); exit}
This will execute {system("/usr/bin/gawk --re-interval -f " FILENAME); exit} in awk.
And this will execute /usr/bin/gawk --re-interval -f path/to/your/script.awk in your system's shell.
Answer 5:
#!/bin/sh
''':'
exec YourProg -some_options "$0" "$@"
'''
The above shell shebang trick is more portable than /usr/bin/env.
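The answer does not explain how the shell reads the second line of that script, so here is a small sketch of my understanding, assuming a POSIX sh; the variable names word and status are made up for the illustration.

```shell
#!/bin/sh
# ''':' is two adjacent quoted strings, '' and ':', which join into the
# single word ":".
word=''':'
echo "the shell reads that line as: $word"

# ":" is the shell's no-op builtin, so running the line does nothing
# and succeeds.
''':' && status=$?
echo "exit status: $status"
```

Since exec then replaces the shell with the target program, the shell never reaches the closing ''' line.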
Answer 6:
In the gawk manual (http://www.gnu.org/manual/gawk/gawk.html), the end of section 1.14 notes that you should only use a single argument when running gawk from a shebang line. It says that the OS will treat everything after the path to gawk as a single argument. Perhaps there is another way to specify the --re-interval option? Perhaps your script can reference your shell in the shebang line, run gawk as a command, and include the text of your script as a "here document".
Answer 7:
Why not use bash and gawk itself to skip past the shebang, read the script, and pass it as a file to a second instance of gawk [--with-whatever-number-of-params-you-need]?
#!/bin/bash
gawk --re-interval -f <(gawk 'NR>3' "$0") "$@"
exit
{
print "Program body goes here"
print $1
}
(The same could naturally also be accomplished with e.g. sed or tail, but I think there's some kind of beauty in depending only on bash and gawk itself.)
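For comparison, the sed and tail variants mentioned here could look like the sketch below. It uses a made-up sample file in place of "$0" and plain awk in place of gawk, so that it runs on any POSIX system.

```shell
#!/bin/sh
# Sketch: three equivalent ways to drop the first three lines of a file.
dir=$(mktemp -d)
printf '%s\n' 'line 1' 'line 2' 'line 3' '{ print $1 }' 'END { print NR }' > "$dir/sample"

via_awk=$(awk 'NR>3' "$dir/sample")       # keep records numbered above 3
via_tail=$(tail -n +4 "$dir/sample")      # start output at line 4
via_sed=$(sed '1,3d' "$dir/sample")       # delete lines 1 through 3

printf '%s\n' "$via_awk"
rm -rf "$dir"
```

All three variables end up holding the last two lines of the sample file, so any of the three commands could feed the process substitution.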
Answer 8:
Although not exactly portable, starting with coreutils 8.30 you can, according to its documentation, use:
#!/usr/bin/env -S command arg1 arg2 ...
So given:
$ cat test.sh
#!/usr/bin/env -S showargs here 'is another' long arg -e "this and that " too
you will get:
$ ./test.sh
$0 is '/usr/local/bin/showargs'
$1 is 'here'
$2 is 'is another'
$3 is 'long'
$4 is 'arg'
$5 is '-e'
$6 is 'this and that '
$7 is 'too'
$8 is './test.sh'
And in case you are curious, showargs is:
#!/usr/bin/env sh
echo "\$0 is '$0'"
i=1
for arg in "$@"; do
echo "\$$i is '$arg'"
i=$((i+1))
done
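Applied to the question's case, the shebang would presumably become #!/usr/bin/env -S gawk --re-interval -f. The sketch below builds such a file and runs it only when gawk is installed and the local env accepts -S; demo.awk is a hypothetical file name, and the sketch assumes env lives in /usr/bin and that the temporary directory allows executing files.

```shell
#!/bin/sh
# Sketch: a one-file gawk script whose shebang relies on env -S.
dir=$(mktemp -d)
cat > "$dir/demo.awk" <<'EOF'
#!/usr/bin/env -S gawk --re-interval -f
/^a{2,3}$/ { print "match: " $0 }
EOF
chmod +x "$dir/demo.awk"

if command -v gawk >/dev/null 2>&1 && env -S 'true' 2>/dev/null; then
    # Only "aa" consists of two or three a's, so only that line matches.
    result=$(printf 'a\naa\naaaa\n' | "$dir/demo.awk") || result="skipped: could not execute demo.awk"
else
    result="skipped: needs gawk and an env that supports -S"
fi
echo "$result"
rm -rf "$dir"
```

The interval expression a{2,3} is there to exercise --re-interval, which is the option the question wanted to pass.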
Answer 9:
Just for fun: there is the following quite weird solution that reroutes stdin and the program through file descriptors 3 and 4. You could also create a temporary file for the script.
#!/bin/bash
exec 3>&0
exec <<-EOF 4>&0
BEGIN {print "HALLO"}
{print \$1}
EOF
gawk --re-interval -f <(cat 0>&4) 0>&3
One thing is annoying about this: the shell performs variable expansion on the script, so you have to escape every $ with a backslash (as done in the second line of the awk program) and probably more than that.
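One way around the escaping, not part of the original answer, is to quote the here-document delimiter, which makes the shell leave the body untouched. A minimal sketch in POSIX sh, with made-up variable names:

```shell
#!/bin/sh
# Unquoted delimiter: the shell expands the body, so $1 must be escaped.
unquoted=$(cat <<EOF
{print \$1}
EOF
)

# Quoted delimiter: the body is taken literally and needs no escaping.
quoted=$(cat <<'EOF'
{print $1}
EOF
)

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

Both variables end up holding the same awk program text, {print $1}.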
Answer 10:
For a portable solution, use awk rather than gawk, invoke the standard Bourne shell (/bin/sh) with your shebang, and invoke awk directly, passing the program on the command line as a here document rather than via stdin:
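#!/bin/sh
# Read the awk program from a quoted here document and pass it to gawk
# as a command-line argument, so that stdin stays free for the input.
gawk --re-interval "$(cat <<'EOF'
PROGRAM HERE
EOF
)" "$@"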
Note: no -f argument to awk. That leaves stdin available for awk to read input from. Assuming you have gawk installed and on your PATH, that achieves everything I think you were trying to do with your original example (assuming you wanted the file content to be the awk script and not the input, which I think your shebang approach would have treated it as).
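A worked instance of the idea, using plain awk so it runs without gawk: the program text is collected from a here document into a variable (program is a made-up name) and passed as an argument, while the data arrives on stdin.

```shell
#!/bin/sh
# Sketch: awk program passed on the command line, input read from stdin.
program=$(cat <<'EOF'
{ sum += $1 }
END { print "sum=" sum }
EOF
)

# No -f here: the first non-option argument is the program text itself.
result=$(printf '1\n2\n3\n' | awk "$program")
echo "$result"
```

Here awk reads the three numbers from the pipe and prints sum=6, which shows that stdin stays free for input.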