Different awk results on Linux and mingw64 with CR

Posted 2019-04-15 08:45

Question:

On Linux:

echo -n $'boo\r\nboo\r\n' | awk $'BEGIN { RS="\\n" } {gsub("boo","foo"); print}' | cat -v

returns the expected

foo^M
foo^M

However, on mingw64 (git bash for windows) the same command returns:

foo
foo

without the carriage returns.

I tried setting the record separator explicitly, in case the default differed between the two platforms, but awk on mingw64 still swallows the carriage returns. How can I make awk on mingw64 behave the same way as on Linux? Note that the awk versions are slightly different (GNU Awk 4.0.2 on Linux and GNU Awk 4.2.1 on mingw64), but I wouldn't expect this to matter unless there is some kind of bug.

Note that the carriage returns are being lost in awk specifically, since on mingw64 this:

echo -n $'boo\r\nboo\r\n' | cat -v

returns the expected:

boo^M
boo^M

Answer 1:

After searching for a while, I found this question, and this answer to it says:

it's something done by the C libraries and to stop it happening you should set the awk BINMODE variable to 3

I changed your code to:

echo -n $'boo\r\nboo\r\n' | awk -v BINMODE=3 $'BEGIN { RS="\\n" } {gsub("boo","foo"); print}' | cat -v

I tried it on Unix, Linux, macOS, and Windows; all of them produce this output:

foo^M
foo^M

So -v BINMODE=3 is what you are looking for.
Note that only the -v BINMODE=3 form, given as a switch before the program text, works.
Usually a variable can be passed to awk in three ways: with the -v switch, in a BEGIN block, or as an assignment after the program text and before the file names.
In this case I tried all three, and only -v BINMODE=3 worked.
My guess is that it has something to do with awk's compiling process.

Example (under cygwin on Windows):

$ echo -n $'boo\r\nboo\r\n' | awk -v BINMODE=3 '1' | cat -v
boo^M
boo^M

$ echo -n $'boo\r\nboo\r\n' | awk 'BEGIN{BINMODE=3}1' | cat -v
boo
boo

$ echo -n $'boo\r\nboo\r\n' | awk '1' BINMODE=3 | cat -v
boo
boo

On the other platforms mentioned, all three forms produce:

boo^M
boo^M
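
If the -v form has to be repeated in many scripts, a small shell wrapper keeps it in one place. This is only a minimal sketch and not part of the original answer: the function names awk_bin and count_cr are made up for illustration, and it assumes a POSIX-style shell with awk, tr, and wc available. Counting the carriage-return bytes is an alternative to eyeballing the cat -v output.

```shell
# Hypothetical helper: run awk with BINMODE=3 given via -v on the command line,
# the only placement reported to work in the tests above.
awk_bin() {
    awk -v BINMODE=3 "$@"
}

# Hypothetical helper: count the carriage-return bytes on standard input.
# The trailing tr strips the padding some wc implementations add.
count_cr() {
    tr -cd '\r' | wc -c | tr -d ' '
}

# Two CRLF-terminated records go in; count how many CRs come out of awk.
printf 'boo\r\nboo\r\n' | awk_bin '{ gsub("boo", "foo"); print }' | count_cr
```

If the carriage returns survive, the last command prints 2; a result of 0 would mean awk is still stripping them in that environment.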