awk only works on copied data, why?

Posted 2019-07-28 08:22

Question:

I have a somewhat straightforward awk used for the purpose described here:

Append multiple header information fields to file until next header found

The awk only works on the data after I copy/paste it into a new file. Redirecting the output of head into a new file, as below, does not help; the awk still fails on that file.

`head -40 file.csv > output.csv`

This is the awk:

`awk -F, '/"Serial No."/ {sn = $2}
     /"Location:"/  {loc = $2}
     /"([0-9]{1,2}\/){2}[0-9]{4} [0-9]{2}:[0-9]{2}"/ {$0 = loc FS sn FS $0}
     1' file.csv > master1.csv`

If I copy/paste the data and compare it to the original, diff reports every single line as different but does not show where within the line. A diff between the head output and the copy/pasted file gives:

`diff trap4_top.csv trap4_again.csv`

 1,25c1,24
 < "Serial No.","0700000036022821"
 <
 < "Location:","LS_trap_2c"
 < "High temperature limit (�C)",20
 < "Low temperature limit (�C)",0
 < "Date - Time","Temperature (�C)"
 < "5/28/2015 08:00",24.0
 < "5/28/2015 10:00",29.5
 < "5/28/2015 12:00",28.0
 < "5/28/2015 14:00",28.5
 < "5/28/2015 16:00",27.0
 < "5/28/2015 18:00",24.5
 < "5/28/2015 20:00",23.0
 < "5/28/2015 22:00",22.5
 < "5/29/2015 00:00",21.5
 < "5/29/2015 02:00",21.0
 < "5/29/2015 04:00",20.0
 < "5/29/2015 06:00",20.0
 < "5/29/2015 08:00",24.5
 < "5/29/2015 10:00",26.0
 < "5/29/2015 12:00",27.5
 < "5/29/2015 14:00",30.0
 < "5/29/2015 16:00",29.0
 < "5/29/2015 18:00",25.5
 < "5/29/2015 20:00",23.5
 < "5/29/2015 22:00",23.0
 ---
 > "Serial No.","0700000036022821"
 > "Location:","LS_trap_2c"
 > "High temperature limit (°C)",20
 > "Low temperature limit (°C)",0
 > "Date - Time","Temperature (°C)"
 > "5/28/2015 08:00",24.0
 > "5/28/2015 10:00",29.5
 > "5/28/2015 12:00",28.0
 > "5/28/2015 14:00",28.5
 > "5/28/2015 16:00",27.0
 > "5/28/2015 18:00",24.5
 > "5/28/2015 20:00",23.0
 > "5/28/2015 22:00",22.5
 > "5/29/2015 00:00",21.5
 > "5/29/2015 02:00",21.0
 > "5/29/2015 04:00",20.0
 > "5/29/2015 06:00",20.0
 > "5/29/2015 08:00",24.5
 > "5/29/2015 10:00",26.0
 > "5/29/2015 12:00",27.5
 > "5/29/2015 14:00",30.0
 > "5/29/2015 16:00",29.0
 > "5/29/2015 18:00",25.5
 > "5/29/2015 20:00",23.5

I see special characters in the diff, but I'm not sure whether they are involved or how to remove them other than by copy/paste.

`head trap4.csv | cat -vte`

gives:

"Serial No.","0700000036022821"^M$
"Location:","LS_trap_2c"^M$
"High temperature limit (M-0C)",20^M$
"Low temperature limit (M-0C)",0^M$
"Date - Time","Temperature (M-0C)"^M$
"5/28/2015 08:00",24.0^M$
"5/28/2015 10:00",29.5^M$
"5/28/2015 12:00",28.0^M$
"5/28/2015 14:00",28.5^M$
"5/28/2015 16:00",27.0^M$
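In `cat -v` notation, `^M` is a carriage return (byte 0x0D) and `M-0` is the single byte 0xB0, which is the degree sign in Latin-1. A quick way to check how many lines end in a carriage return is sketched below. The sample file is built with `printf` as a hypothetical stand-in for trap4.csv, not taken from the real data.

```shell
# Build a small sample with CRLF line endings (hypothetical stand-in for trap4.csv).
sample=$(mktemp)
printf '"Serial No.","0700000036022821"\r\n"Location:","LS_trap_2c"\r\n' > "$sample"

# Hold a literal carriage return in a variable so the pattern is portable.
cr=$(printf '\r')

# Count lines that end in a carriage return; 0 would mean Unix line endings.
crlf_lines=$(grep -c "${cr}\$" "$sample")
echo "$crlf_lines"

rm -f "$sample"
```

A non-zero count on the real file would point to DOS line endings.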

Answer 1:

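Alright, so as I suspected, your input file has DOS line endings, i.e. each line ends in \r\n, and the \r shows up as ^M in the cat output above. awk only splits records on \n, so the \r stays attached to the last field; that means loc and sn each carry a trailing carriage return, which then lands in the middle of the rebuilt line.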
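To see the effect on a field, here is a small sketch using a made-up one-line sample rather than the real file. It prints the length of `$2` for a line with a DOS line ending.

```shell
# Hypothetical one-line sample with a DOS line ending.
field_len=$(printf '"Location:","LS_trap_2c"\r\n' | awk -F, '{ print length($2) }')

# "LS_trap_2c" including its quotes is 12 characters; the attached
# carriage return makes awk report 13.
echo "$field_len"
```

The extra character is invisible on a terminal, which is why the data looks fine until it is compared byte by byte.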

You should convert your input file to unix line endings by running:

dos2unix file.csv

Otherwise, strip the carriage returns in the pipeline (the \r escape here assumes GNU sed):

head -40 file.csv | sed 's/\r$//' | awk ...
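If neither dos2unix nor GNU sed is available, the carriage return can also be removed inside awk itself. The following is only a sketch under a few assumptions: the sample file is hypothetical, and the date pattern is written with `+` instead of the `{1,2}` intervals from the question, since some awk versions do not support interval expressions.

```shell
# Hypothetical sample in the same shape as the logger export, with CRLF endings.
sample=$(mktemp)
printf '"Serial No.","0700000036022821"\r\n"Location:","LS_trap_2c"\r\n"5/28/2015 08:00",24.0\r\n' > "$sample"

# sub(/\r$/, "") drops the trailing carriage return from $0 before any field
# is read; modifying $0 makes awk split the fields again.
result=$(awk -F, '
    { sub(/\r$/, "") }
    /"Serial No."/ { sn = $2 }
    /"Location:"/  { loc = $2 }
    /^"[0-9]+\/[0-9]+\/[0-9]+ [0-9]+:[0-9]+"/ { $0 = loc FS sn FS $0 }
    1' "$sample")
echo "$result"

rm -f "$sample"
```

With the carriage returns gone, the data line comes out prefixed by the location and serial number on a single line.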