Unexpected new line when writing out in Unix Shell

Asked 2019-08-05 18:35

Hello, I am trying to output a CSV file and I keep getting part of my string written onto a new line.

The overall script reads in a CSV file which has a timestamp, converts it, appends the Epoch time to the end of each line as a new variable, and writes out the file.

#!/bin/bash 
OLDIFS=$IFS 
IFS=","
cat test.csv | while read Host AName Resource MName TimeStamp Integer_Value Epoch; 
do 

Epoch=$(date -d "$TimeStamp GMT" +%s)

if [ -z "$Epoch" ]
then
    (echo "$Host, $AName, $Resource, $MName, $TimeStamp, $Integer_Value, Epoch,";) >> target.csv

else
    (echo "$Host, $AName, $Resource, $MName, $TimeStamp, $Integer_Value, $Epoch,";) >> target.csv

fi

done

I am trying to set a header and then write out each line with the appended variable, except the appended value gets dropped onto a new line, and this only happens with the new value.

#Host, AName, Resource, MName, Actual Start Time, Integer Value
, Epoch,
ABCD89A, Admin, shop, Stall Count, 2014-01-06 09:00:00.0, 0
, 1388998800,

Instead of

#Host, AName, Resource, MName, Actual Start Time, Integer Value, Epoch,
ABCD89A, Admin, shop, Stall Count, 2014-01-06 09:00:00.0, 0, 1388998800,

When I move the order around it doesn't happen. Sorry, I know this is probably simple; I'm new to Unix scripting.

EDIT

I have now changed the code to:

#!/bin/bash 
OLDIFS=$IFS 
IFS=","
while read Host AName Resource MName TimeStamp Integer_Value Epoch
do 

Epoch=$(date -d "$TimeStamp GMT" +%s)

if [ -z "$Epoch" ]
then
    echo "$Host, $AName, $Resource, $MName, $TimeStamp, $Integer_Value, Epoch,"

else
    echo "$Host, $AName, $Resource, $MName, $TimeStamp, $Integer_Value, $Epoch,"

fi

done < test.csv  > target.csv

And I am still getting the same problem.

Also, as an additional question: does anyone know why I get ": command not found" and "date: invalid date 'Actual Start TimeStamp GMT'" when the date part runs, even though it produces the correct dates and the script runs?

Tags: linux shell unix
2 Answers
手持菜刀,她持情操
answered 2019-08-05 18:55

Try this script:

IFS=$',\r'   # split on commas, and treat \r as a separator so it never lands in a field
while read Host AName Resource MName TimeStamp Integer_Value Epoch
do
   # ignore the first line with the headers
   [[ "$Host" == \#* ]] && continue

   Epoch=$(date -d "$TimeStamp GMT" +%s)

   if [ -z "$Epoch" ]; then
     echo "$Host, $AName, $Resource, $MName, $TimeStamp, $Integer_Value, Epoch,"
   else
     echo "$Host, $AName, $Resource, $MName, $TimeStamp, $Integer_Value, $Epoch,"
   fi
done < test.csv > target.csv

It does 2 things differently:

  1. It treats \r as a field separator, so it never ends up inside a read variable. Your input almost certainly has Windows-style \r\n line endings, and it is that stray \r in the last populated field that breaks the echoed line at that point.
  2. It ignores your first line, which is the header of the input CSV file. (Feeding the header line to date is also what produced your "invalid date" error.)
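
Incidentally, if the root cause is indeed Windows-style \r\n line endings, another common fix (a minimal sketch; test_unix.csv is just an illustrative name) is to strip the carriage returns up front and process the cleaned file instead:

# remove every carriage return so read never sees a trailing \r
tr -d '\r' < test.csv > test_unix.csv

dos2unix test.csv would achieve the same thing in place, if it is installed.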
干净又极端
answered 2019-08-05 19:10

I would personally use awk; here is how:

awk  -F"," '{timestamp=$5;  gsub(":"," ",timestamp); gsub("-"," ",timestamp);   EPOCH=(mktime(timestamp)*1000)} {print $0","EPOCH}' 1.csv 

Produces:

ABCD89A, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0,1388998800000

This one-liner does all of what you need. As long as your timestamps are in that exact format, the gsub calls in the awk strip the : and - from the date so mktime can parse it into a timestamp in seconds (the *1000 turns that into milliseconds; drop it if you want plain seconds, as the later examples do), and finally each line is printed in its entirety with the converted value appended: $0","EPOCH.

Here it is expanded:

 awk -F"," '{
     timestamp=$5
     gsub(":"," ",timestamp)
     gsub("-"," ",timestamp)
     EPOCH=(mktime(timestamp)*1000)
     }
     {
      print $0","EPOCH
      }' your_File.csv
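
One caveat worth checking on your machine: mktime() is a GNU awk (gawk) extension, and it interprets its "YYYY MM DD HH MM SS" datespec in the local time zone, while the original script asked date for GMT. A quick sanity check (a sketch; TZ=GMT0 forces the conversion to match date -d '... GMT'):

TZ=GMT0 gawk 'BEGIN { print mktime("2014 01 06 09 00 00") }'
# prints 1388998800, matching date -d '2014-01-06 09:00:00 GMT' +%s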

Now, to expand this so that you read the file, parse it through awk, and then pump the output back into the same file, you could do something like this:

cp 2.csv 1.csv
cat 1.csv 
ABCD89A, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0
ABCD89N, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0
file="1.csv"; output=$(awk  -F"," '{ timestamp=$5;gsub(":"," ",timestamp);gsub("-"," ",timestamp);EPOCH=(mktime(timestamp));}{print $0", "EPOCH;}' $file 2>&1);  echo "$output" > $file
cat 1.csv 
ABCD89A, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0, 1388998800
ABCD89N, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0, 1388998800

Now, to extend this method so that you can be sure you do not re-process a file which has already had the time in seconds appended, you could run something like this:

cp 2.csv 1.csv
 cat $file
ABCD89A, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0
ABCD89N, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0
 file="1.csv"; output=$(awk  -F"," '{ if (NF==7) { print "ERROR"; next; }else{timestamp=$5;gsub(":"," ",timestamp);gsub("-"," ",timestamp);EPOCH=(mktime(timestamp));}{print $0", "EPOCH;}}' $file 2>&1); if echo "$output"|grep -q "ERROR"; then  echo "$output"; else echo "$output" > $file; fi
 file="1.csv"; output=$(awk  -F"," '{ if (NF==7) { print "ERROR"; next; }else{timestamp=$5;gsub(":"," ",timestamp);gsub("-"," ",timestamp);EPOCH=(mktime(timestamp));}{print $0", "EPOCH;}}' $file 2>&1); if echo "$output"|grep -q "ERROR"; then  echo "$output"; else echo "$output" > $file; fi
ERROR
ERROR
 cat $file
ABCD89A, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0, 1388998800
ABCD89N, Admin, shop, Stall Count, 2014-01-06 09:00:00, 0, 1388998800

You will notice that on the 2nd run it outputs ERROR and does not actually overwrite the file.

This way you could automate some script to come along and do this, and feel safe that it won't append extra columns to CSVs that have already been processed.
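
For readability, here is the same guard expanded into a commented script (a sketch with the same logic as the one-liner above; the file name is illustrative, and mktime again assumes gawk):

file="1.csv"
output=$(awk -F"," '{
    if (NF == 7) {                  # a 7th field means the Epoch column was already appended
        print "ERROR"; next
    } else {
        timestamp = $5
        gsub(":", " ", timestamp)   # 09:00:00  -> 09 00 00
        gsub("-", " ", timestamp)   # 2014-01-06 -> 2014 01 06
        EPOCH = mktime(timestamp)   # gawk extension: seconds since the epoch
    }
    print $0 ", " EPOCH
}' "$file" 2>&1)

if echo "$output" | grep -q "ERROR"; then
    echo "$output"                  # refuse to touch an already-converted file
else
    echo "$output" > "$file"
fi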

Or use a temp file for massive CSV files. (The next line is pointless; I was only testing whether I could tee back into the same file, which I found worked only on odd occasions; really bizarre.)

(awk  -F"," '{ timestamp=$5;gsub(":"," ",timestamp);gsub("-"," ",timestamp);EPOCH=(mktime(timestamp));}{print $0", "EPOCH;}' 1.csv 2>&1|tee /tmp/a; mv /tmp/a 1.csv;)

since it could have just been:

(awk  -F"," '{ timestamp=$5;gsub(":"," ",timestamp);gsub("-"," ",timestamp);EPOCH=(mktime(timestamp));}{print $0", "EPOCH;}' 1.csv >/tmp/a; mv /tmp/a 1.csv;)

The first method, using $output, stores the CSV in memory as a variable and then pushes it back into the file. The last method uses a /tmp file for processing. Which you choose could depend on the size of your CSV file: if we are talking gigabytes and it's not a very powerful machine, then temp files are the way to go. The in-memory approach is cleaner and should be the fastest of all.
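
If you do go the temp-file route, a slightly safer variant (a sketch; mktemp picks a unique scratch file so concurrent runs cannot clobber each other, and mv only runs if awk succeeded):

tmp=$(mktemp) || exit 1
awk -F"," '{
    timestamp = $5
    gsub(":", " ", timestamp)
    gsub("-", " ", timestamp)
    print $0 ", " mktime(timestamp)
}' 1.csv > "$tmp" && mv "$tmp" 1.csv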

It's just my input on this; it may come in handy for someone else wishing to do something similar.
