I have a file with values in the format below:
datecolumn1 datecolumn2
20140202 20180113
20140202 20180113
20140202 20180113
20131202 20180113
20140331 20180113
I want to subtract the first date from the second ($2 - $1) to find the total number of days. The output should look like this:
20180113 20180115 3.
Please help with an awk command.
The following awk command may help you get the difference in days between dates in the format you posted. This code was written and tested in GNU awk.
TZ=PST awk '
FNR>1{
split($1,date,"");
split($2,date1,"");
print (mktime(date1[1] date1[2] date1[3] date1[4]" " date1[5] date1[6]" " date1[7] date1[8] " " "00 00 00")\
-mktime(date[1] date[2] date[3] date[4]" " date[5] date[6]" " date[7] date[8] " " "00 00 00"))\
/(24*60*60);
}' Input_file
Using GNU awk, mktime and FIELDWIDTHS for separating the date parts. As $4 is a space, it is abused as a space in mktime, as it is shorter to write than " " (:
$ awk '
BEGIN{ FIELDWIDTHS="4 2 2 1 4 2 2" }
{
print $0,(mktime($5 $4 $6 $4 $7 " 0 0 0")-mktime($1 $4 $2 $4 $3 " 0 0 0"))/86400
}' file
datecolumn1 datecolumn2 0
20140202 20180113 1441
20140202 20180113 1441
20140202 20180113 1441
20131202 20180113 1503
20140331 20180113 1384.04 # wut?
20180113 20180115 2
(I was not really sure if 20180113 20180115 should produce 3 as in the example or 2 as math dictates.)
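The 1384.04 above is most likely a DST artifact. Assuming the commands ran in a time zone where 20140331 falls in summer time and 20180113 does not, the two local midnights are 1384 days plus one hour apart. A minimal check of that arithmetic (the one-hour offset is an assumption about the time zone, not something stated above):

```shell
# 1384 whole days plus one extra hour, expressed in days.
# awk prints non-integer numbers with "%.6g" by default, hence 1384.04.
awk 'BEGIN { print (1384 * 86400 + 3600) / 86400 }'
```

Rounding the quotient to the nearest integer would hide the extra hour in either direction.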
$ cat tst.awk
function secs(d) { return mktime(substr(d,1,4)" "substr(d,5,2)" "substr(d,7)" 0 0 0") }
{ print $1, $2, int((secs($2) - secs($1)) / (24 * 60 * 60)) }
$ awk -f tst.awk file
datecolumn1 datecolumn2 0
20140202 20180113 1441
20140202 20180113 1441
20140202 20180113 1441
20131202 20180113 1503
20140331 20180113 1384
Note that the above is only an approximation of the difference in days, since it does not account accurately for DST. I used it because you seem happy with that approximation, but if you want an accurate result you'd need a different calculation built on strftime("%j"): the day of the year for the start date, the last day of that year minus that value (to get the number of days remaining in that year), the last day of every year between the start and end dates, and the day of the year for the end date.
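One way to sidestep DST entirely is to skip mktime and count calendar days with integer arithmetic. The sketch below is not the strftime("%j") approach described above but a swapped-in alternative. It assumes valid YYYYMMDD input and the Gregorian calendar, and the names days.awk and daynum are made up for illustration.

```shell
# Hypothetical helper script; pure arithmetic, so no mktime (and no GNU awk) is needed.
cat > days.awk <<'EOF'
# Day number of a YYYYMMDD string, counted from a fixed (arbitrary) origin.
function daynum(d,    y, m, dd) {
    y  = substr(d, 1, 4) + 0
    m  = substr(d, 5, 2) + 0
    dd = substr(d, 7, 2) + 0
    if (m < 3) { y--; m += 12 }    # treat Jan/Feb as months 13/14 of the previous year
    # 365 days per year, plus leap days, plus days in the months since March, plus day of month
    return 365 * y + int(y / 4) - int(y / 100) + int(y / 400) \
           + int((153 * (m - 3) + 2) / 5) + dd
}
NR == 1 { print; next }            # pass the header line through unchanged
{ print $1, $2, daynum($2) - daynum($1) }
EOF

printf '%s\n' 'datecolumn1 datecolumn2' '20140331 20180113' '20180113 20180115' |
    awk -f days.awk
```

Because only differences of day numbers are used, the origin does not matter. For the sample rows this gives 1441, 1503 and 1384, and 2 for 20180113 20180115, regardless of the TZ setting.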