I need a shell script that can parse incoming date values and print them in a standard format.
Incoming date patterns are:
"yyyyMMdd HHmmss"
"yyyyMMdd_HHmmss"
"MMddyyyy:HHmmss"
"MMddyyyyHHmmssmillisecond"
20170426 102300
20170426_102300
04262017:102300
0426201710230066
Output date pattern:
yyyyMMdd_HHmmss
20170426_102300
Any idea how to achieve this result in bash? I tried a couple of regexes to get it, but they didn't help.
Any help is appreciated.
Pipe the input to sed:
sed -re 's/([0-9]{8}) ([0-9]{6})/\1_\2/' -e 's/([0-9]{4})([0-9]{4}):?([0-9]{6}).*/\2\1_\3/'
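As a quick check, the four sample values from the question can be piped through that command. This is only a sketch: the wrapper name `normalize_dates` is made up for illustration, and `-E` is used in place of `-r` on the assumption that your sed accepts it (GNU and BSD sed both do).

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the sed command above; reads dates on stdin.
normalize_dates() {
  sed -E -e 's/([0-9]{8}) ([0-9]{6})/\1_\2/' \
         -e 's/([0-9]{4})([0-9]{4}):?([0-9]{6}).*/\2\1_\3/'
}

# The four sample inputs from the question.
printf '%s\n' '20170426 102300' '20170426_102300' \
              '04262017:102300' '0426201710230066' | normalize_dates
```

Each of the four lines should come out as 20170426_102300. Note that the second expression treats any run of 14 or more digits as MMddyyyy-first, so a year-first value without a separator would be swapped incorrectly.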
Is perl acceptable?
while (<>) {
# yyyyMMdd HHmmss or yyyyMMdd_HHmmss
if ($_ =~ m/(\d{8})[ _](\d{6})/) {
print "$1_$2\n";
# MMddyyyy:HHmmss
} elsif ($_ =~ m/(\d{4})(\d{4}):(\d{6})/) {
print "$2$1_$3\n";
# MMddyyyyHHmmss followed by milliseconds
} elsif ($_ =~ m/(\d{4})(\d{4})(\d{6})\d*/) {
print "$2$1_$3\n";
}
}
Haven't tested it, though...
You could also use it like this:
~$ cat data | perl -e 'while (<>) {
if ($_ =~ m/(\d{8})[ _](\d{6})/) {
print "$1_$2\n";
} elsif ($_ =~ m/(\d{4})(\d{4}):(\d{6})/) {
print "$2$1_$3\n";
} elsif ($_ =~ m/(\d{4})(\d{4})(\d{6})\d*/) {
print "$2$1_$3\n";
}
}'
For your array this may be acceptable:
~$ perl -e 'for (@ARGV) {
if ($_ =~ m/(\d{8})[ _](\d{6})/) {
print "$1_$2\n";
} elsif ($_ =~ m/(\d{4})(\d{4}):(\d{6})/) {
print "$2$1_$3\n";
} elsif ($_ =~ m/(\d{4})(\d{4})(\d{6})\d*/) {
print "$2$1_$3\n";
} else { print "$_ does not fit\n"; }
}' "${testdata[@]}"
If you don't have perl in your production environment, you probably want to settle for a sed solution.
I suggest the one from Walter A:
for t in "${testdata[@]}"; do
echo "$t" | sed -re 's/([0-9]{4})([0-9]{4})([0-9:])/\2\1\3/; s/[ _:]//;s/(.{8})(.{6}).*/\1_\2/'
done
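If starting one sed process per array element is a concern, the same branching can be sketched in bash alone with `[[ =~ ]]` and `BASH_REMATCH`. The function name `normalize_date` and the contents of `testdata` are assumptions made for this example (the array is filled with the four samples from the question).

```shell
#!/usr/bin/env bash
# Sketch: normalize one date string using only bash regex matching.
normalize_date() {
  local in=$1
  # Patterns are kept in variables so the space needs no escaping.
  local re_ymd='^([0-9]{8})[ _]([0-9]{6})$'
  local re_mdy='^([0-9]{4})([0-9]{4}):?([0-9]{6})[0-9]*$'
  if [[ $in =~ $re_ymd ]]; then
    # Already year-first: just force the underscore separator.
    printf '%s_%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
  elif [[ $in =~ $re_mdy ]]; then
    # MMdd then yyyy: move the year to the front, drop milliseconds.
    printf '%s%s_%s\n' "${BASH_REMATCH[2]}" "${BASH_REMATCH[1]}" "${BASH_REMATCH[3]}"
  else
    printf '%s does not fit\n' "$in" >&2
    return 1
  fi
}

# Assumed test data: the four samples from the question.
testdata=('20170426 102300' '20170426_102300' '04262017:102300' '0426201710230066')
for t in "${testdata[@]}"; do
  normalize_date "$t"
done
```

Unlike the sed pipeline, this version rejects input that matches none of the patterns and returns a non-zero status, which may or may not be what you want.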
For fun, here is a solution using awk (gensub requires GNU awk):
awk 'NF==2{print $1"_"$2} $1~"_"{print $1} $1~":"{print gensub(/([0-9]{4})([0-9]{4}):([0-9]{6})/, "\\2\\1_\\3", "g", $1)} length($1)==16{print gensub(/([0-9]{4})([0-9]{4})([0-9]{6}).*/, "\\2\\1_\\3", "g", $1)}'
Pretty much the same as the perl and sed examples. Testing and regex replacing.
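Since `gensub` is a GNU awk extension, here is a rough variant that relies only on `index()` and `substr()`, which other awk implementations also provide. The helper name `normalize_awk` is hypothetical, and the sketch assumes every input line is one of the four formats from the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper: same conversion without gensub(); reads stdin.
normalize_awk() {
  awk '{
    if (index($0, " ") || index($0, "_")) {
      # yyyyMMdd HHmmss or yyyyMMdd_HHmmss: date is already year-first
      d = substr($0, 1, 8); t = substr($0, 10, 6)
    } else if (index($0, ":")) {
      # MMddyyyy:HHmmss: move the year to the front
      d = substr($0, 5, 4) substr($0, 1, 4); t = substr($0, 10, 6)
    } else {
      # MMddyyyyHHmmss plus milliseconds: no separator to skip
      d = substr($0, 5, 4) substr($0, 1, 4); t = substr($0, 9, 6)
    }
    print d "_" t
  }'
}

printf '%s\n' '20170426 102300' '04262017:102300' '0426201710230066' | normalize_awk
```

Because it slices by position instead of matching digits, it does no validation at all; malformed lines produce malformed output.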
First, move the year to the front so that all dates are in the format yyyyMMdd:
sed -r 's/([0-9]{4})([0-9]{4})([0-9:])/\2\1\3/'
Next, remove the optional separator character between day and hour:
sed -r 's/([0-9]{4})([0-9]{4})([0-9:])/\2\1\3/; s/[ _:]//'
Finally, change yyyyMMddHHmmss?? into the desired format:
sed -r 's/([0-9]{4})([0-9]{4})([0-9:])/\2\1\3/; s/[ _:]//;s/(.{8})(.{6}).*/\1_\2/'
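To see what each stage does, the three substitutions can be run one at a time and the intermediate values printed. The helper names below are invented for this sketch, and `-E` is assumed to be accepted by your sed as an equivalent of `-r`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, one per stage of the explanation above.
swap_year() { sed -E 's/([0-9]{4})([0-9]{4})([0-9:])/\2\1\3/'; }  # MMddyyyy -> yyyyMMdd
drop_sep()  { sed -E 's/[ _:]//'; }                               # remove first separator
to_target() { sed -E 's/(.{8})(.{6}).*/\1_\2/'; }                 # insert _, cut milliseconds

for sample in '04262017:102300' '0426201710230066'; do
  s1=$(printf '%s\n' "$sample" | swap_year)
  s2=$(printf '%s\n' "$s1" | drop_sep)
  s3=$(printf '%s\n' "$s2" | to_target)
  printf '%s -> %s -> %s -> %s\n' "$sample" "$s1" "$s2" "$s3"
done
```

For the colon sample this prints 04262017:102300 -> 20170426:102300 -> 20170426102300 -> 20170426_102300. The first stage leaves year-first inputs alone because their eight date digits are followed by a space or underscore, which `[0-9:]` does not match.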
EDIT:
I first tried to keep the milliseconds in the output, but they were not needed:
# INCORRECT SOLUTION
# sed -r 's/([0-9]{4})([0-9]{4})([0-9:])/\2\1\3/; s/[ _:]//; s/$/00/' | cut -c1-16