I have a log file, gigabytes in size, in this format:
2016-02-26 08:06:45 Blah blah blah
I have a log parser which splits the single log file into separate files by date, trimming the date from each line.
I do want some form of tee so that I can see how far along the process is.
The problem is that this method is mind-numbingly slow. Is there no way to do this quickly in bash? Or will I have to whip up a little C program to do it?
log_file=server.log
log_folder=logs
mkdir $log_folder 2> /dev/null
while read a; do
date=${a:0:10}
echo "${a:11}" | tee -a $log_folder/$date
done < <(cat $log_file)
Try this awk solution. It should be pretty fast, and it shows progress. Only one file is kept open at a time. Lines that don't start with a date are written to the current date's file, so no lines are lost, and the initial date defaults to "0000-00-00" in case the log starts with lines without dates.
Any timing comparison would be much appreciated.
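A minimal sketch of an awk script with the properties described above, not necessarily the exact script this answer had in mind. The scratch directory, the sample lines, and the awk variable names (dir, date, out) are assumptions made for illustration; the log layout (a 10-character date, a space, then the message) is taken from the question.

```shell
#!/usr/bin/env bash
# Hypothetical demo setup: a tiny sample log in a scratch directory.
workdir=$(mktemp -d)
log_file="$workdir/server.log"
log_folder="$workdir/logs"
mkdir -p "$log_folder"
printf '%s\n' \
  'orphan line before any date' \
  '2016-02-26 08:06:45 Blah blah blah' \
  '    continuation line without a date' \
  '2016-02-27 00:00:01 Next day' > "$log_file"

awk -v dir="$log_folder" '
BEGIN { date = "0000-00-00"; out = dir "/" date }   # default for undated leading lines

# A line starting with YYYY-MM-DD and a space may switch the current date.
/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] / {
    d = substr($0, 1, 10)
    if (d != date) {
        close(out)                             # keep only one output file open
        date = d
        out = dir "/" date
        print "writing " out > "/dev/stderr"   # progress: one line per new date
    }
    $0 = substr($0, 12)                        # drop the date and the space after it
}

# Every line, dated or not, is appended to the current date file.
{ print >> out }
' "$log_file"
```

Progress goes to stderr once per date rather than once per line, which keeps terminal output from becoming the bottleneck. Because the script appends with >>, running it twice on the same log duplicates the output lines.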
sample input log
output
read in bash is absurdly slow. You can make it faster, but you will probably get a bigger speed-up with awk.
If you really want to print to stdout as well, you can, but if stdout is a tty that is going to slow things down a lot. Just use:
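A minimal sketch of both variants, assuming the log layout from the question (a 10-character date, a space, then the message). The scratch directory, the sample lines, the two output folder names, and the awk variables dir and out are illustrative assumptions. Neither variant calls close(), so a log spanning a very large number of dates could hit the open-file limit in some awk implementations.

```shell
#!/usr/bin/env bash
# Hypothetical demo setup: a two-line sample log in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' \
  '2016-02-26 08:06:45 Blah blah blah' \
  '2016-02-27 09:10:11 More blah' > "$workdir/server.log"
mkdir -p "$workdir/logs_quiet" "$workdir/logs_echo"

# Variant 1: write each trimmed line to its per-date file only.
awk -v dir="$workdir/logs_quiet" '
{ out = dir "/" substr($0, 1, 10); print substr($0, 12) >> out }
' "$workdir/server.log"

# Variant 2: same, but also echo each trimmed line to stdout.
# The bare 1 is an always-true pattern whose default action prints $0.
awk -v dir="$workdir/logs_echo" '
{ out = dir "/" substr($0, 1, 10); $0 = substr($0, 12); print >> out }1
' "$workdir/server.log"
```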
(Note the "1" after the closing brace.)