I would like to capture output from a UNIX process but limit the maximum file size and/or rotate to a new file.
I have seen logrotate, but it does not work in real time. As I understand it, it is a "clean-up" job that runs in parallel.
What is the right solution? I guess I will write a tiny script to do it, but I was hoping there was a simple way using existing text tools.
Imagine:
my_program | tee --max-bytes 100000 log/my_program_log
This would always write the latest output to:
log/my_program_log
Then, as it fills up, the file would be renamed to log/my_program_log000001 and a new log/my_program_log would be started.
Use split:
my_program | tee >(split -d -b 100000 -)
Or if you don't want to see the output, you can directly pipe to split:
my_program | split -d -b 100000 -
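As for the log rotation, there's no tool in coreutils that does it automatically. You could create a symlink (called current_log here) and periodically repoint it at the newest file using a bash loop. With -d, split names its output files x00, x01, and so on, and ln expects the target first and the link name second:
while :; do ln -fns "$(ls -t x[0-9]* | head -1)" current_log; sleep 1; done
Or use awk: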
program | awk 'BEGIN{max=100} {n+=length($0); print $0 > ("log." int(n/max))}'
It keeps lines together, so the maximum is not exact (length($0) also leaves out the newline), but that is usually what you want for logs. You can use awk's sprintf to format the file name.
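For comparison, here is a rough Python sketch of the same line-preserving split. It is not part of the original answer: the function name `split_lines`, its parameters and the zero-padded suffix are my own choices for illustration. Unlike the awk one-liner it counts the newline in each line's size and keeps only one output file open at a time.

```python
import sys


def split_lines(stream, prefix="log.", max_bytes=100, width=3):
    """Write whole lines from `stream` to prefix000, prefix001, ... by running size."""
    written = 0      # bytes seen so far, newline included
    current = None   # index of the file that is currently open
    out = None
    names = []       # output files in the order they were created
    try:
        for line in stream:
            written += len(line.encode("utf-8"))
            index = written // max_bytes   # same rule as int(n/max) in the awk version
            if index != current:
                if out is not None:
                    out.close()
                name = f"{prefix}{index:0{width}d}"
                out = open(name, "a", encoding="utf-8")
                names.append(name)
                current = index
            out.write(line)                # a line is never split across two files
    finally:
        if out is not None:
            out.close()
    return names


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: my_program | python split_lines.py log.
    split_lines(sys.stdin, prefix=sys.argv[1])
```

Because the file index is derived from the running total, a single very long line can skip an index, just as it can in the awk version.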
Here's a pipeable script, using awk:
#!/bin/bash
maxb=$((1024*1024))  # default 1 MiB
out="log"            # output file name prefix
width=3              # suffix width: log.000, log.001, ...
while getopts "b:o:w:" opt; do
    case $opt in
        b ) maxb=$OPTARG;;
        o ) out="$OPTARG";;
        w ) width=$OPTARG;;
        * ) echo "Unimplemented option." >&2; exit 1;;
    esac
done
shift $((OPTIND-1))
if [ $# -ge 1 ]; then  # read from file
    cat -- "$1"
else                   # read from pipe
    # IFS= and -r keep leading whitespace and backslashes intact
    while IFS= read -r line; do
        printf '%s\n' "$line"
    done
fi | awk -v b="$maxb" -v o="$out" -v w="$width" '{
    n += length($0)
    print $0 > (sprintf("%s.%0.*d", o, w, int(n/b)))}'
Save this to a file called 'bee', run 'chmod +x bee', put it somewhere on your PATH, and you can use it as
program | bee
or to split an existing file as
bee -b1000 -o proglog -w8 file
To limit the size to 100 bytes, you can simply use dd:
my_program | dd bs=1 count=100 > log
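When 100 bytes have been written, dd exits and closes its end of the pipe. The next write by my_program then fails with EPIPE, and unless it handles or ignores SIGPIPE, that signal terminates it.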
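If you would rather not rely on dd, the same byte cap can be sketched in Python. This is only an illustration under my own naming (`copy_limited`, `chunk_size` and the script name are hypothetical): it copies at most `limit` bytes and then returns, so when the script exits the upstream writer sees the closed pipe exactly as it does with dd.

```python
import sys


def copy_limited(src, dst, limit, chunk_size=4096):
    """Copy at most `limit` bytes from `src` to `dst`; return the number copied."""
    copied = 0
    while copied < limit:
        chunk = src.read(min(chunk_size, limit - copied))
        if not chunk:   # input ended before the limit was reached
            break
        dst.write(chunk)
        copied += len(chunk)
    return copied


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: my_program | python cap.py 100 > log
    copy_limited(sys.stdin.buffer, sys.stdout.buffer, int(sys.argv[1]))
```

Reading in chunks avoids the one-byte-per-read overhead of bs=1 while still stopping at the exact limit.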
The apache2-utils package includes a utility called rotatelogs, which fully meets your requirements.
Synopsis:
rotatelogs [ -l ] [ -L linkname ] [ -p program ] [ -f ] [ -t ] [ -v ] [ -e ] [ -c ] [ -n number-of-files ] logfile rotationtime|filesize(B|K|M|G) [ offset ]
Example:
your_program | rotatelogs -n 5 /var/log/logfile 1M
The full manual is available in the Apache HTTP Server documentation.
The most straightforward way to solve this is probably to use Python and the logging module, which was designed for this purpose. Create a script that reads from stdin, writes to stdout, and implements the log rotation described below.
The logging module provides the class
logging.handlers.RotatingFileHandler(filename, mode='a', maxBytes=0, backupCount=0, encoding=None, delay=0)
which does exactly what you are asking about.
You can use the maxBytes and backupCount values to allow the file to rollover at a predetermined size.
From docs.python.org
Sometimes you want to let a log file grow to a certain size, then open a new file and log to that. You may want to keep a certain number of these files, and when that many files have been created, rotate the files so that the number of files and the size of the files both remain bounded. For this usage pattern, the logging package provides a RotatingFileHandler:
import glob
import logging
import logging.handlers
LOG_FILENAME = 'logging_rotatingfile_example.out'
# Set up a specific logger with our desired output level
my_logger = logging.getLogger('MyLogger')
my_logger.setLevel(logging.DEBUG)
# Add the log message handler to the logger
handler = logging.handlers.RotatingFileHandler(
LOG_FILENAME, maxBytes=20, backupCount=5)
my_logger.addHandler(handler)
# Log some messages
for i in range(20):
    my_logger.debug('i = %d' % i)
# See what files are created
logfiles = glob.glob('%s*' % LOG_FILENAME)
for filename in logfiles:
    print(filename)
The result should be 6 separate files, each with part of the log history for the application:
logging_rotatingfile_example.out
logging_rotatingfile_example.out.1
logging_rotatingfile_example.out.2
logging_rotatingfile_example.out.3
logging_rotatingfile_example.out.4
logging_rotatingfile_example.out.5
The most current file is always logging_rotatingfile_example.out, and each time it reaches the size limit it is renamed with the suffix .1. Each of the existing backup files is renamed to increment the suffix (.1 becomes .2, etc.) and the .6 file is erased.
Obviously this example sets the log length much much too small as an extreme example. You would want to set maxBytes to an appropriate value.
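To connect this back to the question, here is a minimal sketch of the stdin filter mentioned at the start of this answer. It is my own illustration, not code from the Python docs: the names `make_rotating_logger` and `pump`, the script name and the default sizes are assumptions. It logs each input line unchanged (no timestamp or level prefix) through a RotatingFileHandler and can echo it to stdout like tee.

```python
import logging
import logging.handlers
import sys


def make_rotating_logger(path, max_bytes=100_000, backup_count=5):
    """Return a logger that writes raw lines to `path` and rotates by size."""
    logger = logging.getLogger("stdin_rotator:" + path)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep lines out of the root logger's handlers
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count)
    handler.setFormatter(logging.Formatter("%(message)s"))  # message text only
    logger.addHandler(handler)
    return logger


def pump(stream, logger, echo=None):
    """Copy each line of `stream` to `logger`, and to `echo` if one is given."""
    for line in stream:
        if echo is not None:
            echo.write(line)
        logger.info(line.rstrip("\n"))  # the handler appends its own newline


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: my_program | python rotate_stdin.py log/my_program_log
    pump(sys.stdin, make_rotating_logger(sys.argv[1]), echo=sys.stdout)
```

The newest output is always in the named file, and older output moves to the .1, .2, ... backups, which is close to the behaviour the question asks for apart from the suffix format.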
Another solution is to use the Apache rotatelogs utility, or the following script:
#!/bin/ksh
#rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]
numberOfFiles=10
while getopts "n:fltvecp:L:" opt; do
case $opt in
n) numberOfFiles="$OPTARG"
if ! printf '%s\n' "$numberOfFiles" | grep '^[0-9][0-9]*$' >/dev/null; then
printf 'Numeric numberOfFiles required %s. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$numberOfFiles" 1>&2
exit 1
elif [ $numberOfFiles -lt 3 ]; then
printf 'numberOfFiles < 3 %s. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$numberOfFiles" 1>&2
exit 1
fi
;;
*) printf '-%s ignored. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$opt" 1>&2
;;
esac
done
shift $(( $OPTIND - 1 ))
pathToLog="$1"
fileSize="$2"
if ! printf '%s\n' "$fileSize" | grep '^[0-9][0-9]*[BKMG]$' >/dev/null; then
printf 'Numeric fileSize followed by B|K|M|G required %s. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$fileSize" 1>&2
exit 1
fi
sizeQualifier=`printf "%s\n" "$fileSize" | sed "s%^[0-9][0-9]*\([BKMG]\)$%\1%"`
multip=1
case $sizeQualifier in
B) multip=1 ;;
K) multip=1024 ;;
M) multip=1048576 ;;
G) multip=1073741824 ;;
esac
fileSize=`printf "%s\n" "$fileSize" | sed "s%^\([0-9][0-9]*\)[BKMG]$%\1%"`
fileSize=$(( $fileSize * $multip ))
fileSize=$(( $fileSize / 1024 ))
if [ $fileSize -lt 10 ]; then
printf 'fileSize %sKB < 10KB. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$fileSize" 1>&2
exit 1
fi
if ! touch "$pathToLog"; then
printf 'Could not write to log file %s. rotatelogs.sh -n numberOfFiles pathToLog fileSize[B|K|M|G]\n' "$pathToLog" 1>&2
exit 1
fi
lineCnt=0
while IFS= read -r line
do
printf "%s\n" "$line" >>"$pathToLog"
lineCnt=$(( $lineCnt + 1 ))
if [ $lineCnt -gt 200 ]; then
lineCnt=0
curFileSize=`du -k "$pathToLog" | awk '{ print $1 }'`
if [ $curFileSize -gt $fileSize ]; then
DATE=`date +%Y%m%d_%H%M%S`
gzip -c "$pathToLog" >"${pathToLog}.${DATE}.gz" && : >"$pathToLog"
curNumberOfFiles=`ls "$pathToLog".[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9].gz | wc -l | sed -e 's/^[ ][ ]*//' -e 's%[ ][ ]*$%%' -e 's/[ ][ ]*/[ ]/g'`
while [ $curNumberOfFiles -ge $numberOfFiles ]; do
fileToRemove=`ls "$pathToLog".[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9].gz | head -1`
if [ -f "$fileToRemove" ]; then
rm -f "$fileToRemove"
curNumberOfFiles=`ls "$pathToLog".[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_[0-9][0-9][0-9][0-9][0-9][0-9].gz | wc -l | sed -e 's/^[ ][ ]*//' -e 's%[ ][ ]*$%%' -e 's/[ ][ ]*/[ ]/g'`
else
break
fi
done
fi
fi
done
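The rotation step of the script above (compress to a timestamped archive, empty the live log, delete the oldest archives) can also be sketched in Python. This is an illustration with my own names (`rotate`, `keep`, `stamp`), and it keeps the newest `keep` archives, which may differ by one from the count the ksh loop leaves behind.

```python
import glob
import gzip
import os
import shutil
import time


def rotate(path, keep, stamp=None):
    """Gzip `path` to `path.<stamp>.gz`, empty it, and keep the newest `keep` archives."""
    stamp = stamp or time.strftime("%Y%m%d_%H%M%S")
    archive = f"{path}.{stamp}.gz"
    with open(path, "rb") as src, gzip.open(archive, "wb") as dst:
        shutil.copyfileobj(src, dst)
    open(path, "wb").close()  # truncate in place, like `cat /dev/null > file`
    # YYYYmmdd_HHMMSS stamps sort chronologically as plain strings.
    archives = sorted(glob.glob(glob.escape(path) + ".*_*.gz"))
    stale = archives[:-keep] if keep > 0 else archives
    for old in stale:
        os.remove(old)
    return archive
```

As in the shell version, lines written by another process between the compression and the truncation would be lost, so this fits best when the same process that writes the log also calls `rotate`.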