Best way to aggregate multiple log files from several servers

Published 2020-01-30 03:44

Question:


I need a simple way to monitor multiple text log files distributed over a number of HP-UX servers. They are a mix of text and XML log files from several distributed legacy systems. Currently we just ssh to the servers and use tail -f and grep, but that doesn't scale when you have many logs to keep track of.

Since the logs are in different formats and are just files in folders (automatically rotated when they reach a certain size), I need to both collect them remotely and parse each one differently.

My initial thought was to make a simple daemon process that I can run on each server using a custom file reader for each file type to parse it into a common format that can be exported over the network via a socket. Another viewer program running locally will connect to these sockets and show the parsed logs in some simple tabbed GUI or aggregated to a console.
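For illustration, a minimal sketch of that daemon idea in shell, assuming GNU-style tail -F and a netcat that supports a persistent listener via -l -k; the path and port are hypothetical, and a real implementation would plug in a per-format parser instead of the sed prefix:

# tail one log, prefix each line with host and file name,
# and serve the stream on a TCP port for a remote viewer
tail -F /var/log/myapp/app.log \
  | sed "s/^/$(hostname) app.log /" \
  | nc -lk 5140

A viewer on another machine can then read the stream with: nc <server> 5140.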

What log format should I try to convert to if I am to implement it this way?

Is there some other, easier way? Should I attempt to translate the log files to the log4j format to use with Chainsaw, or are there better log viewers that can connect to remote sockets? Could I use BareTail, as suggested in another log question? This is not a massively distributed system, and changing the current logging implementations for all applications to use UDP broadcast or to put messages on a JMS queue is not an option.

Answer 1:

Options:

  1. Use a SocketAppender to send all logs to one server directly. (This could severely hamper performance and add a single point of failure.)
  2. Use scripts to aggregate the data. I use scp, ssh, and authentication keys to allow my scripts to get data from all servers without any login prompts. (A minimal sketch follows this list.)
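For option 2, a minimal sketch of such a collection script; the hostnames and paths are hypothetical, and key-based ssh auth is assumed so no password prompts appear:

#!/bin/bash
# pull each server's logs into a local per-host directory
for host in app1.example.com app2.example.com; do
    mkdir -p "collected/$host"
    scp -q "$host:/var/log/myapp/*.log" "collected/$host/"
done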


Answer 2:

Probably the lightest-weight solution for real-time log watching is to use Dancer's shell in concurrent mode with tail -f:

dsh -Mac -- tail -f /var/log/apache/*.log
  • The -a is for all machine names that you've defined in ~/.dsh/machines.list (see the example after this list)
  • The -c is for concurrent running of tail
  • The -M prepends the hostname to every line of output.
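For reference, ~/.dsh/machines.list is just a list of hostnames, one per line; these names are hypothetical:

box1.example.com
box2.example.com
box3.example.com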


Answer 3:

We use a simple shell script like the one below. You'd obviously have to tweak it somewhat to tell it about the different file names and to decide which box to look for which file on, but you get the basic idea. In our case we are tailing a file at the same location on multiple boxes. This requires ssh authentication via stored keys instead of typed passwords.

#!/bin/bash
# tail the same file on every box; key-based ssh auth avoids password prompts
FILE=$1
for box in box1.foo.com box2.foo.com box3.foo.com box4.foo.com; do
    ssh "$box" tail -f "$FILE" &
done
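Invoked as, for example (the log path is hypothetical):

./multitails.sh /var/log/myapp/app.log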

Regarding Mike Funk's comment about not being able to kill the tailing with ^C: I store the above in a file called multitails.sh and appended the following to the end of it. This generates a kill_multitails.sh file, which you run when you're done tailing; it then deletes itself.

# generate a companion script that kills off all the tails;
# run kill_multitails.sh when you're finished

echo '#!/bin/sh' > kill_multitails.sh
chmod 755 kill_multitails.sh
# list the matching tail/ssh processes (minus this grep itself)
# and rewrite each line's PID into a kill command
ps -awx | grep "$FILE" | grep -v grep > kill_multitails_ids
perl -pi -e 's/^\s*(\d+).*/kill -9 $1/' kill_multitails_ids
cat kill_multitails_ids >> kill_multitails.sh
echo "echo 'running ps for it'" >> kill_multitails.sh
echo "ps -awx | grep '$FILE'" >> kill_multitails.sh
echo "rm kill_multitails.sh" >> kill_multitails.sh
rm kill_multitails_ids


# keep multitails.sh alive until the background tails exit
wait
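As an alternative sketch, the same cleanup can be done with a trap instead of a generated kill script, so that ^C ends everything; the hostnames here are hypothetical:

#!/bin/bash
# multitails with trap-based cleanup: ^C kills the local ssh clients,
# which in turn drops the remote tail sessions
FILE=$1
pids=()
for box in box1.foo.com box2.foo.com; do
    ssh "$box" tail -f "$FILE" &
    pids+=("$!")
done
trap 'kill "${pids[@]}" 2>/dev/null' INT TERM
wait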


Answer 4:

Logscape - like Splunk without the price tag



Answer 5:

multitail, or chip:

"chip is a local and remote log parsing and monitoring tool for system admins and developers.
It wraps the features of swatch, tee, tail, grep, ccze, and mail into one, with some extras"

E.g.

chip -f -m0='RUN ' -s0='red' -m1='.*' -s1 user1@remote_ip1:'/var/log/log1 /var/log/log2 /var/log/log3' user2@remote_ip2:'/var/log/log1 /var/log/log2 /var/log/log3' | egrep "RUN |==> /"

This will highlight in red the occurrences of the -m0 pattern, pre-filtering the 'RUN |==> /' pattern from all the log files.



Answer 6:

I wrote vsConsole for exactly this purpose - easy access to log files - and then added app monitoring and version tracking. I'd like to know what you think of it. http://vs-console.appspot.com/



Answer 7:

Awstats provides a perl script that can merge several apache log files together. This script scales well since the memory footprint is very low; log files are never loaded into memory. I know that is not exactly what you need, but perhaps you can start from this script and adapt it for your needs.
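The script in question is presumably logresolvemerge.pl from awstats' tools directory; typical usage looks like this, with hypothetical file names:

# merge several access logs into one chronologically ordered stream
perl logresolvemerge.pl access_node1.log access_node2.log > merged.log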



Answer 8:

You can use the various receivers available with Chainsaw (VFSLogFilePatternReceiver to tail files over ssh, SocketReceiver, UDPReceiver, CustomSQLDBReceiver, etc.), then aggregate the logs into a single tab by changing the default tab identifier, or create a 'custom expression logpanel' by providing an expression that matches the events in the various source tabs.



Answer 9:

gltail - real-time visualization of server traffic, events, and statistics from multiple servers, using Ruby, SSH, and OpenGL



Answer 10:

XpoLog for Java