Tool to extract Java stack traces from log files

Posted 2019-01-16 20:12

Question:

Is there any tool that can extract a list of stack traces appearing in the log file and probably count unique ones?

EDIT: I would prefer something that is not GUI-based, can run in the background, and gives some kind of report back. I have quite a lot of logs gathered from several environments and would just like to get a quick overview.

Answer 1:

Here is a quick-and-dirty grep expression. If you are using a logger such as log4j, then the first line of the exception will generally contain WARN or ERROR, the next line will contain the exception name and optionally a message, and the subsequent stack trace lines will begin with one of the following:

  1. "\tat" (tab + at)
  2. "Caused by: "
  3. "\t... <some number> more" (these are the lines that indicate the number of frames in the stack not shown in a "Caused by" exception)
  4. An Exception name (and perhaps message) before the stack

We want to get all of the above lines, so the grep expression is:

grep -P "(WARN|ERROR|^\tat |Exception|^Caused by: |\t\.\.\. \d+ more)"

It assumes that an exception class name always contains the word Exception, which may or may not be true, but this is quick-and-dirty after all.

Adjust as necessary for your specific case.
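
If grep -P is not available, the same filter can be sketched in Python. This is only an illustration under the same assumptions as the grep expression; the name filter_trace_lines is made up for this example, and the dots in "... N more" are escaped so they match literal dots only.

```python
import re

# Same alternatives as the grep expression; \t is interpreted by the
# regex engine as a tab character.
TRACE_LINE = re.compile(
    r"(WARN|ERROR|^\tat |Exception|^Caused by: |\t\.\.\. \d+ more)"
)

def filter_trace_lines(lines):
    """Yield only the lines that look like part of a logged exception."""
    for line in lines:
        if TRACE_LINE.search(line):
            yield line
```

It can be fed any iterable of lines, for example sys.stdout.writelines(filter_trace_lines(sys.stdin)) to use it as a pipe filter.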



Answer 2:

You can write this yourself pretty easily. Here is the pattern:

  1. Open file
  2. Search for the string "\n\tat " (that is: newline, tab, "at", space). This string is pretty uncommon outside of stack traces.

Now all you need to do is find the first line that doesn't start with \t to find the end of the stack trace. You may want to skip 1-3 lines after that to catch chained exceptions.

Also include a few lines (say 10 or 50) before the first line of the stack trace to get some context.
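
A minimal Python sketch of that approach might look like the following. The names extract_traces and context_lines are invented for this example. Note one deviation: instead of skipping 1-3 lines to catch chained exceptions, it simply treats lines starting with "Caused by: " as part of the trace, which is an assumption that may not fit every log format.

```python
def extract_traces(text, context_lines=0):
    """Return the stack-trace blocks found in text as a list of strings.

    A block starts one line before the first "tab + at" line (normally the
    exception name and message), optionally preceded by context_lines extra
    lines, and ends before the first line that starts with neither a tab
    nor "Caused by: ".
    """
    lines = text.split("\n")
    traces = []
    i = 0
    while i < len(lines):
        if not lines[i].startswith("\tat "):
            i += 1
            continue
        # Include the exception header plus any requested leading context.
        start = max(0, i - 1 - context_lines)
        end = i
        while end < len(lines) and (
            lines[end].startswith("\t") or lines[end].startswith("Caused by: ")
        ):
            end += 1
        traces.append("\n".join(lines[start:end]))
        i = end
    return traces
```

Counting unique traces is then a matter of collections.Counter(extract_traces(text)).most_common(), although traces that embed timestamps or IDs in the header line would still be counted separately.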



Answer 3:

I wrote a tool in Python. It manages to split two stack traces even if they come right after each other in the log.

#!/usr/bin/env python
#
# Extracts exceptions from log files.
#

import sys
import re
from collections import defaultdict

REGEX = re.compile(r"^(\tat |Caused by: |\t\.\.\. \d+ more)")
# Usually, all inner lines of a stack trace will be "at" or "Caused by" lines.
# With one exception: the line following a "nested exception is" line does not
# follow that convention. Due to that, this line is handled separately.
CONT = re.compile("; nested exception is: *$")

exceptions = defaultdict(int)

def registerException(exc):
  exceptions[exc] += 1

def processFile(fileName):
  with open(fileName, "r") as fh:
    currentMatch = None   # stack trace being accumulated, or None
    lastLine = None       # previous line, i.e. the candidate exception header
    addNextLine = False   # True right after a "nested exception is:" line
    for line in fh:
      if addNextLine and currentMatch is not None:
        addNextLine = False
        currentMatch += line
        continue
      match = REGEX.search(line) is not None
      if match and currentMatch is not None:
        currentMatch += line
      elif match:
        # Start a new trace; the previous line holds the exception name.
        # lastLine is None if the trace starts on the file's first line.
        currentMatch = (lastLine or "") + line
      else:
        if currentMatch is not None:
          registerException(currentMatch)
        currentMatch = None
      lastLine = line
      addNextLine = CONT.search(line) is not None
    # If last line in file was a stack trace
    if currentMatch is not None:
      registerException(currentMatch)

for f in sys.argv[1:]:
  processFile(f)

for item in sorted(exceptions.items(), key=lambda e: e[1], reverse=True):
  print("%d : %s" % (item[1], item[0]))


Answer 4:

I have come up with the following Groovy script. It is, of course, very much adjusted to my needs, but I hope it helps someone.

def traceMap = [:]

// Number of lines to keep in buffer
def BUFFER_SIZE = 100

// Pattern for stack trace line
def TRACE_LINE_PATTERN = '^[\\s\\t]+at .*$'

// Log line pattern between which we try to capture full trace
def LOG_LINE_PATTERN = '^([<#][^/]|\\d\\d).*$'

// List of patterns to strip from the final captured stack trace
// (e.g. dates and transaction IDs that would make otherwise identical traces look different)
def REPLACE_PATTERNS = [
  '^\\d+-\\d+\\@.*?tksId: [^\\]]+\\]',
  '^<\\w+ \\d+, \\d+ [^>]*?> <[^>]*?> <[^>]*?> <[^>]*?> <',
  '^####<[^>]+?> <[^>]*?> <[^>]*?> <[^>]*?> <[^>]*?> <[^>]*?> <[^>]*?> <[^>]*?> <[^>]*?> <[^>]*?> <[^>]*?> <',
  '<([\\w:]+)?TransaktionsID>[^<]+?</([\\w:]+)?TransaktionsID>',
  '<([\\w:]+)?TransaktionsTid>[^<]+?</([\\w:]+)?TransaktionsTid>'
]

new File('.').eachFile { File file ->
  if (file.name.contains('.log') || file.name.contains('.out')) {
    def bufferLines = []
    file.withReader { Reader reader ->
      while (reader.ready()) {      
        def String line = reader.readLine()
        if (line.matches(TRACE_LINE_PATTERN)) {
          def trace = []
          for(def i = bufferLines.size() - 1; i >= 0; i--) {
            if (!bufferLines[i].matches(LOG_LINE_PATTERN)) {
              trace.add(0, bufferLines[i])
            } else {
              trace.add(0, bufferLines[i])
              break
            }
          }
          trace.add(line)
          if (reader.ready()) {
            line = reader.readLine()
            while (!line.matches(LOG_LINE_PATTERN)) {
              trace.add(line)
              if (reader.ready()) {
                line = reader.readLine()
              } else {
                break;
              }
            }
          }
          def traceString = trace.join("\n")
          REPLACE_PATTERNS.each { pattern ->
            traceString = traceString.replaceAll(pattern, '')
          }
          if (traceMap.containsKey(traceString)) {
            traceMap.put(traceString, traceMap.get(traceString) + 1)
          } else {
            traceMap.put(traceString, 1)
          }
        }
        // Keep the buffer of last lines.
        bufferLines.add(line)
        if (bufferLines.size() > BUFFER_SIZE) {
          bufferLines.remove(0)
        }
      }
    }
  }
}

traceMap = traceMap.sort { it.value }

traceMap.reverseEach { trace, number ->
  println "-- Occurred $number times -----------------------------------------"
  println trace
}
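
The part of this script that is most likely to be reusable is the normalization step: stripping volatile text before counting, so that traces differing only in a timestamp or ID end up in the same bucket. A rough Python sketch of just that idea follows. The two patterns are hypothetical examples (a leading "yyyy-MM-dd HH:mm:ss,SSS" timestamp and a UUID), not the ones from the script above; real patterns depend entirely on your log layout.

```python
import re
from collections import Counter

# Hypothetical volatile fragments; replace with patterns for your own logs.
VOLATILE = [
    # Leading timestamp such as "2019-01-16 20:12:01,123 "
    re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(,\d{3})? ?", re.MULTILINE),
    # UUID-style identifiers anywhere in the text
    re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b"),
]

def normalize(trace):
    """Remove the volatile fragments from one captured trace."""
    for pattern in VOLATILE:
        trace = pattern.sub("", trace)
    return trace

def count_traces(traces):
    """Return (normalized trace, count) pairs, most frequent first."""
    return Counter(normalize(t) for t in traces).most_common()
```

The input is any list of already captured trace strings; how they are captured is a separate concern.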


Answer 5:

Here's some nice code that does the same: http://www.techiedelight.com/java-program-search-exceptions-huge-log-file-on-server/

It basically reads the log file line by line and searches for the keyword “Exception” in each line. Once found, it prints the next 10 lines (the exception trace) to a separate output file.
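
The linked program is in Java; the same idea can be sketched in a few lines of Python. This is not the linked code, just an illustration of the description above. The function name copy_exceptions and its parameters are made up here, and it assumes that the matching line itself should be written along with the lines that follow it.

```python
from itertools import islice

def copy_exceptions(in_path, out_path, keyword="Exception", following=10):
    """Copy each line containing keyword, plus the next `following` lines,
    from in_path to out_path. Returns the number of matching lines found."""
    hits = 0
    with open(in_path, "r", errors="replace") as src, open(out_path, "w") as dst:
        for line in src:
            if keyword in line:
                hits += 1
                dst.write(line)
                # islice pulls the following lines from the same file
                # iterator, so they are not scanned for the keyword again.
                dst.writelines(islice(src, following))
    return hits
```

A fixed window of 10 lines will cut long traces short and pull in unrelated lines after short ones, so this is best for a first rough look.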



Answer 6:

I use Baretail.