Inspecting Java threads in Linux using top

Posted 2019-01-31 03:44

Question:

I am inspecting a Java process in Linux using

top -H

However, I cannot read the name of the thread in the "COMMAND" column (because it is too long). If I use 'c' to expand the full name of the process, it is still too long to fit.

How can I obtain the full name of the command?

Answer 1:

You can inspect Java threads with the jstack tool. It lists the names, stack traces, and other useful information for all threads belonging to the specified process ID.

Edit: The nid value in the jstack thread dump is the hexadecimal form of the LWP (thread ID) that top displays in the PID column for threads.
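To line the two tools up, the decimal ID from top has to be turned into the hex form that jstack prints. A minimal Python sketch of that conversion (the function name tid_to_nid is only illustrative, and 22460 is just an example thread ID):

```python
def tid_to_nid(tid):
    """Format a decimal thread ID (as shown by top -H) the way jstack prints it."""
    return "nid=0x%x" % int(tid)


if __name__ == "__main__":
    # 22460 is an example thread ID; substitute one taken from top -H.
    print(tid_to_nid(22460))
```

The resulting string can then be searched for in the jstack output, for example with grep.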



Answer 2:

This might be a little old, but here's what I did to kinda merge top and jstack together. I used two scripts, but I'm sure it all could be done in one.

First, I save the output of top with the pids for my java threads into a file and save the jstack output into another file:

#!/bin/sh
top -H -b -n 1 | grep java > /tmp/top.log
jstack -l `ps fax | grep java | grep tomcat | sed "s/ *\([0-9]*\) .*/\1/g"` > /tmp/jstack.log

Then I use a Perl script to call the shell script (called cpu-java.sh here) and kinda merge the two files (/tmp/top.log and /tmp/jstack.log):

#!/usr/bin/perl
system("sh cpu-java.sh");
open LOG, "/tmp/top.log" or die $!;
print "PID\tCPU\tMem\tJStack Info\n";
while ($l = <LOG>) {
    # Take the leading number on the top line as the thread ID (works for any user, not just root)
    ($pid) = $l =~ /^\s*(\d+)/;
    $hex_pid = sprintf("%#x", $pid);
    @values = split(/\s{2,}/, $l);
    $pct = $values[4];
    $mem = $values[5];
    open JSTACK, "/tmp/jstack.log" or die $!;   
    while ($j = <JSTACK>){
        if ($j =~ /nid=/){
            # Match the whole nid so that 0x57b does not also match 0x57bc
            if ($j =~ /nid=$hex_pid\b/){
                $j =~ s/\n//;
                $pid =~ s/\n//;
                print $pid . "\t" . $pct . "\t" . $mem . "\t" .  $j . "\n";
            }
        }
    }   
    close JSTACK;
}
close LOG;

The output helps me find out which threads are hogging my CPU:

PID     CPU Mem JStack Info
22460   0   8.0 "main" prio=10 tid=0x083cb800 nid=0x57bc runnable [0xb6acc000]
22461   0   8.0 "GC task thread#0 (ParallelGC)" prio=10 tid=0x083d2c00 nid=0x57bd runnable 
22462   0   8.0 "GC task thread#1 (ParallelGC)" prio=10 tid=0x083d4000 nid=0x57be runnable 
22463   0   8.0 "GC task thread#2 (ParallelGC)" prio=10 tid=0x083d5800 nid=0x57bf runnable 
22464   0   8.0 "GC task thread#3 (ParallelGC)" prio=10 tid=0x083d7000 nid=0x57c0 runnable
...

Then I can go back to /tmp/jstack.log and take a look at the stack trace for the problematic thread and try to figure out what's going on from there. Of course this solution is platform-dependent, but it should work with most flavors of *nix and some tweaking here and there.
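Going back to the jstack log for one thread can also be scripted. Below is a rough Python sketch, assuming the usual dump layout: a header line that starts with the quoted thread name and contains nid=0x..., followed by stack frames up to the next blank line. The function name stack_for_nid is made up for this example, and the sketch stops at the first blank line, so extra sections printed by jstack -l after a blank line are not included.

```python
import re


def stack_for_nid(dump_text, nid):
    """Return the block of a jstack dump whose header line carries the given nid.

    nid is the hex string as printed by jstack, e.g. "0x57bc".
    Returns an empty string when no thread header matches.
    """
    # \b on both sides so that 0x57b does not match 0x57bc
    wanted = re.compile(r"\bnid=%s\b" % re.escape(nid))
    block = []
    collecting = False
    for line in dump_text.splitlines():
        if line.startswith('"'):
            if collecting:
                break  # next thread header reached
            collecting = bool(wanted.search(line))
        elif collecting and not line.strip():
            break  # blank line ends the block
        if collecting:
            block.append(line)
    return "\n".join(block)


# Example use (assumes the log file from the script above exists):
#   with open("/tmp/jstack.log") as f:
#       print(stack_for_nid(f.read(), "0x57bc"))
```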



Answer 3:

I have created a top-like command specifically for visualizing Java threads ordered by CPU usage and posted the source code at: https://github.com/jasta/jprocps. The command-line syntax is not nearly as rich as top, but it does support some of the same commands:

$ jtop -n 1

Sample output (showing ant and IntelliJ running):

  PID   TID USER       %CPU  %MEM  THREAD
13480 13483 jasta      104   2.3   main
13480 13497 jasta      86.3  2.3   C2 CompilerThread1
13480 13496 jasta      83.0  2.3   C2 CompilerThread0
 4866  4953 jasta      1.0   13.4  AWT-EventQueue-1 12.1.4#IC-129.713, eap:false
 4866 14154 jasta      0.9   13.4  ApplicationImpl pooled thread 36
 4866  5219 jasta      0.8   13.4  JobScheduler pool 5/8

From this output, I can pull up the thread's stack trace in jconsole or jstack manually and figure out what's going on.

NOTE: jtop is written in Python and requires that jstack be installed.



Answer 4:

Threads don't have names as far as the kernel is concerned; they only have ID numbers. The JVM assigns names to threads, but that's private internal data within the process, which the "top" program can't access (and doesn't know about anyway).



Answer 5:

With OpenJDK on Linux, JavaThread names don't propagate to native threads, so you cannot see the Java thread name while inspecting native threads with any tool.

However, there is some work in progress:

  • https://bugs.openjdk.java.net/browse/JDK-7102541
  • http://mail.openjdk.java.net/pipermail/hotspot-dev/2012-July/006211.html

Personally, I find OpenJDK development too slow, so I just apply the patches myself.
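Whether a particular JVM build propagates thread names can be checked from outside the process. On Linux, each thread's native name (at most 15 characters) is readable from /proc/<pid>/task/<tid>/comm. Here is a small Python sketch that lists those names; the function name native_thread_names is made up for this example. If the names are not propagated, every entry is expected to show the process name (for example java) rather than the Java thread name.

```python
import os


def native_thread_names(pid):
    """Map thread ID -> native thread name for one process, read from /proc.

    Returns an empty dict when /proc/<pid>/task is not available.
    """
    task_dir = "/proc/%d/task" % pid
    names = {}
    if not os.path.isdir(task_dir):
        return names
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "comm")) as f:
                names[int(tid)] = f.read().strip()
        except (IOError, OSError, ValueError):
            continue  # thread exited in the meantime, or entry unreadable
    return names


if __name__ == "__main__":
    # Inspects the current process; pass a JVM pid instead to check a Java process.
    for tid, name in sorted(native_thread_names(os.getpid()).items()):
        print("%d\t%s" % (tid, name))
```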



Answer 6:

As far as I can tell, jstack is outdated as of JDK 8. What I used to retrieve all Java thread names is:

<JDK_HOME>/bin/jcmd <PID> Thread.print

Check the jcmd documentation for more.
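If the names are needed in a script, the jcmd output can be parsed. The following Python sketch is one way to do it, under some assumptions: jcmd is on the PATH (or its full path is passed in), the caller is allowed to attach to the target JVM, and thread header lines carry the nid in the nid=0x... hex form. If a JDK prints the nid differently, the regex needs adjusting. The names parse_thread_names and thread_names are made up for this example.

```python
import re
import subprocess

# Header lines in a thread dump look like: "name" ... nid=0x57bc ...
HEADER = re.compile(r'^"([^"]*)".*\bnid=(0x[0-9a-f]+)')


def parse_thread_names(dump_text):
    """Return {decimal thread ID: Java thread name} from thread-dump text."""
    names = {}
    for line in dump_text.splitlines():
        m = HEADER.match(line)
        if m:
            names[int(m.group(2), 16)] = m.group(1)
    return names


def thread_names(pid, jcmd="jcmd"):
    """Run `jcmd <pid> Thread.print` and parse its output."""
    out = subprocess.check_output([jcmd, str(pid), "Thread.print"])
    return parse_thread_names(out.decode("utf-8", "replace"))
```

The decimal keys can be compared directly with the IDs that top -H shows in its PID column.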



Answer 7:

This shell script combines the output from jstack and top to list Java threads by CPU usage. It expects one argument, the account user that owns the processes.

Name: jstack-top.sh

#!/bin/sh
#
# jstack-top - join jstack and top to show cpu usage, etc.
#
# Usage: jstack-top <user> | view -
#

USER=$1
TOPS="/tmp/jstack-top-1.log"
JSKS="/tmp/jstack-top-2.log"

PIDS="$(ps -u ${USER} --no-headers -o pid:1,cmd:1 | grep 'bin/java' | grep -v 'grep' | cut -d' ' -f1)"
if [ -f ${JSKS} ]; then
    rm ${JSKS}
fi
for PID in ${PIDS}; do
    jstack -l ${PID} | grep "nid=" >>${JSKS}
done

top -u ${USER} -H -b -n 1 | grep "%CPU\|java" | sed -e 's/[[:space:]]*$//' > ${TOPS}
while IFS= read -r TOP; do
    NID=$(echo "${TOP}" | sed -e 's/^[[:space:]]*//' | cut -d' ' -f1)
    if [ "${NID}" = "PID" ]; then
        JSK=""
        TOP="${TOP} JSTACK"
    else
        NID=$(printf 'nid=0x%x' ${NID})
        JSK=$(grep "${NID} " ${JSKS})
    fi
    echo "${TOP}    ${JSK}"
done < "${TOPS}"


Answer 8:

Old question, but I had just the same problem with top.

It turns out you can scroll top's output to the right simply by using the cursor keys :)

(but unfortunately there won't be any thread name shown)



Answer 9:

You mentioned "Linux". Then using the little tool "threadcpu" might be a solution:

threadcpu - show CPU usage of threads

$ threadcpu -h

threadcpu shows CPU usage of threads in user% and system%

usage:
threadcpu [-h] [-s seconds] [-p path-to-jstack]

options:
  -h display this help page
  -s measuring interval in seconds, default: 10
  -p path to JRE jstack, default: /usr/bin/jstack
example usage:
  threadcpu -s 30 -p /opt/java/bin/jstack 2>/dev/null|sort -n|tail -n 12
output columns:
  user percent <SPACE> system percent <SPACE> PID/NID [ <SPACE> JVM thread name OR (process name) ]

Some sample outputs:

$ threadcpu |sort -n|tail -n 8
3 0 33113 (klzagent)
3 0 38518 (klzagent)
3 0 9874 (BESClient)
3 41 6809 (threadcpu)
3 8 27353 VM Periodic Task Thread
6 0 31913 hybrisHTTP4
21 8 27347 C2 CompilerThread0
50 41 3244 (BESClient)

$ threadcpu |sort -n|tail -n 8
0 20 52358 (threadcpu)
0 40 32 (kswapd0)
2 50 2863 (BESClient)
11 0 31861 Gang worker#0 (Parallel CMS Threads)
11 0 31862 Gang worker#1 (Parallel CMS Threads)
11 0 31863 Gang worker#2 (Parallel CMS Threads)
11 0 31864 Gang worker#3 (Parallel CMS Threads)
47 10 31865 Concurrent Mark-Sweep GC Thread

$ threadcpu |sort -n|tail -n 8
2 0 14311 hybrisHTTP33
2 4 60077 ajp-bio-8009-exec-11609
2 8 30657 (klzagent)
4 0 5661 ajp-bio-8009-exec-11649
11 16 28144 (batchman)
15 20 3485 (BESClient)
21 0 7652 ajp-bio-8009-exec-11655
25 0 7611 ajp-bio-8009-exec-11654

The output is intentionally very simple to make further processing (e.g. for monitoring) easier.
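As an illustration of such further processing, here is a short Python sketch that parses lines in the column layout described in the help text and picks the busiest threads. It assumes the percentages are whole numbers, as in the samples above; the function names parse_threadcpu and busiest are made up for this example.

```python
def parse_threadcpu(lines):
    """Parse threadcpu output lines into (user, system, tid, name) tuples.

    Expected layout per line: user% <SPACE> system% <SPACE> PID/NID [<SPACE> name]
    Lines that do not fit this layout are skipped.
    """
    rows = []
    for line in lines:
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue  # not a data line
        try:
            user, system, tid = (int(p) for p in parts[:3])
        except ValueError:
            continue  # e.g. a shell prompt or other non-numeric line
        name = parts[3].strip() if len(parts) > 3 else ""
        rows.append((user, system, tid, name))
    return rows


def busiest(rows, n=3):
    """Return the n rows with the highest combined user + system percentage."""
    return sorted(rows, key=lambda r: r[0] + r[1], reverse=True)[:n]
```

For monitoring, the output of threadcpu could be piped into a script that calls parse_threadcpu on sys.stdin and raises an alert when the top row crosses a threshold.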



Answer 10:

Expanding on Andre's earlier answer in Perl, here is one in Python that runs significantly faster.

It re-uses files created earlier and does not loop several times over the jstack output:

#!/usr/bin/env python
import re
import sys
import os.path
import subprocess

# Check if jstack.log top.log files are present
if not os.path.exists("jstack.log") or not os.path.exists("top.log"):
  # Delete either file
  os.remove("jstack.log") if os.path.exists("jstack.log") else None
  os.remove("top.log") if os.path.exists("top.log") else None
  # And dump them via a bash run
  cmd = """
  pid=$(ps -e | grep java | sed 's/^[ ]*//g' | cut -d ' ' -f 1)
  top -H -b -n 1 | grep java > top.log
  /usr/intel/pkgs/java/1.8.0.141/bin/jstack -l $pid > jstack.log
  """
  subprocess.call(["bash", "-c", cmd])

# Verify that both files were written
for f in ["jstack.log", "top.log"]:
  if not os.path.exists(f):
    print("ERROR: Failed to create file %s" % f)
    sys.exit(1)

# Thread ID parser
jsReg = re.compile(r'"([^"]*)".*nid=(0x[0-9a-f]*)')
# Top line parser
topReg = re.compile(r'^\s*([0-9]*)(\s+[^\s]*){7}\s+([0-9]+)')

# Scan the entire jstack file for matches and put them into a dict
nids = {}
with open("jstack.log", "r") as jstack:
  matches = (jsReg.search(l) for l in jstack if "nid=0x" in l)
  for m in matches:
    nids[m.group(2)] = m.group(1)

# Print header
print("PID\tNID\tCPU\tTHREAD")
# Scan the top output and emit the matches
with open("top.log", "r") as top:
  matches = (topReg.search(l) for l in top)
  for m in matches:
    # Skip lines the top regex could not parse
    if m is None:
      continue
    # Grab the pid, convert to hex and fetch from NIDS
    pid = int(m.group(1))
    nid = "0x%x" % pid
    tname = nids.get(nid, "<MISSING THREAD>")
    # Grab CPU percent
    pct = int(m.group(3))
    # Emit line
    print("%d\t%s\t%d\t%s" % (pid, nid, pct, tname))