When a Java process consumes too much CPU, the cause may be a bug in the program logic. How do you track down the problem?
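1. Run top to list all processes, then press Shift + P to sort them by CPU usage and spot the heaviest consumers.
2. Note the PID of the offending process and run top -H -p pid to list all of its threads.
3. Press Shift + P again to sort the threads by CPU usage.
4. Note the ID of the offending thread and convert it to hexadecimal, for example with Python: print("%x" % tid). Suppose the result is 76a3.
5. Run jstack pid > t.dat to save the thread stacks, open the file in vi, search for the thread whose nid is 76a3 (it appears as nid=0x76a3), and use the source code to locate the problem.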
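The conversion in step 4 is easy to get wrong, so here is a minimal Python sketch of it. The helper name tid_to_nid is made up for illustration; it formats a decimal thread ID the way jstack shows it in the nid field, minus the 0x prefix.

```python
def tid_to_nid(tid):
    """Format a decimal thread ID as lowercase hex, without the 0x prefix."""
    return "%x" % tid


# 30371 is the thread ID used in the walkthrough in this post.
print(tid_to_nid(30371))  # prints 76a3
```

On the shell, printf '%x\n' 30371 does the same job.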
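The rest of this post walks through an example program that contains an infinite loop. Its jstack thread dump (the t.dat file captured later in the walkthrough) looks like this: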
2015-07-26 19:52:04
Full thread dump OpenJDK 64-Bit Server VM (20.0-b11 mixed mode):

"Attach Listener" daemon prio=10 tid=0x00007fa04c001000 nid=0x7ac9 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"DestroyJavaVM" prio=10 tid=0x00007fa070007000 nid=0x7697 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"ttwo" prio=10 tid=0x00007fa0700b2000 nid=0x76a3 runnable [0x00007fa074260000]
   java.lang.Thread.State: RUNNABLE
        at Main$T2.run(Main.java:31)
        at java.lang.Thread.run(Thread.java:679)

"tone" prio=10 tid=0x00007fa0700b0000 nid=0x76a2 waiting on condition [0x00007fa074361000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at Main$T1.run(Main.java:19)
        at java.lang.Thread.run(Thread.java:679)

"Low Memory Detector" daemon prio=10 tid=0x00007fa070098800 nid=0x76a0 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x00007fa070096000 nid=0x769f waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x00007fa070093800 nid=0x769e waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007fa070085000 nid=0x769d runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007fa070073000 nid=0x769c in Object.wait() [0x00007fa074973000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000ec0b1310> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
        - locked <0x00000000ec0b1310> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

"Reference Handler" daemon prio=10 tid=0x00007fa070071000 nid=0x769b in Object.wait() [0x00007fa074a74000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000ec0b11e8> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:502)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
        - locked <0x00000000ec0b11e8> (a java.lang.ref.Reference$Lock)

"VM Thread" prio=10 tid=0x00007fa07006a800 nid=0x769a runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x00007fa070011800 nid=0x7698 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x00007fa070013800 nid=0x7699 runnable

"VM Periodic Task Thread" prio=10 tid=0x00007fa07009b000 nid=0x76a1 waiting on condition

JNI global references: 865
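The example program starts two threads: T1 (named tone) spends most of its time sleeping and uses very little CPU, while T2 (named ttwo) keeps one CPU core fully busy.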
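First, run top to list the processes and press Shift + P to sort them by CPU usage.
The output shows that the offending process has PID 30358.
Next, run top -H -p 30358 and press Shift + P to sort its threads by CPU usage.
Thread 30371 is using far too much CPU, so it is the thread to investigate.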
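If you would rather not read the top screen by eye, the thread list can also be captured in batch mode (for example top -H -b -n 1 -p 30358) and parsed. The sketch below assumes the default procps column layout with PID and %CPU headers. The function name busiest_thread and the sample rows are illustrative only and are not output from the original post.

```python
def busiest_thread(top_output):
    """Return (thread_id, cpu_percent) for the row with the highest %CPU."""
    lines = top_output.strip().splitlines()
    # Find the column header; the summary area above it varies between systems.
    header_idx = next(
        i for i, line in enumerate(lines)
        if "PID" in line.split() and "%CPU" in line.split()
    )
    columns = lines[header_idx].split()
    pid_col = columns.index("PID")    # in -H mode this column holds thread IDs
    cpu_col = columns.index("%CPU")

    best = None
    for line in lines[header_idx + 1:]:
        fields = line.split()
        if len(fields) <= max(pid_col, cpu_col):
            continue  # skip blank or truncated rows
        row = (int(fields[pid_col]), float(fields[cpu_col]))
        if best is None or row[1] > best[1]:
            best = row
    return best


# Hypothetical sample rows (values invented for illustration).
SAMPLE = """
  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
30371 demo      20   0 2000000  30000  10000 R 99.0  0.4   1:00.00 java
30370 demo      20   0 2000000  30000  10000 S  0.3  0.4   0:00.10 java
"""

print(busiest_thread(SAMPLE))
```

The parser looks up columns by header name rather than by fixed position, since the top field layout can be customized per user.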
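Next, run jstack 30358 > t.dat to save the thread stacks.
Use Python to print 30371 in hexadecimal: print("%x" % 30371) gives 76a3.
Find the thread with nid=0x76a3 in the thread dump and read its stack trace carefully.
In the dump shown above, that entry is the thread named ttwo, which is executing line 31 of Main (Main.java:31). Checking the logic around that line in the source code makes the problem obvious.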
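Searching the dump by hand in vi works, but the lookup can also be scripted. The following is a minimal sketch, assuming the usual jstack layout in which each thread entry starts with a quoted thread name and entries are separated by blank lines. The function name find_thread is made up for illustration, and the sample text is an excerpt of the dump shown earlier in this post.

```python
def find_thread(dump_text, nid_hex):
    """Return the dump entry (header plus stack lines) for the given nid, or None."""
    # Trailing space avoids matching a longer nid that shares the same prefix.
    target = "nid=0x%s " % nid_hex.lower()
    entry = None
    for line in dump_text.splitlines():
        if line.startswith('"'):          # a new thread entry begins
            if entry is not None:
                break                     # the wanted entry is complete
            if target in line + " ":
                entry = [line]
        elif entry is not None:
            if not line.strip():
                break                     # blank line ends the entry
            entry.append(line)
    return "\n".join(entry) if entry is not None else None


SAMPLE_DUMP = '''
"ttwo" prio=10 tid=0x00007fa0700b2000 nid=0x76a3 runnable [0x00007fa074260000]
   java.lang.Thread.State: RUNNABLE
        at Main$T2.run(Main.java:31)
        at java.lang.Thread.run(Thread.java:679)

"tone" prio=10 tid=0x00007fa0700b0000 nid=0x76a2 waiting on condition [0x00007fa074361000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at Main$T1.run(Main.java:19)
        at java.lang.Thread.run(Thread.java:679)
'''

print(find_thread(SAMPLE_DUMP, "%x" % 30371))
```

For a real dump, read the file first, for example with open("t.dat").read(), and pass its contents as dump_text.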
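Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission.
Original article: http://blog.csdn.net/yfkscu/article/details/47070747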