How to collect performance data on Linux
Collect the following information when an IBM Java process shows high CPU consumption:
- If possible, enable garbage collection trace to see whether Java garbage collection is thrashing. To enable garbage collection trace on IBM WebSphere Application Server, refer to the following document: Enabling verbose garbage collection (verbosegc) in WebSphere Application Server
- Run the following command:
top -d delaytime -c -b > top.log
Where delaytime is the number of seconds to delay. This must be 60 seconds or greater, depending on how soon the failure is expected.
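Because top in batch mode keeps writing until it is interrupted, it can help to bound the capture. The following is a minimal sketch, assuming a procps-style top that accepts -n (number of iterations). The helper name top_iterations is hypothetical and not part of the original procedure.

```shell
# top_iterations DELAY_SECONDS DURATION_HOURS
# Hypothetical helper: number of top samples needed to cover the duration.
top_iterations() {
    echo $(( $2 * 3600 / $1 ))
}

# Example, commented out so that sourcing this block does not start top:
# top -d 60 -c -b -n "$(top_iterations 60 24)" > top.log
```

With a 60-second delay, a 24-hour capture needs 1440 samples, so the log size stays predictable.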
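When top shows 99.9% CPU usage, run mpstat -P ALL 5 to check the utilization of each CPU. If a single CPU stays pinned at 100% utilization, the cause is almost certainly an infinite loop.
- Create a script file, vmstat.sh, with the following content: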
#!/bin/sh
# vmstat.sh
# output file name
VMSTAT_LOG=$1
LIMIT=288
# sleep for 5 minutes
SLEEP_TIME=300
while true
do
    i=0
    # truncate the log at the start of each cycle
    echo > "$VMSTAT_LOG"
    while [ "$i" -le "$LIMIT" ]
    do
        date >> "$VMSTAT_LOG"
        vmstat 5 12 >> "$VMSTAT_LOG"
        i=`expr $i + 1`
        sleep "$SLEEP_TIME"
    done
done
- Create a script, ps.sh, with the following content. To target a single process instead, you can change the ps command to ps -Lf <pid>:
#!/bin/sh
# ps.sh
# output file name
PS_LOG=$1
LIMIT=288
# sleep for 5 minutes
SLEEP_TIME=300
while true
do
    i=0
    # truncate the log at the start of each cycle
    echo > "$PS_LOG"
    while [ "$i" -le "$LIMIT" ]
    do
        date >> "$PS_LOG"
        ps -eLf >> "$PS_LOG"
        i=`expr $i + 1`
        sleep "$SLEEP_TIME"
    done
done
- Run the scripts. Neither script exits on its own, so run each one in its own shell or in the background:
./ps.sh ps_eLf.log
./vmstat.sh vmstat.log
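Since both scripts loop until they are terminated, one way to manage them from a single shell is to start them in the background and remember their PIDs. This is only a sketch: the function names and the collectors.pid file are hypothetical, and it assumes ps.sh and vmstat.sh are executable and in the current directory.

```shell
# start_collectors: launch both scripts in the background and
# record their PIDs in collectors.pid (hypothetical file name).
start_collectors() {
    ./ps.sh ps_eLf.log &
    ps_pid=$!
    ./vmstat.sh vmstat.log &
    vmstat_pid=$!
    echo "$ps_pid $vmstat_pid" > collectors.pid
}

# stop_collectors: terminate the scripts recorded by start_collectors.
# A sleep or vmstat child that is still running finishes on its own.
stop_collectors() {
    # Unquoted on purpose so that both PIDs are passed to kill.
    kill $(cat collectors.pid)
    rm -f collectors.pid
}
```

Call start_collectors before reproducing the problem and stop_collectors after the error condition has been captured.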
Notes:
. The scripts ps.sh and vmstat.sh, as provided, roll over every 24 hours.
. You might need to modify the scripts to meet your needs.
. The preceding scripts will run forever. After the error condition is reached, you will have to terminate them.
- When high CPU consumption occurs, collect the following logs:
netstat -an > netstat1.out
- If the Web server is remote, run the following on the Web server system:
netstat -an > netstatwebserver1.out
- Run the following:
kill -3 [PID_of_problem_JVM]
The kill -3 commands create javacore*.txt files.
Note: If you cannot determine which JVM process is experiencing the high CPU usage, issue kill -3 PID for each of the JVM processes.
- Wait two minutes.
- Run the following:
kill -3 [PID_of_problem_JVM]
- Wait two minutes.
- Run the following:
kill -3 [PID_of_problem_JVM]
- Wait two minutes.
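The three kill -3 signals, spaced two minutes apart, can also be sent by one small function. This is a sketch rather than part of the original procedure: the name javacore_series and its optional interval and signal arguments are hypothetical, and the signal argument exists mainly so that the sequence can be rehearsed with signal 0, which only checks that the process exists.

```shell
# javacore_series PID [INTERVAL_SECONDS] [SIGNAL]
# Sends SIGNAL (default 3, SIGQUIT) to PID three times,
# sleeping INTERVAL_SECONDS (default 120) between sends.
javacore_series() {
    pid=$1
    interval=${2:-120}
    sig=${3:-3}
    n=1
    while [ "$n" -le 3 ]; do
        # Stop early if the process is gone or cannot be signalled.
        kill "-$sig" "$pid" || return 1
        if [ "$n" -lt 3 ]; then
            sleep "$interval"
        fi
        n=$((n + 1))
    done
}
```

For example, javacore_series [PID_of_problem_JVM] sends the three signals over four minutes. The final two-minute wait before the second netstat capture is still done by hand.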
- Run the following:
netstat -an > netstat2.out
- If the Web server is remote, run the following on the Web server system:
netstat -an > netstatwebserver2.out
- If you are unable to generate javacore files, then perform the following:
kill -11 [PID_of_problem_JVM]
WARNING: kill -11 will terminate the JVM process, produce a core file, and possibly a javacore.
- Review all output files and collect the following files for IBM Performance Analysis Tool for Java for Linux:
- ps_eLf.log
- javacore*.txt files
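Before opening the files in a tool, the busiest threads of one process can be pulled out of ps_eLf.log directly. The sketch below assumes the usual procps column layout for ps -eLf (UID PID PPID LWP C NLWP STIME TTY TIME CMD), where C is a rough CPU utilization figure. The function name hot_threads is hypothetical.

```shell
# hot_threads PID [COUNT]
# Reads ps -eLf output on stdin and prints the COUNT (default 5)
# busiest threads of PID as "C LWP" pairs, highest C first.
hot_threads() {
    awk -v pid="$1" '$2 == pid { print $5, $4 }' |
        sort -k1,1nr |
        head -n "${2:-5}"
}

# Example, commented out because it needs a collected log:
# hot_threads [PID_of_problem_JVM] 3 < ps_eLf.log
```

Because the log holds many snapshots, a thread that appears near the top in every snapshot is a stronger signal than one that spikes once.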
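When you analyze the javacore files with jca, read the results patiently and carefully, and look for classes from your own project. If the problem is in the code (it almost always is), the culprit is most likely the one you see in the thread analysis of the javacore file. If nothing turns up there, you have a long road ahead.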
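To connect a busy thread from ps_eLf.log with a thread in a javacore, the decimal LWP value usually has to be converted to hexadecimal. This sketch assumes that the javacore lists each thread's native thread ID in hexadecimal, as IBM javacores typically do. The helper name lwp_to_hex is hypothetical.

```shell
# lwp_to_hex LWP
# Prints the decimal thread ID from ps -eLf as an uppercase hex value.
lwp_to_hex() {
    printf '0x%X\n' "$1"
}

# Example, commented out because it needs collected javacore files.
# The search string assumes the "native thread ID:" label; adjust it
# if your javacore uses a different label.
# grep -i "native thread ID:$(lwp_to_hex 6699)" javacore*.txt
```

For instance, LWP 6699 becomes 0x1A2B, which can then be searched for in the thread section of the javacore.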
If you want to analyze the Java thread dumps, download the IBM Thread and Monitor Dump Analyzer for Java (TMDA). TMDA is one of the top alphaWorks technologies and analyzes thread dumps from a Java virtual machine. It is useful for identifying deadlocks, contention, and bottlenecks, and for summarizing the state of threads within the Java virtual machine.
If garbage collection activity seems to be causing performance degradation or high processor time consumption, enable Verbose GC logging with the command line option -verbose:gc. The Verbose GC output is then written to stderr or stdout.
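For a standalone Java program, the option can be passed on the command line and the output captured in a file. This is a minimal sketch, assuming java is on the PATH. The function name run_with_verbosegc and the log file name are hypothetical, and both stdout and stderr are captured because the output may go to either stream.

```shell
# run_with_verbosegc LOGFILE JAVA_ARGS...
# Starts java with -verbose:gc and writes stdout and stderr to LOGFILE.
# Application output ends up in the same file as the GC trace.
run_with_verbosegc() {
    gc_log=$1
    shift
    java -verbose:gc "$@" > "$gc_log" 2>&1
}

# Example, commented out; app.jar is a placeholder:
# run_with_verbosegc verbosegc.log -jar app.jar
```

For WebSphere Application Server, use the server's own configuration as described in the document referenced earlier instead of wrapping the java command.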
If you want to analyze the Java Verbose GC log, download the IBM Pattern Modeling and Analysis Tool for Java Garbage Collector (PMAT). PMAT is one of the top alphaWorks technologies. It parses verbose GC trace, analyzes Java heap usage, and recommends key configurations based on pattern modeling of Java heap usage.