作为ctrl+c、ctrl+v的老手,终于在测试环境遇到了cpu100%的情况,说下排查思路。
-
通过top命令找到发现是mysql的问题
-
查询当前sql正执行的语句
show full processlist
没有找到耗时异常的操作语句
只找到了这么一条语句SELECT * FROM sec_prize_record limit 11230 , 5重复多次show full processlist ,都只见到类似的语句。
SELECT * FROM sec_prize_record limit 16505 , 5猜测是频繁执行查询的原因导致cpu较高
-
执行cpu分析脚本,得到以下内容
[1;31mThe stack of busy(2.1%) thread(20567/0x5057) of java process(1131) of user(root):ee[0m "sentinel-time-tick-thread" #2087 daemon prio=5 os_prio=0 tid=0x00007f8cd8177000 nid=0x5057 waiting on condition [0x00007f8cc6915000] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep(Native Method) at java.lang.Thread.sleep(Thread.java:340) at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:386) at com.alibaba.csp.sentinel.util.TimeUtil$1.run(TimeUtil.java:37) at java.lang.Thread.run(Thread.java:748) [1;31mThe stack of busy(0.5%) thread(1269/0x4f5) of java process(1131) of user(root):ee[0m "[channel=channel-EsAppender] - CheckpointBackUpThread" #107 prio=5 os_prio=0 tid=0x00007f8cf000d000 nid=0x4f5 waiting on condition [0x00007f8cc8b23000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000f0704738> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) [1;31mThe stack of busy(0.1%) thread(12781/0x31ed) of java process(1131) of user(root):ee[0m "XNIO-1 task-29" #5791 prio=5 os_prio=0 tid=0x00007f8d1c0a2000 nid=0x31ed runnable [0x00007f8cc2ef4000] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:171) at java.net.SocketInputStream.read(SocketInputStream.java:141) at com.mysql.cj.protocol.ReadAheadInputStream.fill(ReadAheadInputStream.java:107) at com.mysql.cj.protocol.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:150) at com.mysql.cj.protocol.ReadAheadInputStream.read(ReadAheadInputStream.java:180) - locked <0x00000000f4e42be0> (a com.mysql.cj.protocol.ReadAheadInputStream) at java.io.FilterInputStream.read(FilterInputStream.java:133) at com.mysql.cj.protocol.FullReadInputStream.readFully(FullReadInputStream.java:64) at com.mysql.cj.protocol.a.SimplePacketReader.readHeader(SimplePacketReader.java:63) at com.mysql.cj.protocol.a.SimplePacketReader.readHeader(SimplePacketReader.java:45) at com.mysql.cj.protocol.a.TimeTrackingPacketReader.readHeader(TimeTrackingPacketReader.java:52) at com.mysql.cj.protocol.a.TimeTrackingPacketReader.readHeader(TimeTrackingPacketReader.java:41) at com.mysql.cj.protocol.a.MultiPacketReader.readHeader(MultiPacketReader.java:54) at com.mysql.cj.protocol.a.MultiPacketReader.readHeader(MultiPacketReader.java:44) at com.mysql.cj.protocol.a.NativeProtocol.readMessage(NativeProtocol.java:535) at com.mysql.cj.protocol.a.NativeProtocol.checkErrorMessage(NativeProtocol.java:711) at com.mysql.cj.protocol.a.NativeProtocol.sendCommand(NativeProtocol.java:650) at com.mysql.cj.protocol.a.NativeProtocol.sendQueryPacket(NativeProtocol.java:949) at com.mysql.cj.NativeSession.execSQL(NativeSession.java:1113) at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:930) - locked <0x00000000f4934fc0> (a com.mysql.cj.jdbc.ConnectionImpl) at com.mysql.cj.jdbc.ClientPreparedStatement.execute(ClientPreparedStatement.java:370) - locked <0x00000000f4934fc0> (a com.mysql.cj.jdbc.ConnectionImpl) at com.p6spy.engine.wrapper.PreparedStatementWrapper.execute(PreparedStatementWrapper.java:362) at com.alibaba.druid.pool.DruidPooledPreparedStatement.execute(DruidPooledPreparedStatement.java:497) at org.apache.ibatis.executor.statement.PreparedStatementHandler.query(PreparedStatementHandler.java:64) at org.apache.ibatis.executor.statement.RoutingStatementHandler.query(RoutingStatementHandler.java:79) at sun.reflect.GeneratedMethodAccessor278.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.ibatis.plugin.Plugin.invoke(Plugin.java:63) at com.sun.proxy.$Proxy270.query(Unknown Source) at com.baomidou.mybatisplus.core.executor.MybatisSimpleExecutor.doQuery(MybatisSimpleExecutor.java:67) at org.apache.ibatis.executor.BaseExecutor.queryFromDatabase(BaseExecutor.java:324) at org.apache.ibatis.executor.BaseExecutor.query(BaseExecutor.java:156) at org.apache.ibatis.executor.CachingExecutor.query(CachingExecutor.java:109) at com.github.pagehelper.PageInterceptor.intercept(PageInterceptor.java:143) at org.apache.ibatis.plugin.Plugin.invoke(Plugin.java:61) at com.sun.proxy.$Proxy269.query(Unknown Source) at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:147) at org.apache.ibatis.session.defaults.DefaultSqlSession.selectList(DefaultSqlSession.java:140) at sun.reflect.GeneratedMethodAccessor279.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:426) at com.sun.proxy.$Proxy198.selectList(Unknown Source) at org.mybatis.spring.SqlSessionTemplate.selectList(SqlSessionTemplate.java:223) at com.baomidou.mybatisplus.core.override.MybatisMapperMethod.executeForMany(MybatisMapperMethod.java:158) at com.baomidou.mybatisplus.core.override.MybatisMapperMethod.execute(MybatisMapperMethod.java:76) at com.baomidou.mybatisplus.core.override.MybatisMapperProxy.invoke(MybatisMapperProxy.java:61) at com.sun.proxy.$Proxy203.selectList(Unknown Source) at com.baomidou.mybatisplus.extension.service.impl.ServiceImpl.list(ServiceImpl.java:271) at com.wehotel.service.russian.impl.SecPrizeRecordServiceImpl.multipleQueryByCount(SecPrizeRecordServiceImpl.java:185) at com.wehotel.service.russian.impl.SecPrizeRecordServiceImpl.getListByWrapper(SecPrizeRecordServiceImpl.java:145) at com.wehotel.service.russian.impl.SecPrizeRecordServiceImpl$$FastClassBySpringCGLIB$$62ccc01e.invoke(<generated>) at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:685) at com.wehotel.service.russian.impl.SecPrizeRecordServiceImpl$$EnhancerBySpringCGLIB$$30e58e1.getListByWrapper(<generated>) at com.wehotel.controller.russian.SecPrizeRecordController.downloadRecord(SecPrizeRecordController.java:119) at com.wehotel.controller.russian.SecPrizeRecordController$$FastClassBySpringCGLIB$$df657f26.invoke(<generated>) at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218) ......(以下部分省略)
-
找到了自己写的代码

-
查看该代码写的是啥

-
分析:原来是一段ctrl+v 来的码,只是这里的MAX_QUERY当初为了调试方便,被我设置成了5,而这次要查询的总数大概有10多万条,导致需要多次查询,把这个MAX_QUERY调大点就好了。
-
附上网上找到的排查cpu异常脚本:
#!/bin/bash
# @Function
# Find out the highest cpu consumed threads of java, and print the stack of these threads.
# $ ./show-busy-threads
#ARGS= -p pid
#[ $? -ne 0]
PROG=`basename $0`
count=3
redEcho() {
[ -c /dev/stdout ] && {
echo -ne "\003[1;31m"
echo -n "$@"
echo -e "\0ee[0m"
} || echo "$@"
}
if ! which jstack &> /dev/null; then
[ -n "$JAVA_HOME" ] && [ -f "$JAVA_HOME/bin/jstack" ] && [ -x "$JAVA_HOME/bin/jstack" ] && {
export PATH="$JAVA_HOME/bin:$PATH"
} || {
redEcho "Error: jstack not found on PATH and JAVA_HOME!"
exit 1
}
fi
uuid=`date +%s`_${RANDOM}_$$
cleanupWhenExit() {
rm /tmp/${uuid}_* &> /dev/null
}
trap "cleanupWhenExit" EXIT
printStackOfThread() {
while read threadLine ; do
pid=`echo ${threadLine} | awk '{print $1}'`
threadId=`echo ${threadLine} | awk '{print $2}'`
threadId0x=`printf %x ${threadId}`
user=`echo ${threadLine} | awk '{print $3}'`
pcpu=`echo ${threadLine} | awk '{print $5}'`
jstackFile=/tmp/${uuid}_${pid}
[ ! -f "${jstackFile}" ] && {
jstack ${pid} > ${jstackFile} || {
redEcho "Fail to jstack java process ${pid}!"
rm ${jstackFile}
continue
}
}
redEcho "The stack of busy(${pcpu}%) thread(${threadId}/0x${threadId0x}) of java process(${pid}) of user(${user}):"
sed "/nid=0x${threadId0x}/,/^$/p" -n ${jstackFile}
done
}
[ -z "${pid}" ] && {
ps -Leo pid,lwp,user,comm,pcpu --no-headers | awk '$4=="java"{print $0}' |
sort -k5 -r -n | head --lines "${count}" | printStackOfThread
} || {
ps -Leo pid,lwp,user,comm,pcpu --no-headers | awk -v "pid=${pid}" '$1==pid,$4=="java"{print $0}' |
sort -k5 -r -n | head --lines "${count}" | printStackOfThread
}
本文分享了一次测试环境中遇到CPU使用率高达100%的问题排查经历,通过top命令定位到MySQL问题,深入分析SQL执行情况及代码层面原因,并提供解决思路。

1万+

被折叠的 条评论
为什么被折叠?



