Why does deadlock seem to be everywhere? The process that broke is a background computation job with around thirty NodeThreads. They were all running the same kind of HTTP request; the service they depend on didn't respond in time, the calls timed out after waiting a while, and then the threads all started writing logs. In effect, thirty threads began logging at the same moment, each needing to acquire the lock on the same org.apache.log4j.spi.RootCategory object. In a multi-threaded environment, nothing can really be trusted.
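To see why they all queue up on that one monitor, here is a simplified sketch of what Category.callAppenders does in log4j 1.2, paraphrased from memory rather than copied from the source: every logging call walks the category hierarchy and synchronizes on each category it passes, which for a typical configuration ends at the shared root category.

// Simplified sketch of log4j 1.2's Category.callAppenders (paraphrased, not verbatim).
// The synchronized(c) block is what the thread dump reports as
// "locked <...> (a org.apache.log4j.spi.RootCategory)".
public void callAppenders(LoggingEvent event) {
    int writes = 0;
    // Walk from this category up towards the root.
    for (Category c = this; c != null; c = c.parent) {
        synchronized (c) {                       // one monitor per category, shared by every thread
            if (c.aai != null) {                 // aai: the appenders attached to this category
                writes += c.aai.appendLoopOnAppenders(event); // formats, writes and flushes the event
            }
            if (!c.additive) {
                break;
            }
        }
    }
    if (writes == 0) {
        repository.emitNoAppenderWarning(this);
    }
}

Note that the actual write and flush to the appender happen while the category's monitor is held, so one slow flush stalls every other thread that wants to log through the same category.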
Plenty of people online report that this is an ancient bug in the log4j 1.2.x line, still unfixed to this day... Apache says 1.3 is a discontinued branch and 2.0 is experimental. Sigh, none of the options feel trustworthy.
Someone else pointed out that some of these deadlocks come from broken application-level logging API usage rather than from log4j itself. That person's code really was bizarre, but it's a different problem from the one I ran into, so I won't dig into it here. http://mail-archives.apache.org/mod_mbox/logging-log4j-user/200810.mbox/%3C48FC8556.7020804@corporater.com%3E
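For illustration only, here is a hypothetical toy program that mimics the situation above: thirty worker threads blocked on a slow dependency, timing out together and then all logging through the same root category at once. The class and variable names are invented for this sketch and are not from the real dispatcher code.

import org.apache.log4j.BasicConfigurator;
import org.apache.log4j.Logger;

// Hypothetical toy reproducing the contention pattern: thirty workers time out on a
// slow dependency and all hit the root category's monitor at the same moment.
public class SlowDependencyDemo {
    private static final Logger log = Logger.getLogger(SlowDependencyDemo.class);

    public static void main(String[] args) {
        BasicConfigurator.configure();            // plain ConsoleAppender on the root category
        for (int i = 0; i < 30; i++) {
            final int id = i;
            new Thread(new Runnable() {
                public void run() {
                    try {
                        Thread.sleep(5000);       // stand-in for the HTTP call that times out
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                    // All thirty threads arrive here at roughly the same time and queue up
                    // on the root category's monitor inside Category.callAppenders().
                    log.info("worker " + id + " timed out waiting for the dependency");
                }
            }).start();
        }
    }
}

A jstack dump of this toy while the threads are logging shows the same shape as the traces below: one RUNNABLE thread holding the RootCategory monitor while it writes, and the rest BLOCKED waiting for it.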
-----------------------------------------------------------------------------------------------------------------------------------------------------------
The stack trace tells the truth.
One thread, holding the root category's monitor while it flushes to the ConsoleAppender:
"Thread-2174" daemon prio=10 tid=0x00000000506a3000 nid=0x7a2 runnable [0x00000000452f8000]
java.lang.Thread.State: RUNNABLE
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:260)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
- locked <0x00002aaab3611fb0> (a java.io.BufferedOutputStream)
at java.io.PrintStream.write(PrintStream.java:430)
- locked <0x00002aaab3611f78> (a java.io.PrintStream)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:272)
at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:276)
at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:122)
- locked <0x00002aaab589f280> (a java.io.OutputStreamWriter)
at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:212)
at org.apache.log4j.helpers.QuietWriter.flush(QuietWriter.java:49)
at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:306)
at org.apache.log4j.WriterAppender.append(WriterAppender.java:150)
at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:221)
- locked <0x00002aaab589e658> (a org.apache.log4j.ConsoleAppender)
at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:57)
at org.apache.log4j.Category.callAppenders(Category.java:187)
- locked <0x00002aaab5838af0> (a org.apache.log4j.spi.RootCategory)
at org.apache.log4j.Category.forcedLog(Category.java:372)
at org.apache.log4j.Category.info(Category.java:674)
at com.wps.www.oaas.ftr.dispatcher.task.XLiveIndexWorker_Optimize.doIndex(XLiveIndexWorker_Optimize.java:278)
at com.wps.www.oaas.ftr.dispatcher.task.XLiveIndexWorker_Optimize.doIndexes(XLiveIndexWorker_Optimize.java:185)
at com.wps.www.oaas.ftr.dispatcher.task.XLiveTaskWorker.doWork(XLiveTaskWorker.java:23)
at com.wps.www.oaas.ftr.dispatcher.node.NodeThread.doWork(NodeThread.java:256)
at com.wps.www.oaas.ftr.dispatcher.node.NodeThread.run(NodeThread.java:185)
The other threads, every one of them waiting for the monitor that Thread-2174 holds:
"Thread-2169" daemon prio=10 tid=0x0000000050ba0000 nid=0x6480 waiting for monitor entry [0x0000000044bf1000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.log4j.Category.callAppenders(Category.java:185)
- waiting to lock <0x00002aaab5838af0> (a org.apache.log4j.spi.RootCategory)
at org.apache.log4j.Category.forcedLog(Category.java:372)
at org.apache.log4j.Category.info(Category.java:674)
at com.wps.www.oaas.ftr.dispatcher.task.XLiveIndexWorker_Optimize.doIndex(XLiveIndexWorker_Optimize.java:278)
at com.wps.www.oaas.ftr.dispatcher.task.XLiveIndexWorker_Optimize.doIndexes(XLiveIndexWorker_Optimize.java:185)
at com.wps.www.oaas.ftr.dispatcher.task.XLiveTaskWorker.doWork(XLiveTaskWorker.java:23)
at com.wps.www.oaas.ftr.dispatcher.node.NodeThread.doWork(NodeThread.java:256)
at com.wps.www.oaas.ftr.dispatcher.node.NodeThread.run(NodeThread.java:185)