Background
We have a thread pool that looks roughly like this:
ThreadPoolExecutor executor = new ThreadPoolExecutor(
1,
1,
0L,
TimeUnit.MILLISECONDS,
new LinkedBlockingQueue<>(1000),
new ThreadPoolExecutor.CallerRunsPolicy()
);
It ran fine at first, but later on, no matter how tasks were submitted, nothing happened.
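One way to see how a pool configured like this can go quiet: with a single worker, one task that never returns leaves every later task waiting in the queue, and CallerRunsPolicy only kicks in once all 1000 queue slots are full. The following is a minimal sketch of that behaviour, not the original application code; the class name StuckWorkerDemo is hypothetical and a CountDownLatch stands in for the task that hangs.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class StuckWorkerDemo {

    /**
     * Blocks the only worker, submits laterTasks more tasks (keep this below 1000),
     * then releases the worker.
     * Returns {completed while blocked, queued while blocked, completed after release}.
     */
    static int[] run(int laterTasks) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(1000),
                new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger completed = new AtomicInteger();
        try {
            // This task occupies the single worker until release is counted down.
            executor.execute(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            started.await();

            // These submissions return immediately, but the tasks only sit in the queue.
            for (int i = 0; i < laterTasks; i++) {
                executor.execute(completed::incrementAndGet);
            }
            int completedWhileBlocked = completed.get();
            int queuedWhileBlocked = executor.getQueue().size();

            // Unblock the worker and let it drain the queue.
            release.countDown();
            executor.shutdown();
            executor.awaitTermination(5, TimeUnit.SECONDS);
            return new int[] {completedWhileBlocked, queuedWhileBlocked, completed.get()};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int[] r = run(10);
        System.out.println("completed while blocked=" + r[0]
                + ", queued=" + r[1] + ", completed after release=" + r[2]);
    }
}
```

While the worker is blocked, execute() still returns normally and throws nothing, which matches the symptom of submissions having no visible effect.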
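Investigation
My first guess was that something like a slow task was blocking the pool.
I checked the calling code, and there are essentially only two kinds of call sites: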
Method A
executor.execute(() -> { /* do HTTP request */ });
Method B
executor.execute(() -> methodA());
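At first method B looked very suspicious, since it calls method A from inside a pool task, and I even went through the thread pool's scheduling flow, but found nothing wrong there.
The logs showed no related errors either.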
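Why method B turns out to be harmless can be checked with a small sketch. execute() only enqueues the inner task and returns, so the single worker finishes the outer task and then picks up the inner one. If the queue were full, CallerRunsPolicy would run the inner task on the submitting thread rather than block. A nested submission would only deadlock if the outer task waited for the inner task's result, which is not the case here. The class name NestedSubmitDemo, the event strings and the methodA stand-in are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class NestedSubmitDemo {

    /** Runs a "method B" style task that submits a "method A" style task to the same pool. */
    static List<String> run() {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(1000),
                new ThreadPoolExecutor.CallerRunsPolicy());
        List<String> events = new CopyOnWriteArrayList<>();
        CountDownLatch innerDone = new CountDownLatch(1);

        // Stand-in for method A: submits the (simulated) HTTP request to the pool.
        Runnable methodA = () -> executor.execute(() -> {
            events.add("A: request done");
            innerDone.countDown();
        });

        try {
            // Stand-in for method B: a pool task that calls method A.
            executor.execute(() -> {
                events.add("B: start");
                methodA.run(); // only enqueues; does not wait for the inner task
                events.add("B: end");
            });
            if (!innerDone.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("inner task never ran");
            }
            return new ArrayList<>(events);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The inner task runs only after the outer one has returned, so the nesting costs an extra trip through the queue but does not block the worker.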
With no leads in the code, the only option left was to look at the thread states.
Using the jstack tool, I located the problem thread
and found that it was stuck in java.net.SocketInputStream.socketRead0 the whole time.
The thread's full call stack:
Thread xxx: (state = IN_NATIVE)
- java.net.SocketInputStream.socketRead0(java.io.FileDescriptor, byte[], int, int, int) @bci=0 (Compiled frame; information may be imprecise)
- java.net.SocketInputStream.socketRead(java.io.FileDescriptor, byte[], int, int, int) @bci=8, line=116 (Compiled frame)
- java.net.SocketInputStream.read(byte[], int, int, int) @bci=117, line=171 (Compiled frame)
- java.net.SocketInputStream.read(byte[], int, int) @bci=11, line=141 (Compiled frame)
- sun.security.ssl.SSLSocketInputRecord.read(java.io.InputStream, byte[], int, int) @bci=4, line=466 (Compiled frame)
- sun.security.ssl.SSLSocketInputRecord.readHeader() @bci=31, line=460 (Compiled frame)
- sun.security.ssl.SSLSocketInputRecord.decode(java.nio.ByteBuffer[], int, int) @bci=10, line=159 (Compiled frame)
- sun.security.ssl.SSLTransport.decode(sun.security.ssl.TransportContext, java.nio.ByteBuffer[], int, int, java.nio.ByteBuffer[], int, int) @bci=10, line=110 (Compiled frame)
- sun.security.ssl.SSLSocketImpl.decode(java.nio.ByteBuffer) @bci=14, line=1198 (Compiled frame)
- sun.security.ssl.SSLSocketImpl.readHandshakeRecord() @bci=12, line=1107 (Compiled frame)
- sun.security.ssl.SSLSocketImpl.startHandshake(boolean) @bci=122, line=400 (Compiled frame)
- sun.security.ssl.SSLSocketImpl.startHandshake() @bci=2, line=372 (Compiled frame)
- sun.net.www.protocol.https.HttpsClient.afterConnect() @bci=254, line=587 (Interpreted frame)
- sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect() @bci=51, line=185 (Interpreted frame)
- sun.net.www.protocol.https.HttpsURLConnectionImpl.connect() @bci=4, line=167 (Interpreted frame)
- org.springframework.http.client.SimpleBufferingClientHttpRequest.executeInternal(org.springframework.http.HttpHeaders, byte[]) @bci=61, line=76 (Interpreted frame)
- org.springframework.http.client.AbstractBufferingClientHttpRequest.executeInternal(org.springframework.http.HttpHeaders) @bci=27, line=48 (Interpreted frame)
- org.springframework.http.client.AbstractClientHttpRequest.execute() @bci=9, line=53 (Interpreted frame)
- org.springframework.web.client.RestTemplate.doExecute(java.net.URI, org.springframework.http.HttpMethod, org.springframework.web.client.RequestCallback, org.springframework.web.client.ResponseExtractor) @bci=37, line=742 (Interpreted frame)
- org.springframework.web.client.RestTemplate.execute(java.lang.String, org.springframework.http.HttpMethod, org.springframework.web.client.RequestCallback, org.springframework.web.client.ResponseExtractor, java.lang.Object[]) @bci=21, line=677 (Compiled frame)
- org.springframework.web.client.RestTemplate.exchange(java.lang.String, org.springframework.http.HttpMethod, org.springframework.http.HttpEntity, java.lang.Class, java.lang.Object[]) @bci=26, line=586 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1149 (Interpreted frame)
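Searching around, I found that related bug reports have been filed on Oracle's official bug tracker:
(I could not make sense of this first one…)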
https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8075484
https://bugs.java.com/bugdatabase/view_bug.do?bug_id=8233660
I checked my JDK version:
8u261
Emmm… so… the good old restart it is…
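Follow-up
After a while the problem showed up again…
I went back and studied the call chain once more,
and found that SimpleClientHttpRequestFactory, the request factory RestTemplate creates by default, sets no timeout by default, so waits are unbounded.
As a result, the SimpleBufferingClientHttpRequest it builds blocked forever while establishing the connection (in the stack above, in a socket read during the TLS handshake).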
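The effect of a read timeout can be reproduced without Spring, because the stack above shows the request going through the JDK's own HttpsURLConnectionImpl. The sketch below is only an illustration under several assumptions. A local ServerSocket that never answers stands in for the unresponsive remote server, plain HTTP replaces HTTPS, and the class name ReadTimeoutDemo and the timeout values are arbitrary.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URI;
import java.net.URL;

class ReadTimeoutDemo {

    /**
     * Sends a request to a server that never replies.
     * Returns true if the call was aborted by a timeout. Both values must be positive here:
     * a value of 0 means "wait forever", which is the hang described above.
     */
    static boolean timesOut(int connectTimeoutMs, int readTimeoutMs) {
        // The listening socket lets the OS complete the TCP connection,
        // but nothing ever accepts it or writes a response.
        try (ServerSocket silentServer = new ServerSocket(0)) {
            URL url = URI.create("http://127.0.0.1:" + silentServer.getLocalPort() + "/").toURL();
            HttpURLConnection conn = (HttpURLConnection) url.openConnection(Proxy.NO_PROXY);
            conn.setConnectTimeout(connectTimeoutMs); // limit for establishing the connection
            conn.setReadTimeout(readTimeoutMs);       // limit for each blocking read
            try {
                conn.getResponseCode(); // sends the request and waits for a response
                return false;
            } catch (SocketTimeoutException e) {
                return true; // the read gave up instead of blocking forever
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + timesOut(1000, 500));
    }
}
```

For RestTemplate itself, the corresponding settings are setConnectTimeout(int) and setReadTimeout(int) on SimpleClientHttpRequestFactory, both in milliseconds, with the configured factory passed to the RestTemplate constructor. Suitable values depend on the remote service.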