Troubleshooting a flood of "ConnectionPoolTimeoutException: Timeout waiting for connection" errors from the HttpClient connection pool...

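This post walks through troubleshooting a ConnectionPoolTimeoutException raised while using HttpClient. By analyzing the server logs and the Linux network state, the problem was traced to the HttpClient connection pool configuration and to incorrect handling in the calling code. Adjusting the MAX_ROUTE_CONNECTIONS parameter in HttpConnectionManager and fixing the exception handling around HttpGet, so that the connection is closed properly on a non-200 status, resolved the server performance problem caused by too many connections in the CLOSE_WAIT state.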

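Today I fixed an HttpClient problem, and it made me sweat: the slightest carelessness with HttpClient can be catastrophic.

An earlier post covers a server failure caused by a misconfigured route setting: http://blog.youkuaiyun.com/shootyou/article/details/6415248

The HttpConnectionManager implementation from that post is the one I use here.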

 

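Symptoms:

The Tomcat logs were full of this exception:

org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection

After a while Tomcat could no longer handle any other requests: it went from appearing hung to being truly dead.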

 

Running this on the Linux server:

netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'

showed that the number of connections in CLOSE_WAIT stayed above 400 and never dropped.
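For readers who would rather do the same tally inside a Java monitoring job, here is a minimal sketch of what the awk one-liner computes. The class name `TcpStateCounter` and the sample addresses are made up for illustration; the input is assumed to be the text lines printed by `netstat -n`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative helper (not from the original post): tallies TCP states from netstat-style lines. */
class TcpStateCounter {

    /** Returns state -> count for every line starting with "tcp"; the state is the last column. */
    static Map<String, Integer> countStates(List<String> netstatLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : netstatLines) {
            String trimmed = line.trim();
            if (!trimmed.startsWith("tcp")) {
                continue; // same filter as /^tcp/ in the awk script
            }
            String[] columns = trimmed.split("\\s+");
            String state = columns[columns.length - 1]; // same as $NF in awk
            counts.merge(state, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample lines in the shape of `netstat -n` output.
        List<String> sample = Arrays.asList(
                "Proto Recv-Q Send-Q Local Address Foreign Address State",
                "tcp 0 0 10.0.0.1:41000 10.0.0.2:80 CLOSE_WAIT",
                "tcp 0 0 10.0.0.1:41001 10.0.0.2:80 CLOSE_WAIT",
                "tcp 0 0 10.0.0.1:41002 10.0.0.2:80 TIME_WAIT");
        System.out.println(countStates(sample));
    }
}
```

A CLOSE_WAIT count that stays flat near a pool limit, as in this incident, is the pattern worth alerting on.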

 

 

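Analysis:

At first I had complete faith in how I was using HttpClient and did not believe the exception came from there.

So I started from the TCP connection states and tried to guess what could cause the exception. I had often seen server problems caused by too many TIME_WAIT connections; those are easy to fix by adjusting a few sysctl settings. This time it was CLOSE_WAIT, which is a completely different matter.

I will write a separate post with my understanding of the difference between TIME_WAIT and CLOSE_WAIT and how to handle each.

Put simply, a large number of CLOSE_WAIT connections means that passive closes are not being handled properly.

Take this scenario: server A requests a file from Apache on server B. Normally, if the request succeeds, server A closes the connection itself once it has fetched the resource. That is an active close, and on A the connection shows up as TIME_WAIT. What if something goes wrong? Suppose the requested resource does not exist on server B. Then server B is the one that initiates the close, and server A is closed passively. If server A, after receiving that close, never releases (closes) the connection itself, the connection stays in CLOSE_WAIT on A.

So clearly the problem is in the program after all.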
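The passive-close scenario can be reproduced on one machine with plain sockets. The sketch below is an illustration under assumptions, not code from the original system: the class name `CloseWaitDemo` is hypothetical, and a loopback `ServerSocket` plays the role of server B.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Illustrative loopback demo of a passive close (hypothetical class, not from the original post). */
class CloseWaitDemo {

    /** Lets the "server B" side close first and returns what the "server A" side then reads. */
    static int readAfterPeerClose() {
        try (ServerSocket serverB = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket clientA = new Socket(serverB.getInetAddress(), serverB.getLocalPort())) {
            Socket accepted = serverB.accept();
            accepted.close(); // B closes first, so A is the passive side
            // A sees end-of-stream (-1) because the peer has closed its end.
            int result = clientA.getInputStream().read();
            // From here until clientA.close() runs, A's socket sits in CLOSE_WAIT.
            // try-with-resources closes it; code that forgets to do so leaks the state.
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read() after peer close returned " + readAfterPeerClose());
    }
}
```

Adding a sleep before the client socket is closed and running the netstat command during it should show the connection in CLOSE_WAIT on the client side.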

 

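First, my HttpConnectionManager implementation:

import org.apache.http.client.HttpClient;
import org.apache.http.conn.ClientConnectionManager;
import org.apache.http.conn.params.ConnManagerParams;
import org.apache.http.conn.params.ConnPerRouteBean;
import org.apache.http.conn.scheme.PlainSocketFactory;
import org.apache.http.conn.scheme.Scheme;
import org.apache.http.conn.scheme.SchemeRegistry;
import org.apache.http.conn.ssl.SSLSocketFactory;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager;
import org.apache.http.params.BasicHttpParams;
import org.apache.http.params.HttpConnectionParams;
import org.apache.http.params.HttpParams;

public class HttpConnectionManager {

    private static HttpParams httpParams;
    private static ClientConnectionManager connectionManager;

    /** Maximum total number of connections */
    public final static int MAX_TOTAL_CONNECTIONS = 800;
    /** Maximum time to wait for a connection from the pool (ms) */
    public final static int WAIT_TIMEOUT = 60000;
    /** Maximum number of connections per route */
    public final static int MAX_ROUTE_CONNECTIONS = 400;
    /** Connect timeout (ms) */
    public final static int CONNECT_TIMEOUT = 10000;
    /** Read timeout (ms) */
    public final static int READ_TIMEOUT = 10000;

    static {
        httpParams = new BasicHttpParams();
        // Set the maximum total number of connections
        ConnManagerParams.setMaxTotalConnections(httpParams, MAX_TOTAL_CONNECTIONS);
        // Set the maximum time to wait for a connection from the pool
        ConnManagerParams.setTimeout(httpParams, WAIT_TIMEOUT);
        // Set the maximum number of connections per route
        ConnPerRouteBean connPerRoute = new ConnPerRouteBean(MAX_ROUTE_CONNECTIONS);
        ConnManagerParams.setMaxConnectionsPerRoute(httpParams, connPerRoute);
        // Set the connect timeout
        HttpConnectionParams.setConnectionTimeout(httpParams, CONNECT_TIMEOUT);
        // Set the read timeout
        HttpConnectionParams.setSoTimeout(httpParams, READ_TIMEOUT);

        SchemeRegistry registry = new SchemeRegistry();
        registry.register(new Scheme("http", PlainSocketFactory.getSocketFactory(), 80));
        registry.register(new Scheme("https", SSLSocketFactory.getSocketFactory(), 443));

        connectionManager = new ThreadSafeClientConnManager(httpParams, registry);
    }

    public static HttpClient getHttpClient() {
        return new DefaultHttpClient(connectionManager, httpParams);
    }
}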

 

 

Notice that MAX_ROUTE_CONNECTIONS is exactly 400, very close to the CLOSE_WAIT count. Coincidence? Read on.

Now the code that calls it:

 

public static String readNet(String urlPath) {
    StringBuffer sb = new StringBuffer();
    HttpClient client = null;
    InputStream in = null;
    InputStreamReader isr = null;
    try {
        client = HttpConnectionManager.getHttpClient();
        HttpGet get = new HttpGet();
        get.setURI(new URI(urlPath));
        HttpResponse response = client.execute(get);
        if (response.getStatusLine().getStatusCode() != 200) {
            return null;
        }
        HttpEntity entity = response.getEntity();
        if (entity != null) {
            in = entity.getContent();
            .....
        }
        return sb.toString();
    } catch (Exception e) {
        e.printStackTrace();
        return null;
    } finally {
        if (isr != null) {
            try {
                isr.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        if (in != null) {
            try {
                in.close(); // closing the content stream is what releases the connection
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}

It is simple: a method that reads a remote Chinese-language page. Note that the following snippet was added later by a colleague. It looks harmless and is meant to handle non-200 responses:

 

 

if (response.getStatusLine().getStatusCode() != 200) {
    return null;
}

The code itself is fine; the problem is where it was placed. Written this way, there would be no problem:

 

 

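client = HttpConnectionManager.getHttpClient();
HttpGet get = new HttpGet();
get.setURI(new URI(urlPath));
HttpResponse response = client.execute(get);
HttpEntity entity = response.getEntity();
if (entity != null) {
    in = entity.getContent();
    ..........
}
if (response.getStatusLine().getStatusCode() != 200) {
    return null;
}
return sb.toString();

See the flaw? In my introductory post ("HttpClient 4.x upgrade primer + using the HTTP connection pool") I mentioned that HttpClient 4 relies on the familiar InputStream.close() to confirm that a connection is released. In the earlier version, InputStream in is never assigned on a non-200 response, so in.close() is never called and that connection stays stuck in the pool forever. Scary. That is why the CLOSE_WAIT count sat at 400: every connection allowed for that one route had been taken over by stuck connections.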
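To make the exhaustion mechanism concrete, here is a toy model of a per-route pool built on a `Semaphore`. It is a simulation under assumptions, not HttpClient code: `LeakyPoolDemo`, `buggyRequest`, and the pool size of 3 are invented for illustration, and a `false` return stands in for ConnectionPoolTimeoutException.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy model of a per-route connection pool (hypothetical, for illustration only). */
class LeakyPoolDemo {
    private final Semaphore slots;

    LeakyPoolDemo(int maxPerRoute) {
        this.slots = new Semaphore(maxPerRoute);
    }

    /**
     * Mimics the buggy readNet: leases a slot, then returns early on a non-200
     * status without releasing it. Returns false when no slot became free within
     * the wait timeout, the analogue of "Timeout waiting for connection".
     */
    boolean buggyRequest(int status, long waitTimeoutMillis) {
        if (!lease(waitTimeoutMillis)) {
            return false;
        }
        if (status != 200) {
            return true; // BUG: early return, the slot is never released
        }
        slots.release(); // normal path: closing the stream frees the slot
        return true;
    }

    int freeSlots() {
        return slots.availablePermits();
    }

    private boolean lease(long waitTimeoutMillis) {
        try {
            return slots.tryAcquire(waitTimeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        LeakyPoolDemo pool = new LeakyPoolDemo(3);
        for (int i = 1; i <= 4; i++) {
            boolean leased = pool.buggyRequest(404, 50);
            System.out.println("request " + i + " got a connection: " + leased
                    + ", free slots: " + pool.freeSlots());
        }
    }
}
```

With a limit of 3, three non-200 responses are enough to leave no free slots, and every later request times out even if it would have returned 200. Scaled up to 400 per route, that matches the symptom described above.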

 

That code had one more weakness: the exception handling was not rigorous enough. So in the end I changed it to this:

 

public static String readNet(String urlPath) {
    StringBuffer sb = new StringBuffer();
    HttpClient client = null;
    InputStream in = null;
    InputStreamReader isr = null;
    HttpGet get = new HttpGet();
    try {
        client = HttpConnectionManager.getHttpClient();
        get.setURI(new URI(urlPath));
        HttpResponse response = client.execute(get);
        if (response.getStatusLine().getStatusCode() != 200) {
            get.abort();
            return null;
        }
        HttpEntity entity = response.getEntity();
        if (entity != null) {
            in = entity.getContent();
            ......
        }
        return sb.toString();
    } catch (Exception e) {
        get.abort();
        e.printStackTrace();
        return null;
    } finally {
        if (isr != null) {
            try {
                isr.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        if (in != null) {
            try {
                in.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}

 

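Here HttpGet's abort() is called explicitly, which terminates the connection immediately. We should call it explicitly whenever an exception occurs, because nobody can guarantee that the exception is thrown only after InputStream in has been assigned.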
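One way to avoid repeating the abort call on each failure branch is to decide in a single finally block. The sketch below shows that pattern in isolation and is not the code used in this post: `AbortOnFailure` and its `Abortable` interface are hypothetical stand-ins (`Abortable` plays the role of HttpGet), and a null result is assumed to mean "non-200".

```java
import java.util.function.Supplier;

/** Hypothetical helper showing "abort unless the request completed", independent of HttpClient. */
class AbortOnFailure {

    /** Stand-in for a request object that can be aborted, such as HttpGet. */
    interface Abortable {
        void abort();
    }

    /**
     * Runs the body and returns its result. The request is aborted on every exit
     * path where the body did not finish with a non-null result: an exception
     * thrown at any point, or a null result (used here to model a non-200 response).
     */
    static String run(Abortable request, Supplier<String> body) {
        boolean completed = false;
        try {
            String result = body.get();
            completed = (result != null);
            return result;
        } catch (RuntimeException e) {
            return null; // mirrors the post's choice of returning null on failure
        } finally {
            if (!completed) {
                request.abort(); // single place that covers all failure paths
            }
        }
    }

    public static void main(String[] args) {
        String result = run(() -> System.out.println("aborted"), () -> {
            throw new IllegalStateException("failed before any stream was opened");
        });
        System.out.println("result: " + result);
    }
}
```

The design choice is that success has to be recorded explicitly, so a newly added early return or an unexpected exception falls through to the abort by default.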

 

That wraps up the analysis. Tomorrow I plan to summarize the difference between CLOSE_WAIT and TIME_WAIT.
