During a recent round of performance tuning I used a thread pool, defined as follows:
ExecutorService executorService = new ThreadPoolExecutor(threadPoolSize,
threadPoolMaxSize, timeout,
TimeUnit.SECONDS,
new LinkedBlockingQueue<>(30000),
Executors.defaultThreadFactory(),
new ThreadPoolExecutor.AbortPolicy());
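There were 55,000 requests to send. After a little over 30,000 had been sent, no more requests went out, and the CountDownLatch stopped counting down.
The cause is the new LinkedBlockingQueue<>(30000) passed to ThreadPoolExecutor: the blocking queue holds at most 30,000 tasks. Once more tasks were pushed into the pool than the queue and the worker threads could hold, the rejection policy took over. I was using AbortPolicy, which is also ThreadPoolExecutor's default, and it rejects a task by throwing RejectedExecutionException in the submitting thread. The remaining requests, roughly 25,000 of them, therefore never ran, and their count-downs never happened.
Why a little over 30,000 rather than exactly 30,000? A task handed directly to a worker thread does not occupy a queue slot, and the first few requests had already returned and freed their threads, so a few more tasks got in before the 30,000-slot queue filled up.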
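The behaviour described above can be reproduced on a small scale. The following is a minimal sketch, not the original application code: the class name RejectionDemo, the method countAccepted, and the small pool sizes are assumptions made for illustration. Every task blocks until submission has finished, so the number of accepted tasks is exactly the maximum pool size plus the queue capacity.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectionDemo {

    /**
     * Submits up to `total` blocking tasks and returns how many were
     * accepted before AbortPolicy threw RejectedExecutionException.
     * (Hypothetical helper, for illustration only.)
     */
    public static int countAccepted(int core, int max, int queueCapacity, int total) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                core, max, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(queueCapacity),
                Executors.defaultThreadFactory(),
                new ThreadPoolExecutor.AbortPolicy());
        // Tasks wait on this latch, so none of them can finish while we submit.
        CountDownLatch release = new CountDownLatch(1);
        int accepted = 0;
        try {
            for (int i = 0; i < total; i++) {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                accepted++;
            }
        } catch (RejectedExecutionException e) {
            // AbortPolicy throws here, in the submitting thread. If this
            // exception is not handled, the submission loop simply ends.
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();     // accepted tasks still run to completion
        }
        return accepted;
    }

    public static void main(String[] args) {
        // 4 threads + 10 queue slots = 14 accepted tasks; the 15th is rejected.
        System.out.println(countAccepted(2, 4, 10, 100)); // prints 14
    }
}
```

In the real application tasks complete while others are still being submitted, so the count at which rejection starts varies slightly from run to run, which matches the "30,000 and a few" observation.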
===============
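Locating the problem
My initial guesses were:
1. A deadlock had occurred. Use jps -m to find the PID of the JVM (see this article), then jstack PID to check for deadlocks (see this article).
2. The requests took too long and the thread pool's resources were exhausted.
Running jstack PID showed the pool threads in the following state: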
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000080135840> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
--
"pool-1-thread-6" #114 prio=5 os_prio=0 tid=0x00007f54908f5000 nid=0x5f67 waiting on condition [0x00007f5441901000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000080135840> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
--
"pool-1-thread-5" #113 prio=5 os_prio=0 tid=0x00007f54908f3000 nid=0x5f66 waiting on condition [0x00007f5441a02000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000080135840> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
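Looking at this output, I wondered why the pool threads were waiting, and why they were parked on the blocking queue. A worker parked in LinkedBlockingQueue.take() is idle and waiting for its next task, which means the queue was empty and nothing more was being submitted.
That brought me back to the problem described above.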
=========
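Solutions:
1. Change the capacity of the blocking queue so that it can hold all the requests.
2. Split the 55,000 requests into batches, for example 100 per batch, and send the next batch only after all results of the current batch have come back. This also prevents firing 55,000 requests at once and exhausting memory and other resources.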
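The second solution can be sketched roughly as follows. This is only an illustration under assumptions: the class name BatchSubmitDemo, the helper runInBatches, and the trivial tasks that return their own index are hypothetical stand-ins for the real requests. The sketch relies on ExecutorService.invokeAll, which blocks until every task in the given collection has completed, so at most one batch is ever pending in the pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BatchSubmitDemo {

    /**
     * Runs the tasks in batches of `batchSize` and returns the results in
     * task order. The next batch starts only after the current one is done.
     * (Hypothetical helper, for illustration only.)
     */
    public static <T> List<T> runInBatches(ExecutorService pool,
                                           List<Callable<T>> tasks,
                                           int batchSize) {
        List<T> results = new ArrayList<>(tasks.size());
        for (int from = 0; from < tasks.size(); from += batchSize) {
            int to = Math.min(from + batchSize, tasks.size());
            try {
                // invokeAll returns only when every task in the batch has completed.
                for (Future<T> future : pool.invokeAll(tasks.subList(from, to))) {
                    results.add(future.get());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for a batch", e);
            } catch (ExecutionException e) {
                throw new IllegalStateException("a task in the batch failed", e.getCause());
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // The queue only needs room for one batch, not for all 55,000 requests.
        ExecutorService pool = new ThreadPoolExecutor(
                4, 8, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(100),
                Executors.defaultThreadFactory(),
                new ThreadPoolExecutor.AbortPolicy());

        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < 55000; i++) {
            final int id = i;
            tasks.add(() -> id); // placeholder for the real request
        }

        List<Integer> results = runInBatches(pool, tasks, 100);
        pool.shutdown();
        System.out.println(results.size()); // prints 55000
    }
}
```

Because a batch never exceeds the queue capacity, AbortPolicy is never triggered here. The trade-off is that each batch waits for its slowest request before the next batch starts.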