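A SUSE 12 SP3 server running several Java processes suddenly started failing with "java.lang.OutOfMemoryError: unable to create new native thread", even though the server still had plenty of free memory. A Baidu search showed that almost every article on the subject points to the following JVM and operating system parameters:
1. The JVM itself: -Xms, -Xmx, -Xss;
2. System limits: /proc/sys/kernel/pid_max, /proc/sys/kernel/threads-max, max user processes (ulimit -u), /proc/sys/vm/max_map_count
For details, see: JVM中可生成的最大Thread数量 (The Maximum Number of Threads That Can Be Created in a JVM)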
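Unfortunately, even after these parameters had been tuned on the server, the OOM error kept occurring. With the Java processes started and top -H used to watch the system-wide thread count, the count peaked at about 12,250, and then the log reported "unable to create new native thread". At the same time, the system log (/var/log/messages) contained the following entry: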
kernel: cgroup: fork rejected by pids controller in /user.slice/user-1000.slice
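After some investigation, it turned out that the cgroup pids controller uses /sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.max to cap the total number of tasks (processes and threads) that the logged-in user (uid 1000) can start. After running echo 100000 > pids.max in that directory, the MaxThreadsMain test program listed at the end of this post could create close to 100,000 threads, and the problem was solved.
Finally, here is a summary of all the relevant JVM and operating system parameters: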
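As a rough way to see how close a user slice is to this limit, the sketch below reads pids.current and pids.max from the cgroup directory named in the kernel message. The class name PidsCgroupCheck and the helpers parseLimit and readTrimmed are hypothetical, and the default path assumes the cgroup v1 layout seen on this server; a unified (v2) hierarchy would place the files elsewhere.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PidsCgroupCheck {
    // Assumption: cgroup v1 directory for uid 1000, as shown in the kernel message.
    static final String DEFAULT_DIR = "/sys/fs/cgroup/pids/user.slice/user-1000.slice";

    // pids.max holds either a number or the literal "max" (no limit).
    public static long parseLimit(String raw) {
        String value = raw.trim();
        return "max".equals(value) ? Long.MAX_VALUE : Long.parseLong(value);
    }

    // Reads a small control file and strips the trailing newline.
    public static String readTrimmed(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
    }

    public static void main(String[] args) throws IOException {
        // An alternative cgroup directory can be passed as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : DEFAULT_DIR);
        long current = parseLimit(readTrimmed(dir.resolve("pids.current")));
        long max = parseLimit(readTrimmed(dir.resolve("pids.max")));
        System.out.printf("tasks in cgroup: %,d, limit: %s%n",
                current, max == Long.MAX_VALUE ? "max" : String.format("%,d", max));
    }
}
```

Running it as the affected user while the Java processes are up should show the task count approaching the limit shortly before fork requests start being rejected.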
-Xms (initial Java heap size)
-Xmx (maximum Java heap size)
-Xss (the stack size for each thread)
/proc/sys/kernel/pid_max
/proc/sys/kernel/threads-max
max user processes (ulimit -u)
/proc/sys/vm/max_map_count
/sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.max
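To check the operating system limits in one pass, a small report along the following lines may help. It is only a sketch: the class name ThreadLimitReport and the helper softMaxProcesses are hypothetical, the cgroup path is the uid 1000 path from this server, and the ulimit -u value is taken from the "Max processes" row of /proc/self/limits, which reflects the limits of the process running the report.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class ThreadLimitReport {
    // Kernel and cgroup files from the list above; the cgroup path is specific to uid 1000.
    static final String[] LIMIT_FILES = {
        "/proc/sys/kernel/pid_max",
        "/proc/sys/kernel/threads-max",
        "/proc/sys/vm/max_map_count",
        "/sys/fs/cgroup/pids/user.slice/user-1000.slice/pids.max"
    };

    // Extracts the soft limit from the "Max processes" row of /proc/self/limits.
    public static String softMaxProcesses(List<String> limitsLines) {
        String label = "Max processes";
        for (String line : limitsLines) {
            if (line.startsWith(label)) {
                // Remaining columns: soft limit, hard limit, units.
                String[] cols = line.substring(label.length()).trim().split("\\s+");
                return cols[0];
            }
        }
        return "unknown";
    }

    public static void main(String[] args) {
        for (String name : LIMIT_FILES) {
            try {
                byte[] raw = Files.readAllBytes(Paths.get(name));
                System.out.println(name + " = " + new String(raw, StandardCharsets.UTF_8).trim());
            } catch (IOException e) {
                // A missing file usually means a different kernel or cgroup layout.
                System.out.println(name + " = (unreadable: " + e + ")");
            }
        }
        try {
            List<String> lines = Files.readAllLines(Paths.get("/proc/self/limits"), StandardCharsets.UTF_8);
            System.out.println("max user processes (soft) = " + softMaxProcesses(lines));
        } catch (IOException e) {
            System.out.println("max user processes = (unreadable: " + e + ")");
        }
    }
}
```

The smallest of the reported values is the one most likely to be reached first, so it is worth running the report as the same user that starts the Java processes.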
Test program (MaxThreadsMain.java):

import java.util.ArrayList;
import java.util.List;

public class MaxThreadsMain {
    // Number of threads started per batch.
    public static final int BATCH_SIZE = 4000;

    public static void main(String... args) throws InterruptedException {
        List<Thread> threads = new ArrayList<Thread>();
        try {
            // Keep starting batches until a limit is hit or roughly 100,000 threads exist.
            for (int i = 0; i <= 100 * 1000; i += BATCH_SIZE) {
                long start = System.currentTimeMillis();
                addThread(threads, BATCH_SIZE);
                long end = System.currentTimeMillis();
                Thread.sleep(1000);
                long delay = end - start;
                System.out.printf("%,d threads: Time to create %,d threads was %.3f seconds %n",
                        threads.size(), BATCH_SIZE, delay / 1e3);
            }
        } catch (Throwable e) {
            // "unable to create new native thread" arrives here as an OutOfMemoryError.
            System.err.printf("After creating %,d threads, ", threads.size());
            e.printStackTrace();
        }
    }

    // Starts num idle daemon threads and records them in the list.
    private static void addThread(List<Thread> threads, int num) {
        for (int i = 0; i < num; i++) {
            Thread t = new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        // Sleep until interrupted so the thread stays alive but idle.
                        while (!Thread.interrupted()) {
                            Thread.sleep(1000);
                        }
                    } catch (InterruptedException ignored) {
                        // Interrupted while sleeping: let the thread exit.
                    }
                }
            });
            t.setDaemon(true);
            t.setPriority(Thread.MIN_PRIORITY);
            threads.add(t);
            t.start();
        }
    }
}