Concurrent Programming Study: A Prelude to Thread Pools

Article link: https://blog.youkuaiyun.com/q_all_is_well/article/details/108094354

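This article introduces Java 8 parallel streams and walks through five ways to run operations in parallel: plain threads, `Executors.newFixedThreadPool`, a custom `ForkJoinPool`, a parallel stream on `ForkJoinPool.commonPool()`, and `CompletableFuture`. It discusses how ForkJoinPool differs from ThreadPoolExecutor and points out which of these approaches are common in business code. It stresses that the parallelism must be set correctly to avoid hurting performance, and announces an upcoming thread pool summary.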


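I have not read many books since starting my job. After some soul-searching, I decided to reread the classic Java books, starting with Java 8 in Action. When I reached the part on parallel streams, I decided to set aside time to tie the scattered pieces of multithreading knowledge into one thread and study them properly. This is the first post.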

Java 8 Parallel Streams

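Java 8 provides parallel streams: calling parallel() turns a Stream into a parallel one whose operations are submitted to a thread pool. For example, the following code uses a thread pool to process the numbers 1 through 100 in parallel: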

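IntStream.rangeClosed(1, 100).parallel().forEach(i -> {
    System.out.println(LocalDateTime.now() + ":" + i);
    try {
        Thread.sleep(100);
    } catch (InterruptedException e) {
        // Restore the interrupt flag instead of silently swallowing the exception
        Thread.currentThread().interrupt();
    }
});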

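Parallel streams do not guarantee execution order. Because each element takes 100 milliseconds to process (the sleep in the code above), you can see the numbers being printed in batches whose size roughly matches the number of CPU cores on the machine.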
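
To see which threads a parallel stream like the one above actually runs on, a minimal sketch such as the following can help. The class and method names (`ParallelStreamThreads`, `workerNames`) are made up for illustration. On a typical JVM the output includes `main` plus several `ForkJoinPool.commonPool-worker-N` names, though the exact set varies by machine and by run.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

class ParallelStreamThreads {
    // Runs a parallel stream over 1..n and returns the names of the
    // threads that processed at least one element.
    static Set<String> workerNames(int n) {
        Set<String> names = ConcurrentHashMap.newKeySet();
        IntStream.rangeClosed(1, n).parallel()
                .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("available cores: " + Runtime.getRuntime().availableProcessors());
        workerNames(100).forEach(System.out::println);
    }
}
```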
To check that parallel execution really works as intended, consider the following scenario:

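Use 20 threads (threadCount) to perform a total of 10000 operations (taskCount) in parallel. A single task takes 10 milliseconds on one thread (task code below), so one thread handles 100 operations per second and 20 threads reach 2000 QPS. Completing 10000 operations therefore takes at least 5 seconds.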

    public void increment(AtomicInteger atomicInteger){
        atomicInteger.incrementAndGet();
        try {
            TimeUnit.MILLISECONDS.sleep(10);
        }catch (InterruptedException e){ 
            e.printStackTrace();
        }
    }
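
The arithmetic behind the 5-second lower bound can be written out as a small sketch. The class and method names here are hypothetical, and the estimate assumes tasks are spread evenly across threads with no scheduling overhead.

```java
class ThroughputEstimate {
    // Operations per second across all threads, given the per-task latency.
    static long qps(int threadCount, long taskMillis) {
        return threadCount * (1000 / taskMillis);
    }

    // Lower bound on wall-clock time in milliseconds: each thread runs
    // ceil(taskCount / threadCount) tasks back to back.
    static long minElapsedMillis(int taskCount, int threadCount, long taskMillis) {
        long tasksPerThread = (taskCount + threadCount - 1) / threadCount;
        return tasksPerThread * taskMillis;
    }

    public static void main(String[] args) {
        System.out.println("QPS: " + qps(20, 10));
        System.out.println("min elapsed ms: " + minElapsedMillis(10000, 20, 10));
    }
}
```

With the numbers from the scenario (20 threads, 10000 tasks, 10 ms each) this gives 2000 QPS and 5000 ms.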

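The first approach uses plain threads. The tasks are split evenly by the number of threads and handed to separate threads, and a CountDownLatch blocks the main thread until every thread has finished. With this approach we have to split the work ourselves: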

    private int thread(int taskCount, int threadCount) throws InterruptedException {
        // Counter for the total number of operations
        AtomicInteger atomicInteger = new AtomicInteger();
        // Use a CountDownLatch to wait for all threads to finish
        CountDownLatch countDownLatch = new CountDownLatch(threadCount);
        // Use IntStream to turn each number directly into a Thread
        IntStream.rangeClosed(1, threadCount).mapToObj(i -> new Thread(() -> {
            // Manually split taskCount into threadCount shares, one share per thread
            IntStream.rangeClosed(1, taskCount / threadCount).forEach(j -> increment(atomicInteger));
            // Each thread counts down once after finishing its share
            countDownLatch.countDown();
        })).forEach(Thread::start);
        // Wait until all threads have finished
        countDownLatch.await();
        // Read the current counter value
        return atomicInteger.get();
    }
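
One thing to watch with manual splitting: `taskCount / threadCount` is integer division, so when taskCount is not a multiple of threadCount the remainder is silently dropped (10 tasks on 3 threads would run only 9). A possible way to distribute the remainder is sketched below; `TaskSplit` and `chunkSizes` are illustrative names, not part of the original code.

```java
import java.util.Arrays;

class TaskSplit {
    // Splits taskCount into threadCount chunk sizes that differ by at most one,
    // so that the sizes always add up to taskCount.
    static int[] chunkSizes(int taskCount, int threadCount) {
        int[] sizes = new int[threadCount];
        int base = taskCount / threadCount;
        int remainder = taskCount % threadCount;
        for (int i = 0; i < threadCount; i++) {
            // The first `remainder` threads take one extra task each
            sizes[i] = base + (i < remainder ? 1 : 0);
        }
        return sizes;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(chunkSizes(10, 3))); // [4, 3, 3]
    }
}
```

Each thread would then loop over its own chunk size instead of the shared `taskCount / threadCount`.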

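The second approach uses Executors.newFixedThreadPool to obtain a thread pool with a fixed number of threads, submits every task to the pool with execute, and finally shuts the pool down and waits for all tasks to finish.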

    private int threadPool(int taskCount, int threadCount) throws InterruptedException {
        // Counter for the total number of operations
        AtomicInteger atomicInteger = new AtomicInteger();
        // Create a thread pool with threadCount threads
        ExecutorService executorService = Executors.newFixedThreadPool(threadCount);
        // Submit every task directly to the thread pool
        IntStream.rangeClosed(1, taskCount).forEach(i -> executorService.execute(() -> increment(atomicInteger)));
        // Request shutdown, then wait for all previously submitted tasks to finish
        executorService.shutdown();
        executorService.awaitTermination(1, TimeUnit.HOURS);
        // Read the current counter value
        return atomicInteger.get();
    }
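
For reference, `Executors.newFixedThreadPool(n)` is a thin wrapper over a `ThreadPoolExecutor` with n core threads, n maximum threads, and one unbounded `LinkedBlockingQueue` shared by all workers. The sketch below builds that pool explicitly; the class name and the trivial counting task are assumptions made for the example, and the timeout is shortened to one minute.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ExplicitFixedPool {
    // Same configuration that Executors.newFixedThreadPool(threadCount) uses
    static ExecutorService fixedPool(int threadCount) {
        return new ThreadPoolExecutor(threadCount, threadCount,
                0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    static int run(int taskCount, int threadCount) {
        AtomicInteger counter = new AtomicInteger();
        ExecutorService pool = fixedPool(threadCount);
        for (int i = 0; i < taskCount; i++) {
            pool.execute(counter::incrementAndGet);
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(1000, 4));
    }
}
```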

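The third approach runs the tasks on a ForkJoinPool instead of an ordinary thread pool.
ForkJoinPool differs from the traditional ThreadPoolExecutor in that the former has n independent queues for a parallelism of n, while the latter uses a single shared queue. With a large number of short-running tasks, the single queue of ThreadPoolExecutor can become a bottleneck, and ForkJoinPool tends to perform better.
ForkJoinPool is therefore better suited to splitting one large task into many small tasks that run in parallel, while ThreadPoolExecutor suits many independent tasks running concurrently.
We first define a ForkJoinPool with a specified parallelism, then run the parallel operation through it: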

    private int forkjoin(int taskCount, int threadCount) throws InterruptedException {
        // Counter for the total number of operations
        AtomicInteger atomicInteger = new AtomicInteger();
        // Create a custom ForkJoinPool with parallelism = threadCount
        ForkJoinPool forkJoinPool = new ForkJoinPool(threadCount);
        // Submit the whole parallel stream to the custom pool as a single task
        forkJoinPool.execute(() -> IntStream.rangeClosed(1, taskCount).parallel().forEach(i -> increment(atomicInteger)));
        // Request shutdown, then wait for all previously submitted tasks to finish
        forkJoinPool.shutdown();
        forkJoinPool.awaitTermination(1, TimeUnit.HOURS);
        // Read the current counter value
        return atomicInteger.get();
    }
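
Running a parallel stream inside a custom ForkJoinPool relies on an implementation detail: subtasks forked from a pool worker stay in that worker's pool. This is widely used but, as far as I know, not guaranteed by the Stream specification. The sketch below is one way to check the behavior on a given JDK; `CustomPoolCheck` and `processedInPool` are hypothetical names.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinWorkerThread;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

class CustomPoolCheck {
    // Returns how many of the n elements were processed by a worker
    // thread that belongs to the custom pool.
    static int processedInPool(int n, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        AtomicInteger inPool = new AtomicInteger();
        try {
            pool.submit(() -> IntStream.rangeClosed(1, n).parallel().forEach(i -> {
                Thread t = Thread.currentThread();
                if (t instanceof ForkJoinWorkerThread
                        && ((ForkJoinWorkerThread) t).getPool() == pool) {
                    inPool.incrementAndGet();
                }
            })).join(); // wait for the whole stream to finish
        } finally {
            pool.shutdown();
        }
        return inPool.get();
    }

    public static void main(String[] args) {
        System.out.println("processed in custom pool: " + processedInPool(200, 4) + " of 200");
    }
}
```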

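The fourth approach uses a parallel stream directly. Parallel streams run on the common ForkJoinPool, that is, ForkJoinPool.commonPool().
The common ForkJoinPool defaults to a parallelism of (number of CPU cores - 1), because for CPU-bound tasks there is no point in allocating more threads than there are CPUs. A parallel stream also uses the calling thread (the main thread here) to run tasks, which occupies one core as well, so the common pool can still saturate every core even with the -1.
Here the parallelism is forced (raised) through configuration, but because the common ForkJoinPool is shared, other users of the pool may interfere.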

    private int bingxingstream(int taskCount, int threadCount) {
        // Set the parallelism of the common ForkJoinPool
        System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", String.valueOf(threadCount));
        // Counter for the total number of operations
        AtomicInteger atomicInteger = new AtomicInteger();
        // With the common pool's parallelism set, calling parallel() is enough to submit the tasks
        IntStream.rangeClosed(1, taskCount).parallel().forEach(i -> increment(atomicInteger));
        // Read the current counter value
        return atomicInteger.get();
    }
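
To inspect the default described above on your own machine, something like the following sketch can be used. `ForkJoinPool.getCommonPoolParallelism()` is a real JDK method; the helper `expectedDefaultParallelism` is an assumption that mirrors the "cores - 1, but at least 1" rule and may not match every JDK version or a JVM where the system property has already been set.

```java
import java.util.concurrent.ForkJoinPool;

class CommonPoolInfo {
    // Assumed default rule: one less than the core count, never below 1
    static int expectedDefaultParallelism(int cores) {
        return Math.max(1, cores - 1);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores: " + cores);
        System.out.println("expected default: " + expectedDefaultParallelism(cores));
        System.out.println("actual: " + ForkJoinPool.getCommonPoolParallelism());
    }
}
```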

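The fifth approach uses CompletableFuture. CompletableFuture.runAsync accepts a thread pool as its second argument, which is the usual way to choose where the work runs when using CompletableFuture: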

    private int completableFuture(int taskCount, int threadCount) throws ExecutionException, InterruptedException {
        // Counter for the total number of operations
        AtomicInteger atomicInteger = new AtomicInteger();
        // Create a custom ForkJoinPool with parallelism = threadCount
        ForkJoinPool forkJoinPool = new ForkJoinPool(threadCount);
        // Use CompletableFuture.runAsync to run the task asynchronously on the given pool, and block until it completes
        CompletableFuture.runAsync(() -> IntStream.rangeClosed(1, taskCount).parallel().forEach(i -> increment(atomicInteger)), forkJoinPool).get();
        // Release the pool's threads once the work is done
        forkJoinPool.shutdown();
        // Read the current counter value
        return atomicInteger.get();
    }
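
A variation that is also seen with CompletableFuture is to create one future per task and wait for all of them with `CompletableFuture.allOf`. The sketch below swaps in a fixed thread pool and a trivial counting task for brevity; it is not the method used above, and `AllOfExample` is a made-up name.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

class AllOfExample {
    static int run(int taskCount, int threadCount) {
        AtomicInteger counter = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            // One future per task, all executed on the same pool
            CompletableFuture<?>[] futures = IntStream.rangeClosed(1, taskCount)
                    .mapToObj(i -> CompletableFuture.runAsync(counter::incrementAndGet, pool))
                    .toArray(CompletableFuture[]::new);
            // Block until every future has completed
            CompletableFuture.allOf(futures).join();
        } finally {
            pool.shutdown();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(100, 4));
    }
}
```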

All five approaches above achieve a similar result.
[Image: run results of the five approaches]

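These results only show that the parallelism setting takes effect; they are not a performance comparison. If your program is especially performance-sensitive, run performance tests and pick the model that fits your scenario. In general, using a thread pool (the second approach) and using a parallel stream directly (the fourth approach) are the most common in business code. Note, however, that thread pools are normally reused, so real business logic does not declare a pool on the spot and shut it down once the operation completes, as the demo does.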
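**Note that in the examples above, the parallel stream method (bingxingstream) must run before the forkjoin method, otherwise the change to the common ForkJoinPool's default parallelism does not take effect.** Why? The ForkJoinPool class initializes the common pool in a static initializer, which runs when the class is loaded. If the forkjoin method touches ForkJoinPool first, setting the system property later in the stream method has no effect. The common pool's default parallelism should therefore be set at application startup (in the Application class).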
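
A minimal sketch of the startup-time configuration described above follows. The class name `StartupConfig` and the helper `configureCommonPool` are assumptions for illustration; the property key is the one used earlier in this article. The same setting can also be passed on the command line as `-Djava.util.concurrent.ForkJoinPool.common.parallelism=20`, which avoids ordering problems altogether.

```java
import java.util.concurrent.ForkJoinPool;

class StartupConfig {
    static final String PARALLELISM_KEY =
            "java.util.concurrent.ForkJoinPool.common.parallelism";

    // Call this before anything touches ForkJoinPool, parallel streams,
    // or CompletableFuture; afterwards the property is no longer read.
    static void configureCommonPool(int parallelism) {
        System.setProperty(PARALLELISM_KEY, String.valueOf(parallelism));
    }

    public static void main(String[] args) {
        configureCommonPool(20);
        // First use of ForkJoinPool happens only after the property is set
        System.out.println("common pool parallelism: "
                + ForkJoinPool.getCommonPoolParallelism());
    }
}
```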

This weekend (20200823) I will put together a summary of thread pools. Stay tuned.