Java 8: Stream Collectors

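This article introduces the collectors of the Java Stream API. It covers commonly used ones such as joining, groupingBy, and partitioningBy, and gives a concrete example of implementing a custom collector.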


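Foreword: this article is about stream collectors. As the earlier articles showed, collecting the data in a stream only takes a call to the collect method. Here I introduce the commonly used collectors and then show how to write a custom one.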

1 Common Stream Collectors

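This section goes over the commonly used collectors. It only covers how to use them; how a collector is actually implemented is explained in section 2.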

1.1 joining

joining is a collector that concatenates strings. It has three overloads.

// method definition
public static Collector<CharSequence, ?, String> joining();
public static Collector<CharSequence, ?, String> joining(CharSequence delimiter);
public static Collector<CharSequence, ?, String> joining(CharSequence delimiter,
                                                         CharSequence prefix,
                                                         CharSequence suffix);
// examples
String res1 = Stream.of("1", "2", "3", "4").collect(joining());             //1234
String res2 = Stream.of("1", "2", "3", "4").collect(joining(","));          //1,2,3,4
String res3 = Stream.of("1", "2", "3", "4").collect(joining(",", "{", "}"));//{1,2,3,4}
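
Because joining only accepts CharSequence elements, a stream of other types has to be mapped to strings first. The following is a minimal sketch of that step; the class name JoiningDemo and the helper format are made up for illustration and are not part of the original examples.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

public class JoiningDemo {
    // Hypothetical helper: renders a list of integers as a bracketed, comma-separated string.
    public static String format(List<Integer> numbers) {
        return numbers.stream()
                .map(String::valueOf) // joining() needs CharSequence elements
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(format(Arrays.asList(1, 2, 3)));            // [1, 2, 3]
        System.out.println(format(Collections.<Integer>emptyList())); // []
    }
}
```

Note that with the three-argument overload an empty stream still produces the prefix and suffix.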

1.2 groupingBy

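As the name suggests, groupingBy groups the elements of a stream. It also supports multi-level grouping, where each further level groups the results of the previous level.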

// entity
public class Person {
    private String name;
    private int age;
    private double height;
    // constructor and getters (getName, getAge, getHeight) omitted for brevity
}
// examples (list is a List<Person>; groupingBy is statically imported from Collectors)
// e1: group by age
Map<Integer, List<Person>> groupByAge = list.stream()
        .collect(groupingBy(Person::getAge));
// e2: group by age, then by name
Map<Integer, Map<String, List<Person>>> groupByAgeAndName = list.stream()
        .collect(groupingBy(Person::getAge, groupingBy(Person::getName)));
// e3: group by age, then by name, then by height
Map<Integer, Map<String, Map<Double, List<Person>>>> groupByAgeAndNameAndHeight = list.stream()
        .collect(groupingBy(Person::getAge, groupingBy(Person::getName, groupingBy(Person::getHeight))));
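
The second argument of groupingBy does not have to be another groupingBy; any collector can be applied to each group. Below is a small self-contained sketch that uses Collectors.counting and Collectors.mapping as downstream collectors. The Person class here is a stand-in for the entity above, and its constructor, getters, and the sample data are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingDemo {
    // Stand-in for the article's Person entity; constructor and getters are assumed.
    public static class Person {
        private final String name;
        private final int age;
        private final double height;

        public Person(String name, int age, double height) {
            this.name = name;
            this.age = age;
            this.height = height;
        }

        public String getName() { return name; }
        public int getAge() { return age; }
        public double getHeight() { return height; }
    }

    // Made-up sample data.
    public static List<Person> sample() {
        return Arrays.asList(
                new Person("Ann", 20, 1.65),
                new Person("Bob", 20, 1.80),
                new Person("Cid", 31, 1.75));
    }

    // Reduces every age group to the number of people in it.
    public static Map<Integer, Long> countByAge(List<Person> people) {
        return people.stream()
                .collect(Collectors.groupingBy(Person::getAge, Collectors.counting()));
    }

    // Maps every person to a name before collecting each age group into a list.
    public static Map<Integer, List<String>> namesByAge(List<Person> people) {
        return people.stream()
                .collect(Collectors.groupingBy(Person::getAge,
                        Collectors.mapping(Person::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(countByAge(sample()).get(20)); // 2
        System.out.println(namesByAge(sample()).get(20)); // [Ann, Bob]
    }
}
```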

1.3 partitioningBy

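Partitioning is a special case of grouping in which the classification function is a predicate (a function that returns a boolean). The resulting Map therefore has only two keys: true and false.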

// is age greater than 20
Map<Boolean, List<Person>> isGT20 = list.stream()
        .collect(partitioningBy(e -> e.getAge() > 20));
// is age greater than 20, then group by age
Map<Boolean, Map<Integer, List<Person>>> isGT20AndGroupByAge = list.stream()
        .collect(partitioningBy(e -> e.getAge() > 20, groupingBy(Person::getAge)));
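
One practical difference from groupingBy is that the map returned by partitioningBy contains both keys even when one side is empty. The sketch below illustrates this with plain integers instead of Person objects; the class name PartitionDemo and the helper splitByThreshold are made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PartitionDemo {
    // Hypothetical helper: splits ages into those above the threshold (true) and the rest (false).
    public static Map<Boolean, List<Integer>> splitByThreshold(List<Integer> ages, int threshold) {
        return ages.stream()
                .collect(Collectors.partitioningBy(age -> age > threshold));
    }

    public static void main(String[] args) {
        List<Integer> ages = Arrays.asList(18, 25, 30);

        Map<Boolean, List<Integer>> byTwenty = splitByThreshold(ages, 20);
        System.out.println(byTwenty.get(true));  // [25, 30]
        System.out.println(byTwenty.get(false)); // [18]

        // Nobody is older than 100, yet the true key is still present with an empty list.
        Map<Boolean, List<Integer>> byHundred = splitByThreshold(ages, 100);
        System.out.println(byHundred.get(true)); // []
    }
}
```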

2 Custom Stream Collector

First, let's look at the definition of the collect method:

<R, A> R collect(Collector<? super T, A, R> collector);

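The collect method accepts an object that implements the Collector interface. The collectors we used earlier, such as toList, groupingBy, and partitioningBy, are all created by factory methods in the Collectors class. So to create our own collector, we only need to implement the Collector interface. Here is its definition, with a comment explaining each abstract method: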

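public interface Collector<T, A, R> {
    // Creates a new result container. The processed stream elements are accumulated into it
    Supplier<A> supplier();

    // Adds one element to the result container
    BiConsumer<A, T> accumulator();

    // Merges two result containers; called when the stream is processed in parallel (e.g. parallelStream)
    BinaryOperator<A> combiner();

    // Applies the final transformation to the result container. It is the last function
    // called in the accumulation process and converts the container (A) into the result (R)
    Function<A, R> finisher();

    // Returns an immutable set of Characteristics that describes the collector's behavior,
    // in particular whether the stream can be reduced in parallel and which optimizations are allowed
    Set<Characteristics> characteristics();
}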
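
To make the roles of these functions more concrete, here is a rough sketch of how they fit together. It is not the actual Stream implementation, only a hand-written approximation: collectSequentially mimics a sequential collect, and collectInTwoHalves mimics one possible parallel split into two partial containers. Both method names and the class name are made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class CollectorStepsDemo {
    // Sequential case: one container, every element accumulated into it, then finished.
    public static <T, A, R> R collectSequentially(List<T> items, Collector<? super T, A, R> collector) {
        A container = collector.supplier().get();
        for (T item : items) {
            collector.accumulator().accept(container, item);
        }
        return collector.finisher().apply(container);
    }

    // Simplified parallel case: two containers filled independently, merged by the combiner.
    public static <T, A, R> R collectInTwoHalves(List<T> items, Collector<? super T, A, R> collector) {
        int mid = items.size() / 2;
        A left = collector.supplier().get();
        A right = collector.supplier().get();
        for (T item : items.subList(0, mid)) {
            collector.accumulator().accept(left, item);
        }
        for (T item : items.subList(mid, items.size())) {
            collector.accumulator().accept(right, item);
        }
        A merged = collector.combiner().apply(left, right);
        return collector.finisher().apply(merged);
    }

    public static void main(String[] args) {
        List<String> digits = Arrays.asList("1", "2", "3", "4");
        System.out.println(collectSequentially(digits, Collectors.joining(","))); // 1,2,3,4
        System.out.println(collectInTwoHalves(digits, Collectors.joining(",")));  // 1,2,3,4
    }
}
```

Both paths give the same result as long as the combiner merges the partial containers consistently with the accumulator, which is the property a custom collector has to preserve.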

2.1 Example

Suppose we want to group a collection of Person objects by age. The implementation looks like this:

// define a custom collector
// (assumes static imports of Collectors.toMap and Collector.Characteristics.*)
public class MyGrouping implements Collector<Person, Map<Integer, ArrayList<Person>>, Map<Integer, ArrayList<Person>>> {
    @Override
    public Supplier<Map<Integer, ArrayList<Person>>> supplier() {
        return HashMap::new;
    }

    @Override
    public BiConsumer<Map<Integer, ArrayList<Person>>, Person> accumulator() {
        // look up the list for this age, creating it on first use, then add the person
        return (map, p) -> map.computeIfAbsent(p.getAge(), age -> new ArrayList<>()).add(p);
    }

    @Override
    public BinaryOperator<Map<Integer, ArrayList<Person>>> combiner() {
        return (m1, m2) -> Stream.of(m1, m2)
                .map(Map::entrySet)
                .flatMap(Collection::stream)
                .collect(toMap(Map.Entry::getKey, Map.Entry::getValue, (e1, e2) -> {
                    e1.addAll(e2);
                    return e1;
                }));
    }

    @Override
    public Function<Map<Integer, ArrayList<Person>>, Map<Integer, ArrayList<Person>>> finisher() {
        return Function.identity();
    }

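    @Override
    public Set<Characteristics> characteristics() {
        // IDENTITY_FINISH only: CONCURRENT would allow several threads to accumulate into
        // the same container, and HashMap and ArrayList are not thread-safe
        return Collections.unmodifiableSet(EnumSet.of(IDENTITY_FINISH));
    }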
}

// how to use
Map<Integer, ArrayList<Person>> customGroupByAge = list.stream().collect(new MyGrouping());
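
For a collector this small, implementing the whole interface is not strictly necessary. As an alternative sketch, the same grouping can be built with the Collector.of factory method, which takes the supplier, accumulator, and combiner as lambdas and reports IDENTITY_FINISH on its own when no finisher is given. The class CollectorOfDemo, the method groupingByAge, and the reduced Person class below are assumptions made for the example, not code from the original article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;

public class CollectorOfDemo {
    // Reduced stand-in for the article's Person entity; constructor and getters are assumed.
    public static class Person {
        private final String name;
        private final int age;

        public Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        public String getName() { return name; }
        public int getAge() { return age; }
    }

    // Groups people by age; the container type doubles as the result type.
    public static Collector<Person, ?, Map<Integer, List<Person>>> groupingByAge() {
        return Collector.<Person, Map<Integer, List<Person>>>of(
                HashMap::new,
                (map, p) -> map.computeIfAbsent(p.getAge(), age -> new ArrayList<>()).add(p),
                (m1, m2) -> {
                    // merge the right-hand partial result into the left-hand one
                    m2.forEach((age, people) -> m1.merge(age, people, (a, b) -> {
                        a.addAll(b);
                        return a;
                    }));
                    return m1;
                });
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Ann", 20), new Person("Bob", 20), new Person("Cid", 31));

        Map<Integer, List<Person>> sequential = people.stream().collect(groupingByAge());
        Map<Integer, List<Person>> parallel = people.parallelStream().collect(groupingByAge());

        System.out.println(sequential.get(20).size()); // 2
        System.out.println(parallel.get(31).size());   // 1
    }
}
```

Running the collector on a parallel stream as well is a cheap way to exercise the combiner, which a sequential stream never calls.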

3 Summary

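collect is a terminal operation that takes a collector object.
Collectors compose efficiently, which makes multi-level grouping, partitioning, and reduction possible.
You can write your own collector by implementing the Collector interface.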