Classic Companion Reading: Java 8 in Action, Stream Basics

Original article, last modified 2022-07-01 11:54:16

License: CC 4.0 BY-SA

First published 2022-06-16 16:45:17

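This article looks at the Java 8 Stream API, treating a stream as an advanced iterator, and shows how to filter, sort, map, and reduce data with it. Examples compare SQL queries with their Stream equivalents and explain what a stream is made of: a data source, a chain of intermediate operations, and a terminal operation. It also covers numeric streams, creating streams from arrays and files, and building finite and infinite streams. The advanced features of Stream are left for a later article.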

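The Classic Companion Reading series is meant to be neither reading notes nor a book review. It combines my own understanding with how the material is used in real projects, so that you can get through this book in 3 to 4 days.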

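Prerequisites: method references and other Lambda concepts appear throughout this article, especially function descriptors and how they are used. If these are unfamiliar, please read "Classic Companion Reading: Java 8 in Action, Lambda" first.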

4. Introducing Streams

What Is a Stream

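Put simply, a stream (Stream) is an advanced iterator: it encapsulates the iteration details and supports parallel execution. (It is no relation of the I/O streams such as InputStream and OutputStream!)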

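Problem: a restaurant has just finished renovating and needs a menu to show its customers. The database has a table dish, mapped to the entity Dish. Query the names of all dishes under 400 calories, sorted by calories in ascending order.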

Approach 1: query the database directly with SQL

		SELECT name FROM dish WHERE calories < 400 ORDER BY calories

Approach 2: load all dishes from the database, then process the data in Java.
(1) Prepare the entity class and test data

    /**
     * A dish on the menu.
     */
    public static class Dish {
        private int id;
        private String name; // dish name
        private boolean vegetarian; // whether the dish is vegetarian
        private int calories; // calories
        private int type; // dish type, one of the Dish.Type constants

        public Dish(int id, String name, boolean vegetarian, int calories, int type) {
            this.id = id;
            this.name = name;
            this.vegetarian = vegetarian;
            this.calories = calories;
            this.type = type;
        }

        // Getters (getId, getName, isVegetarian, getCalories, getType) omitted for brevity.

        public static class Type {
            public static final int FISH = 1;
            public static final int MEAT = 2;
            public static final int OTHER = 3;
        }
    }

    public static List<Dish> findAllDishes() { // dish test data
        return Arrays.asList(
                new Dish(1, "pork", false, 800, Dish.Type.MEAT),
                new Dish(2, "beef", false, 700, Dish.Type.MEAT),
                new Dish(3, "chicken", false, 400, Dish.Type.MEAT),
                new Dish(4, "french fries", true, 530, Dish.Type.OTHER),
                new Dish(5, "rice", true, 800, Dish.Type.OTHER),
                new Dish(6, "season fruit", true, 120, Dish.Type.OTHER),
                new Dish(7, "pizza", true, 550, Dish.Type.OTHER),
                new Dish(8, "prawns", false, 800, Dish.Type.FISH),
                new Dish(9, "salmon", false, 800, Dish.Type.FISH));
    }

(2) Using traditional iteration

        // fetch all dishes
        List<Dish> dishes = findAllDishes();
        // keep the dishes under 400 calories
        List<Dish> filterDishes = new ArrayList<>();
        for (Dish dish : dishes) {
            if (dish.getCalories() < 400) {
                filterDishes.add(dish);
            }
        }
        // sort by calories (on a diet, go for the low-calorie dishes)
        Collections.sort(filterDishes, new Comparator<Dish>() {
            @Override
            public int compare(Dish o1, Dish o2) {
                return Integer.compare(o1.getCalories(), o2.getCalories());
            }
        });
        // extract the dish names
        List<String> filterDishNames = new ArrayList<>();
        for (Dish dish : filterDishes) {
            filterDishNames.add(dish.getName());
        }
        System.out.println(filterDishNames);

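What SQL does in one statement looks rather verbose with traditional Java iteration, and without the comments it would be even harder to follow. The reason is that SQL only declares what data it wants and leaves the iteration to the MySQL engine, whereas traditional Java iteration makes the developer write every iteration detail in application code (external iteration). Fortunately, Java now has internal iteration of its own, the Stream API, so start using it.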

(3) Using Stream

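        // toList is statically imported from java.util.stream.Collectors
        filterDishNames = dishes.stream()
                .filter(d -> d.getCalories() < 400) // corresponds to WHERE
                .sorted(Comparator.comparingInt(Dish::getCalories)) // corresponds to ORDER BY
                .map(Dish::getName) // corresponds to SELECT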
                .collect(toList());
        System.out.println(filterDishNames);

Don't worry about the details yet; first get a feel for the convenience Stream brings.

Anatomy of a Stream

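Stream = data source + chain of intermediate operations + terminal operation
(1) As the code above shows, a stream first needs a data source. A collection such as dishes is the source, and calling its stream method returns a Stream object over it. (Compare it to a database data source.)
(2) Methods such as filter, sorted, and map, which return a Stream, are collectively called intermediate operations and can be chained. They act like functions registered on the stream in advance and do not run immediately.
(3) Methods such as collect, which return void or the final result of the stream (anything other than a Stream), are called terminal operations. Only when one is called does the iteration start, and the intermediate operations run as part of that iteration.
(4) In addition, a Stream is a one-way, single-pass iterator: its elements can be consumed only once. For example: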

        List<Integer> ints = Arrays.asList(1, 2, 3, 4, 5);
        Stream<Integer> s = ints.stream();
        s.forEach(System.out::println);
        s.forEach(System.out::println); // throws IllegalStateException: stream has already been operated upon or closed

Here forEach is a terminal operation that processes each element in turn. Its parameter is a Consumer, whose function descriptor is (T t) -> void.

void forEach(Consumer<? super T> action);
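
To make the laziness of intermediate operations concrete, here is a minimal sketch. The class name LazyStreamDemo, the inspected list, and the buildPipeline method are hypothetical names invented for this illustration; the side effect inside filter is there only to observe when the predicate actually runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class: shows that intermediate operations run only
// once a terminal operation is invoked.
class LazyStreamDemo {

    // Records every element the filter predicate actually inspects.
    static final List<Integer> inspected = new ArrayList<>();

    // Builds the pipeline (source + filter) without running it.
    static Stream<Integer> buildPipeline(List<Integer> source) {
        return source.stream()
                .filter(n -> {
                    inspected.add(n); // side effect used only to observe execution
                    return n % 2 == 0;
                });
    }

    public static void main(String[] args) {
        Stream<Integer> pipeline = buildPipeline(Arrays.asList(1, 2, 3, 4, 5));
        // Nothing has been filtered yet: the list of inspected elements is still empty.
        System.out.println("before terminal operation: " + inspected);

        // collect is the terminal operation that triggers the iteration.
        List<Integer> evens = pipeline.collect(Collectors.toList());
        System.out.println("after terminal operation: " + inspected);
        System.out.println("result: " + evens);
    }
}
```

Side effects inside filter are generally discouraged in real code; they are used here only because they make the execution order visible.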

5. Working with Streams

As before, learning streams by analogy with SQL is an efficient approach.

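Filtering with filter

Problem: we are building a small food-ordering app that shows the dish list in pages of 5 items and loads the next page on pull-up. When the user selects "vegetarian only", the list is queried with that condition.

1. Using SQL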

SELECT * FROM dish WHERE vegetarian=1 LIMIT 0,5;

2. Using Stream

        List<Dish> dishes = findAllDishes();
        List<Dish> filterDishes = dishes.stream()
                .filter(Dish::isVegetarian)
                .skip(0).limit(5)
                .collect(toList());
        System.out.println("page 1: " + filterDishes);

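(1) filter is an intermediate operation used for selecting elements, analogous to WHERE in SQL. Its parameter is a Predicate, whose function descriptor is (T t) -> boolean.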

Stream<T> filter(Predicate<? super T> predicate);

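(2) skip and limit are both intermediate operations. skip(n) discards the first n elements, and limit(maxSize) truncates the stream after maxSize elements. skip(0).limit(5) is analogous to LIMIT 0, 5 in SQL.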

Stream<T> skip(long n);
Stream<T> limit(long maxSize)

(3) collect(toList()) is a terminal operation. For now, think of it as collecting the elements of the stream into a list and returning it.
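
The skip and limit pair generalizes to any page, not just the first. The following is a small sketch under the assumption that pages are numbered from 1; PagingDemo and page are hypothetical names, not part of the Stream API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: in-memory paging with skip and limit.
class PagingDemo {

    // Returns the given page of items; pageNo is 1-based (an assumption of this sketch).
    static <T> List<T> page(List<T> items, int pageNo, int pageSize) {
        if (pageNo < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNo and pageSize must be positive");
        }
        return items.stream()
                .skip((long) (pageNo - 1) * pageSize) // like the offset in LIMIT offset, size
                .limit(pageSize)                      // like the size in LIMIT offset, size
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12);
        System.out.println(page(ids, 2, 5)); // prints [6, 7, 8, 9, 10]
        System.out.println(page(ids, 3, 5)); // prints [11, 12]
    }
}
```

Paging in memory like this still loads every row first, so for large tables the SQL LIMIT clause remains the better place to page.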

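Mapping with map

Problem: after a day of trial operation, the manager of the newly opened restaurant wants the day's total sales of vegetarian dishes.
Real projects tend to avoid joins, so we first query the ids of the matching dishes, then query the order table with an IN clause on those ids and total the sales. Collecting the list of business ids that meet some condition is the most common use of Stream in real projects.
1. Using SQL: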

SELECT id FROM dish d WHERE d.vegetarian=1

2. Using Stream:

        List<Integer> dishIds = dishes.stream()
                .filter(Dish::isVegetarian)
                .map(Dish::getId)
                .collect(toList());

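map is an intermediate operation that applies a function to each element of the stream and maps it to a new element. It is more powerful than SELECT in SQL because the result can be of any type; for example, the string "2022-6-15" can be mapped to a Date object. Its parameter is a Function, whose function descriptor is (T t) -> R.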

 <R> Stream<R> map(Function<? super T, ? extends R> mapper);
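
As an illustration of mapping to a different type, the sketch below turns date strings such as "2022-6-15" into date objects. It uses java.time.LocalDate rather than java.util.Date, which is a substitution made for this example; MapToDateDemo, FORMAT, and parseDates are hypothetical names.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class: map changes the element type from String to LocalDate.
class MapToDateDemo {

    // Single-letter M and d accept months and days written without a leading zero.
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("uuuu-M-d");

    static List<LocalDate> parseDates(List<String> raw) {
        return raw.stream()
                .map(s -> LocalDate.parse(s, FORMAT)) // Function<String, LocalDate>
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(parseDates(Arrays.asList("2022-6-15", "2022-12-1")));
    }
}
```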

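Flattening with flatMap

Problem: we need to show a TOP 10 ranking of low-calorie dishes, with the dish name, dish type, and calories.
The dish table has the dish name and calories but only the code of the dish type; the type's display text lives in the dish_type table, so in SQL we use a join.
1. Using SQL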

SELECT d.name, dt.name, d.calories 
	FROM dish d, dish_type dt
	WHERE d.type = dt.id 
	ORDER BY d.calories;

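2. Using Stream
As mentioned above, real projects usually work on one table at a time. Here we can load all types from the dish_type table into a cache, then use a stream to combine dishes with their types. First create the entity classes and test data: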

    /**
     * Dish type.
     */
    public static class DishType {
        private int id;
        private String name;

        public DishType(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /**
     * Query result DTO.
     */
    public static class DishDTO {
        private String name;
        private String typeName;
        private int calories;

        public DishDTO(String name, String typeName, int calories) {
            this.name = name;
            this.typeName = typeName;
            this.calories = calories;
        }

        // Getters (getName, getTypeName, getCalories) omitted for brevity.
    }

    public static List<DishType> findAllDishTypes() { // dish type test data
        return Arrays.asList(
                new DishType(1, "fish"),
                new DishType(2, "meat"),
                new DishType(3, "other")
        );
    }

Use Stream to combine the dish list dishes and the type list dishTypes in memory.

        List<DishType> dishTypes = findAllDishTypes();
        List<DishDTO> dishDTOS = dishes.stream()
                .flatMap(d -> dishTypes.stream() // nested stream iteration
                        .filter(t -> d.type == t.id)  // equivalent to WHERE d.type = dt.id
                        .map(t -> new DishDTO(d.name, t.name, d.calories))
                )
                .sorted(Comparator.comparing(DishDTO::getCalories)) // equivalent to ORDER BY
                .collect(toList());
        System.out.println(dishDTOS);

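(1) flatMap is an intermediate operation. When each element of a stream is itself mapped to a stream (nesting), flatMap merges the elements of those inner streams into a single stream. If flatMap in the example were replaced with map, the lambda starting with d -> dishTypes.stream() would produce stream objects rather than the elements inside them. This is like a two-dimensional array made up of several one-dimensional arrays: we want to traverse the elements of all the one-dimensional arrays together, so the two-level structure becomes one level (a demo follows in the numeric streams section). That is presumably where the name "flat" comes from. Its parameter is a Function whose function descriptor is (T t) -> Stream<R>, which also shows that what we are really after is the R elements, as if the mapping were (T t) -> R.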

<R> Stream<R> flatMap(Function<? super T, ? extends Stream<? extends R>> mapper);

(2) sorted is an intermediate operation used for sorting. Its parameter is a Comparator, whose function descriptor is (T o1, T o2) -> int.

Stream<T> sorted(Comparator<? super T> comparator);
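
The difference between map and flatMap is easier to see on a smaller problem. The sketch below, with the hypothetical names FlatMapDemo, distinctLetters, and nestedStreamCount, splits words into letters: map leaves one inner stream per word, while flatMap merges the letters into a single stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class contrasting map and flatMap.
class FlatMapDemo {

    // flatMap: every word becomes a stream of letters, and the streams are merged.
    static List<String> distinctLetters(List<String> words) {
        return words.stream()
                .flatMap(w -> Arrays.stream(w.split("")))
                .distinct()
                .collect(Collectors.toList());
    }

    // map: the result is a Stream<Stream<String>>, so there is one element per word.
    static long nestedStreamCount(List<String> words) {
        return words.stream()
                .map(w -> Arrays.stream(w.split("")))
                .count();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Hello", "World");
        System.out.println(distinctLetters(words));   // prints [H, e, l, o, W, r, d]
        System.out.println(nestedStreamCount(words)); // prints 2
    }
}
```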

Matching and Finding

1. Matching with anyMatch

        boolean b = dishes.stream().anyMatch(Dish::isVegetarian);
        System.out.println("any vegetarian dish: " + b);
        b = dishes.stream().allMatch(Dish::isVegetarian);
        System.out.println("all dishes vegetarian: " + b);
        b = dishes.stream().noneMatch(Dish::isVegetarian);
        System.out.println("no vegetarian dish: " + b);

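anyMatch is a terminal operation that checks whether at least one element of the stream matches the predicate. There are also allMatch and noneMatch. All three take the same parameter, a Predicate, whose function descriptor is (T t) -> boolean.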

boolean anyMatch(Predicate<? super T> predicate);
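
Two properties of the match operations are worth seeing in code: their results on an empty stream, and the fact that they short-circuit, which lets anyMatch finish even on an infinite stream. The following sketch uses hypothetical names (MatchDemo and its methods).

```java
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class for anyMatch, allMatch and noneMatch.
class MatchDemo {

    static boolean isEven(int n) {
        return n % 2 == 0;
    }

    static boolean anyEven(List<Integer> xs) {
        return xs.stream().anyMatch(MatchDemo::isEven);
    }

    static boolean allEven(List<Integer> xs) {
        return xs.stream().allMatch(MatchDemo::isEven);
    }

    static boolean noneEven(List<Integer> xs) {
        return xs.stream().noneMatch(MatchDemo::isEven);
    }

    // anyMatch stops at the first match, so it terminates on this infinite stream
    // (as long as some element above the threshold is eventually reached).
    static boolean existsAbove(int threshold) {
        return Stream.iterate(1, n -> n + 1).anyMatch(n -> n > threshold);
    }

    public static void main(String[] args) {
        List<Integer> empty = Collections.emptyList();
        System.out.println(anyEven(empty));   // prints false
        System.out.println(allEven(empty));   // prints true
        System.out.println(noneEven(empty));  // prints true
        System.out.println(existsAbove(100)); // prints true
    }
}
```

The empty-stream results follow from the definitions: no element matches, so anyMatch is false, while allMatch and noneMatch hold vacuously.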

2. Finding with findAny

        dishes.stream()
                .filter(Dish::isVegetarian)
                .findAny() // findFirst() works here too
                .ifPresent(System.out::println);

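findAny is a terminal operation that returns a single element of the stream, here one that passed the filter. It is similar to findFirst: neither takes arguments and both return an Optional. The difference is that findFirst must return the first element in encounter order, while findAny may return any element. That puts fewer constraints on parallel execution, so findAny suits parallel streams better, and the element it returns is not necessarily the first.
Optional is a container object that may or may not hold a value. If a value is present, ifPresent calls the given function with it. Its parameter is a Consumer, whose function descriptor is (T t) -> void.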

public void ifPresent(Consumer<? super T> consumer)
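
Besides ifPresent, Optional offers map and orElse, which are handy when a default value is needed instead of a side effect. A minimal sketch, with OptionalDemo and firstLongWord as hypothetical names:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: handling the Optional returned by findFirst.
class OptionalDemo {

    // Returns the first word with at least minLength characters in upper case,
    // or the placeholder "(none)" when no word qualifies.
    static String firstLongWord(List<String> words, int minLength) {
        return words.stream()
                .filter(w -> w.length() >= minLength)
                .findFirst()                // Optional<String>
                .map(String::toUpperCase)   // applied only if a value is present
                .orElse("(none)");          // default for the empty case
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("rice", "pizza", "salmon");
        System.out.println(firstLongWord(names, 5));  // prints PIZZA
        System.out.println(firstLongWord(names, 10)); // prints (none)
    }
}
```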

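Reducing with reduce

SQL has common aggregate functions such as sum, avg, max, min, and count, which combine many rows into one and are mostly used for statistics. Stream has corresponding methods.
1. Sum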

        int sum = dishes.stream()
                .map(Dish::getCalories)
                .reduce(0, (temp, n) -> temp + n);
        System.out.println("total calories: " + sum);

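reduce is a terminal operation that combines all elements of a stream into a single value, which is also called reduction. It works much like a CPU accumulator. This form takes two parameters: identity is the initial value of the accumulator, and the BinaryOperator has the function descriptor (T t1, T t2) -> T, where t1 is the running result temp and t2 is the current element n.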

T reduce(T identity, BinaryOperator<T> accumulator);

A static method reference simplifies the code:

reduce(0, Integer::sum);

2. Maximum and minimum

        dishes.stream()
                .map(Dish::getCalories)
                .reduce((temp, n) -> temp > n ? temp : n)
                .ifPresent(n -> System.out.println("max calories: " + n));

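This is the single-argument form of reduce, with no initial value. Its parameter is a BinaryOperator, whose function descriptor is (T t1, T t2) -> T. It returns an Optional, because without an initial value there is no result when the stream is empty.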

Optional<T> reduce(BinaryOperator<T> accumulator);

A static method reference simplifies the code:

reduce(Integer::max) // maximum
reduce(Integer::min) // minimum
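
The two forms of reduce behave differently on an empty stream, which is why their return types differ. The sketch below shows this, together with a count built from map and reduce; ReduceDemo and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical demo class comparing the two forms of reduce.
class ReduceDemo {

    // With an identity: an empty stream simply yields the identity value.
    static int sum(List<Integer> xs) {
        return xs.stream().reduce(0, Integer::sum);
    }

    // Without an identity: an empty stream yields an empty Optional.
    static Optional<Integer> max(List<Integer> xs) {
        return xs.stream().reduce(Integer::max);
    }

    // Counting as a map-reduce: map every element to 1, then add the ones up.
    static int count(List<Integer> xs) {
        return xs.stream().map(x -> 1).reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> calories = Arrays.asList(800, 700, 400);
        List<Integer> none = Collections.emptyList();
        System.out.println(sum(calories));          // prints 1900
        System.out.println(sum(none));              // prints 0
        System.out.println(max(calories).get());    // prints 800
        System.out.println(max(none).isPresent());  // prints false
        System.out.println(count(calories));        // prints 3
    }
}
```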

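3. With reduce understood, let's look at the more concise built-in reductions: sum(), count(), max(), min(), and average(), which correspond one-to-one to the SQL aggregate functions. (average differs from the others in that it is implemented with a collect operation, covered in the next article.)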

        int sum = dishes.stream().mapToInt(Dish::getCalories).sum();
        System.out.println("sum:" + sum);
        long count = dishes.stream().count();
        System.out.println("count:" + count);
        dishes.stream().mapToInt(Dish::getCalories).max()
                .ifPresent(m -> System.out.println("max:" + m));
        dishes.stream().mapToInt(Dish::getCalories).min()
                .ifPresent(m -> System.out.println("min:" + m));
        dishes.stream().mapToInt(Dish::getCalories).average()
                .ifPresent(m -> System.out.println("avg:" + m));

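mapToInt maps each element to an int and returns an IntStream; think of it as map plus a conversion to the primitive type. Its parameter is a ToIntFunction, whose function descriptor is (T t) -> int.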

IntStream mapToInt(ToIntFunction<? super T> mapper);
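
Each built-in reduction above creates and consumes its own stream. When several aggregates are needed at once, IntStream.summaryStatistics() computes them in a single pass. The sketch below uses hypothetical names (StatsDemo, stats) and made-up calorie values.

```java
import java.util.IntSummaryStatistics;
import java.util.stream.IntStream;

// Hypothetical demo class: several aggregates from one pass over an IntStream.
class StatsDemo {

    static IntSummaryStatistics stats(int... calories) {
        return IntStream.of(calories).summaryStatistics();
    }

    public static void main(String[] args) {
        IntSummaryStatistics s = stats(120, 400, 530);
        System.out.println("sum:   " + s.getSum());     // prints sum:   1050
        System.out.println("count: " + s.getCount());   // prints count: 3
        System.out.println("min:   " + s.getMin());     // prints min:   120
        System.out.println("max:   " + s.getMax());     // prints max:   530
        System.out.println("avg:   " + s.getAverage()); // prints avg:   350.0
    }
}
```

With the Dish list from this article, the same statistics would come from dishes.stream().mapToInt(Dish::getCalories).summaryStatistics().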

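Numeric Streams

Streams of primitive types are numeric streams: IntStream for int, LongStream for long, and DoubleStream for double. Use them for numeric computation, as with the sum() method of IntStream above.

1. Converting between numeric streams and object streams
int and Integer convert to each other through autoboxing and unboxing, but converting between int[] and Integer[], or between int[] and List<Integer>, is not so easy.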

        int[] intArr;
        Integer[] integerArr;
        List<Integer> integerList;

        intArr = new int[]{1, 2, 3, 4, 5};
        integerArr = Arrays.stream(intArr).boxed().toArray(Integer[]::new);
        System.out.println("int[] to Integer[]: " + Arrays.toString(integerArr));

        integerArr = new Integer[]{1, 2, 3, 4, 5};
        intArr = Arrays.stream(integerArr).mapToInt(d -> d).toArray();
        System.out.println("Integer[] to int[]: " + Arrays.toString(intArr));

        intArr = new int[]{1, 2, 3, 4, 5};
        integerList = Arrays.stream(intArr).boxed().collect(toList());
        System.out.println("int[] to List<Integer>: " + integerList);

        integerList = Arrays.asList(1, 2, 3, 4, 5);
        int[] arr = integerList.stream().mapToInt(d -> d).toArray();
        System.out.println("List<Integer> to int[]: " + Arrays.toString(arr));

(1) Arrays.stream turns an array into a stream, with an overload for each kind of array.

public static IntStream stream(int[] array) 
public static DoubleStream stream(double[] array) 
public static LongStream stream(long[] array)
public static <T> Stream<T> stream(T[] array)

(2) Each numeric stream has a boxed method that turns it into an object stream, much like boxing. (mapToObj works too.)

Stream<Integer> boxed() //IntStream
Stream<Long> boxed() //LongStream
Stream<Double> boxed() //DoubleStream

(3) Stream provides numeric mapping methods that turn an object stream back into a numeric stream, much like unboxing.

IntStream mapToInt(ToIntFunction<? super T> mapper)
LongStream mapToLong(ToLongFunction<? super T> mapper)
DoubleStream mapToDouble(ToDoubleFunction<? super T> mapper)

(4) Each numeric stream has a toArray method that turns it back into an array.

int[] toArray() //IntStream
long[] toArray() //LongStream
double[] toArray() //DoubleStream

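(5) Stream also provides a toArray method with a parameter, which turns an object stream back into an array. Interestingly, the parameter is an IntFunction rather than a Supplier, with the function descriptor (int a) -> R. Here we pass a constructor reference for Integer[], so what is the argument a? The parameter documentation has the answer: it is the size of the array.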

<A> A[] toArray(IntFunction<A[]> generator)
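
Writing the generator as an explicit lambda makes the role of the int argument visible: the stream passes in the required array size, and the generator returns a new array of that size. ToArrayDemo and its method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: two equivalent ways to pass the array generator.
class ToArrayDemo {

    // Explicit lambda: the stream supplies the required size.
    static String[] withLambda(List<String> xs) {
        return xs.stream().toArray(size -> new String[size]);
    }

    // Array constructor reference: shorthand for the lambda above.
    static String[] withConstructorRef(List<String> xs) {
        return xs.stream().toArray(String[]::new);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("beef", "rice", "pizza");
        System.out.println(Arrays.toString(withLambda(names)));         // prints [beef, rice, pizza]
        System.out.println(Arrays.toString(withConstructorRef(names))); // prints [beef, rice, pizza]
        // The no-argument toArray() is declared to return Object[] instead.
        Object[] untyped = names.stream().toArray();
        System.out.println(untyped.length);                             // prints 3
    }
}
```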

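2. With numeric streams in hand, look back at flatMap; it should be clearer now. The example below flattens the one-dimensional arrays inside a two-dimensional array into a single one-dimensional array.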

        int[][] arrs = {{1}, {1, 2}, {1, 2, 3}};
        int[] arr = Arrays.stream(arrs)
                .flatMapToInt(a -> Arrays.stream(a))
                .toArray();
        System.out.println(Arrays.toString(arr));

Building Streams

1. Streams can be created from values, arrays, and files

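        // create a stream from values
        Stream<Integer> s = Stream.of(1, 2, 2, 3);
        s = Stream.empty(); // empty stream

        // create a stream from an array
        IntStream is = Arrays.stream(new int[]{1, 2, 3, 4});

        // create a stream from a file
        long letterCount = 0; // number of non-space characters in the file
        try (Stream<String> lines = Files.lines(Paths.get("datas.txt"), StandardCharsets.UTF_8)) {
            letterCount = lines.flatMap(line ->
                    Arrays.stream(line.replace(" ", "").split("")))
                    .filter(ch -> !ch.isEmpty()) // a blank line splits into one empty string
                    .count();
            System.out.println("total characters: " + letterCount);
        } catch (IOException e) {
            e.printStackTrace();
        }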

2. Streams can also be created from functions

        // finite streams: generate from a numeric range
        is = IntStream.range(1, 5); // end exclusive: 1,2,3,4
        is = IntStream.rangeClosed(1, 5); // end inclusive: 1,2,3,4,5
        // infinite streams: use limit to bound the number of elements
        s = Stream.iterate(0, n -> n + 2).limit(5); // 0,2,4,6,8
        Random rand = new Random();
        Stream.generate(() -> rand.nextInt(10)).limit(5);

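(1) Stream.iterate passes the previous element to the iteration function to produce the current one. It takes two parameters, the initial value (seed) and the iteration function, a UnaryOperator whose function descriptor is (T t) -> T.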

public static<T> Stream<T> iterate(final T seed, final UnaryOperator<T> f)

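(2) Stream.generate is much simpler: it does not depend on the previous element, and each element comes from a generator function of your own. Its parameter is a Supplier, which takes no arguments and has the function descriptor () -> T.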

public static<T> Stream<T> generate(Supplier<T> s)
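
With an infinite stream, where limit sits in the pipeline changes the result. The sketch below (InfiniteStreamDemo and its method names are hypothetical) compares filtering before and after the limit.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class: the position of limit matters.
class InfiniteStreamDemo {

    // filter first, then limit: keep generating until five even numbers are found.
    static List<Integer> firstFiveEvens() {
        return Stream.iterate(1, n -> n + 1)
                .filter(n -> n % 2 == 0)
                .limit(5)
                .collect(Collectors.toList());
    }

    // limit first, then filter: only the first five numbers are ever generated.
    static List<Integer> evensAmongFirstFive() {
        return Stream.iterate(1, n -> n + 1)
                .limit(5)
                .filter(n -> n % 2 == 0)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstFiveEvens());      // prints [2, 4, 6, 8, 10]
        System.out.println(evensAmongFirstFive()); // prints [2, 4]
    }
}
```

Leaving out limit entirely, or placing it after a filter that stops matching, would make the terminal operation run forever.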

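(3) Generating the Fibonacci sequence
Whichever method is used, the key to generating the Fibonacci sequence is remembering the two values that precede the current element. Stream.iterate supports this naturally; with Stream.generate you have to find a way yourself. A lambda expression is just a function and has no fields of its own in which to keep the intermediate values between calls, but an anonymous inner class can keep them in member variables.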

        // element 0 of the array is the current value, element 1 is the next value (intermediate state)
        Stream.iterate(new int[]{0, 1}, t -> new int[]{t[1], t[0] + t[1]})
                .limit(10)
                .map(t -> t[0]) // drop the intermediate state
                .forEach(System.out::println);
                
        // a () -> int lambda cannot keep state between calls, so use an anonymous inner class
        IntStream.generate(new IntSupplier() {
            private int first = 0; // intermediate state
            private int second = 1; // intermediate state
            @Override
            public int getAsInt() {
                int current = this.first;
                this.first = this.second;
                this.second = current + this.second;
                return current;
            }
        }).limit(10).forEach(System.out::println);