NIO Basics (1): Buffer

NIO and Buffer in Detail


License: CC 4.0 BY-SA

Tags: #NIO


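This article introduces Java NIO (New IO, often read as Non-blocking IO), presents Buffer as a wrapper around a block of memory, and walks through the behavior and source code of its core methods such as flip() and clear(). An example shows how a buffer switches between write mode and read mode. The article then covers the difference between direct and non-direct buffers, and the features and usage of ByteBuffer.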

Introduction to NIO

Java NIO (New IO, also read as Non-blocking IO) was introduced in Java 1.4. It is a buffer-oriented IO API that supports non-blocking operation.

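About Buffer

A Buffer is essentially a block of memory: we write data into it and later read that data back out. NIO wraps this block in a Buffer object and provides a set of common methods that make reading and writing it convenient.

Buffer lives in the java.nio package. It is declared as an abstract class that implements the methods shared by all buffer types. The overall class diagram is shown below:
[Figure: class diagram of Buffer and its subclasses (ByteBuffer, CharBuffer, ShortBuffer, IntBuffer, LongBuffer, FloatBuffer, DoubleBuffer); image not available]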

A small example

The following example copies the contents of one file into another, using a ByteBuffer:

package NIO;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class ChannelCopy {
    public static final int BSIZE = 1024;

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("arguments: sourcefile destfile");
            System.exit(1);
        }
        // try-with-resources closes both channels (and their streams) even if an exception is thrown
        try (FileChannel in = new FileInputStream(args[0]).getChannel();
             FileChannel out = new FileOutputStream(args[1]).getChannel()) {
            ByteBuffer buffer = ByteBuffer.allocate(BSIZE);
            while (in.read(buffer) != -1) {
                // in.read() has written data into the buffer and advanced its position;
                // flip() resets the cursors so that out.write() can read that data out.
                buffer.flip();
                // write() may consume only part of the buffer, so loop until it is drained
                while (buffer.hasRemaining()) {
                    out.write(buffer);
                }
                // The data is still in the buffer after write(), but position has moved;
                // clear() resets the cursors so the buffer can accept the next in.read().
                buffer.clear();
            }
        }
    }
}

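IntelliJ IDEA users may not know how to pass program arguments (Run > Edit Configurations... > Program arguments):
[Screenshot: IDEA run configuration; image not available]

Run result (there is more output below that is not shown):
[Screenshot: program output; image not available]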

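The example uses three basic buffer methods. Briefly:

allocate(): a static factory method on ByteBuffer that creates a buffer with the given capacity.
flip(): after a channel's read() has put data into the buffer, call flip() to prepare the buffer for the write() that follows.
clear(): after a channel's write() has taken the data out of the buffer, call clear() to prepare the buffer for the next read().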
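The same allocate / flip / clear cycle can be tried without any files or channels. The sketch below is only an illustration: the class name BufferRoundTrip, the method roundTrip and the capacity of 64 are made up for this example, and the input is assumed to fit into 64 bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class BufferRoundTrip {
    // Writes text into a buffer, flips it, and reads the same bytes back out.
    // Assumes the UTF-8 encoding of text is at most 64 bytes long.
    static String roundTrip(String text) {
        byte[] input = text.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(64);   // fixed capacity, chosen arbitrarily
        buffer.put(input);                             // write mode: position advances by input.length
        buffer.flip();                                 // read mode: limit = old position, position = 0
        byte[] output = new byte[buffer.remaining()];  // exactly the bytes that were written
        buffer.get(output);                            // read them back out
        buffer.clear();                                // back to write mode for the next round
        return new String(output, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello NIO"));    // hello NIO
    }
}
```

Here put() plays the role of in.read(buffer) and get() plays the role of out.write(buffer) from the file copy example.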

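So do we really have to call flip() every time data has been read into the buffer?
Yes. This may look clumsy: why doesn't the buffer maintain its internal cursors by itself? My guess is that this keeps the Buffer implementation simple and makes the buffer more flexible to use.
Let's take ByteBuffer as the example and look at how flip() and clear() actually work.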

Source code walkthrough

The following are excerpts from the source of the Buffer class:

package java.nio;

import java.util.Spliterator;

Buffer imports Spliterator, which was added in JDK 1.8. So although Buffer itself dates from JDK 1.4, its source has clearly been modified since then.

public abstract class Buffer {

    /**
     * The characteristics of Spliterators that traverse and split elements
     * maintained in Buffers.
     */
    static final int SPLITERATOR_CHARACTERISTICS =
        Spliterator.SIZED | Spliterator.SUBSIZED | Spliterator.ORDERED;

    // Invariants: mark <= position <= limit <= capacity
    private int mark = -1;
    private int position = 0;
    private int limit;
    private int capacity;

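There are four very important fields here, and the source states the invariant that always holds between them:

mark <= position <= limit <= capacity

capacity: the maximum number of elements the buffer can hold. It is set when the buffer is created and can never be changed (there is no method to modify it).
position: the index of the next element to be written or read. Its initial value is 0.
- In write mode, each value written to the buffer advances position by 1, so it points to the next write location.
- In read mode, each value read from the buffer advances position by 1, so it points to the next read location.
limit: the upper bound for reading or writing, that is, the index of the first element that must not be read or written. When we output data from the buffer, it tells us whether everything has been output.
mark: a saved position. mark() records the current position; reset() restores position to the mark. The initial value -1 means that no mark has been set.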
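To watch these fields change, one option is to print them after each operation. This is a minimal sketch, not part of the JDK: BufferStateTrace and state are hypothetical names, and the capacity of 8 is arbitrary.

```java
import java.nio.ByteBuffer;

class BufferStateTrace {
    // Formats the three cursors that are readable through public methods.
    static String state(ByteBuffer b) {
        return "position=" + b.position() + " limit=" + b.limit() + " capacity=" + b.capacity();
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(8);
        System.out.println("new:         " + state(buffer)); // position=0 limit=8 capacity=8
        buffer.put((byte) 1).put((byte) 2).put((byte) 3);
        System.out.println("after put:   " + state(buffer)); // position=3 limit=8 capacity=8
        buffer.flip();
        System.out.println("after flip:  " + state(buffer)); // position=0 limit=3 capacity=8
        buffer.get();
        System.out.println("after get:   " + state(buffer)); // position=1 limit=3 capacity=8
        buffer.clear();
        System.out.println("after clear: " + state(buffer)); // position=0 limit=8 capacity=8
    }
}
```

At every step position <= limit <= capacity holds, and capacity never changes.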

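Their values change between write mode and read mode as follows (you can think of these fields as pointers; we ignore mark for now, since the example does not use it):

When we create a buffer object with

ByteBuffer buffer = ByteBuffer.allocate(BSIZE);

we get limit = capacity = BSIZE and position = 0. There is of course no element at index BSIZE (the last valid index is BSIZE - 1); picturing limit and capacity as pointing to the same place simply makes the model easier to understand.

position always points to the next location to read or write. The buffer is empty at this point, so position is 0 and writing starts at index 0. Each value written to the buffer advances position by 1.

Every time we finish writing data into the buffer we have to call flip(). So what does flip() do?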

flip()

public final Buffer flip() {
    limit = position;
    position = 0;
    mark = -1;
    return this;
}

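As we can see, limit is set to position, which marks the end of the data we stored. Then position is set to 0, which marks the start of that data. In my example, the buffer is normally filled to capacity on every pass except the last one, which may leave part of it unused.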
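What happens when flip() is forgotten can be checked with a few lines. The sketch below uses made-up names (FlipMatters, firstByteRead) and arbitrary sample values; it only illustrates the cursor logic described above.

```java
import java.nio.ByteBuffer;

class FlipMatters {
    // Writes three bytes, optionally flips, then returns the first byte a reader would get.
    static byte firstByteRead(boolean callFlip) {
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.put(new byte[] {10, 20, 30});   // position = 3, limit = 16
        if (callFlip) {
            buffer.flip();                     // position = 0, limit = 3
        }
        return buffer.get();                   // relative get: reads at the current position
    }

    public static void main(String[] args) {
        System.out.println(firstByteRead(true));   // 10: the first byte that was written
        System.out.println(firstByteRead(false));  // 0: index 3 was never written
    }
}
```

Without flip() the read starts where the write stopped, so the reader sees the untouched (zero-filled) tail of the buffer instead of the data.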

public final int remaining() {
    return limit - position;
}

public final boolean hasRemaining() {
    return position < limit;
}

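With that in mind, these two Buffer methods are easy to understand: remaining() returns how many elements are left between position and limit, and hasRemaining() reports whether any are left.

Now that the buffer holds data, the program can call FileChannel's write(), which reads the buffer's contents and writes them out. The buffer is in read mode at this point.
After flip(), the program can easily find the start and the end of the readable region. Once the buffer has been read completely, position coincides with limit. To get back to the initial state for writing, position has to be set to 0 and limit to capacity. That is exactly what clear() does: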
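A typical use of hasRemaining() is a drain loop around a consumer that may take only part of the buffer per call. The sketch below imitates such a consumer with a hypothetical helper, writeSome, that accepts at most maxPerCall bytes; it is an assumption made for illustration and not a JDK API. maxPerCall is assumed to be greater than 0.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

class DrainLoop {
    // Hypothetical sink: takes at most maxPerCall bytes from src per call,
    // imitating a channel whose write() may be partial.
    static int writeSome(ByteBuffer src, ByteArrayOutputStream sink, int maxPerCall) {
        int n = Math.min(maxPerCall, src.remaining());
        for (int i = 0; i < n; i++) {
            sink.write(src.get());               // relative get advances position
        }
        return n;
    }

    // Keeps calling the sink until position reaches limit.
    static byte[] drain(ByteBuffer src, int maxPerCall) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        while (src.hasRemaining()) {             // position < limit
            writeSome(src, sink, maxPerCall);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(10);
        buffer.put(new byte[] {1, 2, 3, 4, 5, 6, 7});
        buffer.flip();
        System.out.println(buffer.remaining());      // 7
        byte[] out = drain(buffer, 3);               // drained in chunks of 3, 3 and 1
        System.out.println(out.length);              // 7
        System.out.println(buffer.hasRemaining());   // false
    }
}
```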

clear()

public final Buffer clear() {
    position = 0;
    limit = capacity;
    mark = -1;
    return this;
}
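Note that clear() only resets the cursors; it does not erase anything. A small check, with the made-up names ClearKeepsData and firstByteAfterClear:

```java
import java.nio.ByteBuffer;

class ClearKeepsData {
    // Returns the byte found at index 0 after clear() has been called.
    static byte firstByteAfterClear() {
        ByteBuffer buffer = ByteBuffer.allocate(4);
        buffer.put((byte) 42);
        buffer.flip();
        buffer.get();            // consume the byte
        buffer.clear();          // position = 0, limit = capacity; contents untouched
        return buffer.get(0);    // absolute get: does not move position
    }

    public static void main(String[] args) {
        System.out.println(firstByteAfterClear());   // 42
    }
}
```

The old byte stays in place until the next write overwrites it, which is why clear() is cheap.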

Other important methods

rewind()

public final Buffer rewind() {
    position = 0; 
    mark = -1; 
    return this;
}

Supports re-reading in read mode: position goes back to 0 while limit stays unchanged.
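A short sketch of re-reading with rewind(). RewindDemo, readAll and readTwice are names invented for this example; ByteBuffer.wrap() is used only as a shortcut to get a buffer that is already in read mode (position 0, limit equal to the array length).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class RewindDemo {
    // Reads all remaining bytes of the buffer as a UTF-8 string.
    static String readAll(ByteBuffer b) {
        byte[] out = new byte[b.remaining()];
        b.get(out);
        return new String(out, StandardCharsets.UTF_8);
    }

    // Reads the same content twice, with rewind() in between.
    static String readTwice(String text) {
        ByteBuffer buffer = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
        String first = readAll(buffer);    // position now equals limit
        buffer.rewind();                   // position = 0, limit unchanged
        String second = readAll(buffer);
        return first + "|" + second;
    }

    public static void main(String[] args) {
        System.out.println(readTwice("abc"));   // abc|abc
    }
}
```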

mark()

public final Buffer mark() {
    mark = position;
    return this;
}

Saves the current position in mark.

reset()

public final Buffer reset() {
    int m = mark;
    if (m < 0)
        throw new InvalidMarkException();
    position = m;
    return this;
}

Must be called after mark(); it sets position back to mark. If no mark has been set, it throws InvalidMarkException.
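mark() and reset() are easiest to understand together. The following sketch uses arbitrary sample bytes; MarkResetDemo and its two methods are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.InvalidMarkException;

class MarkResetDemo {
    // Reads two bytes, marks, reads two more, resets, and returns the next byte read.
    static byte nextAfterReset() {
        ByteBuffer buffer = ByteBuffer.wrap(new byte[] {10, 11, 12, 13, 14});
        buffer.get();            // 10, position = 1
        buffer.get();            // 11, position = 2
        buffer.mark();           // mark = 2
        buffer.get();            // 12, position = 3
        buffer.get();            // 13, position = 4
        buffer.reset();          // position = mark = 2
        return buffer.get();     // 12 again
    }

    // reset() without a previous mark() is rejected.
    static boolean resetWithoutMarkFails() {
        ByteBuffer buffer = ByteBuffer.allocate(4);
        try {
            buffer.reset();      // mark is still -1
            return false;
        } catch (InvalidMarkException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(nextAfterReset());          // 12
        System.out.println(resetWithoutMarkFails());   // true
    }
}
```

Since flip(), clear() and rewind() all set mark back to -1, a mark does not survive a mode switch.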

Direct and non-direct buffers

 // Used only by direct buffers
    // NOTE: hoisted here for speed in JNI GetDirectBufferAddress
    long address;

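The Buffer class has a field named address. According to the comment it is used only by direct buffers, and it is declared ("hoisted") in Buffer itself to speed up JNI's GetDirectBufferAddress.
Direct buffer: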

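Its memory is allocated outside the JVM heap and is not managed by the GC. (The Java object that represents the direct buffer is managed by the GC, however, so when that object is collected, the direct buffer's memory is released as well.)
Low-level system IO on a direct buffer is more efficient, because the JVM does not have to copy the buffer's contents into an intermediate temporary buffer first.
Because a direct buffer is not allocated on the JVM heap, its effect on the application's memory footprint is less visible. (The memory is still used, but memory that the JVM does not manage is hard for it to account for, so a leak there is difficult to track down.)
Allocating and releasing a direct buffer is relatively expensive. The right way to use one is therefore to allocate it once at initialization, keep reusing it, and release it only when the program ends.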

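Non-direct buffer:

Its memory is allocated directly on the JVM heap; it is essentially a wrapper around a byte[] array.
Because a non-direct buffer lives on the JVM heap, low-level operating system IO first copies its contents into an intermediate temporary buffer. A non-direct buffer is therefore less efficient for such IO.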

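Note on Java memory allocation:

On-heap: managed by the JVM and used to store objects.
Off-heap: handled by the operating system. Not suited to storing complex objects. Allocation is more expensive, but IO reads and writes perform better.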
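In code, the two kinds of buffer differ only in the factory method: ByteBuffer.allocate() returns a heap buffer and ByteBuffer.allocateDirect() returns a direct one. The sketch below also follows the reuse advice from above. DirectVsHeap, describe and roundTrip are made-up names, and the sizes are arbitrary.

```java
import java.nio.ByteBuffer;

class DirectVsHeap {
    // Reports whether a buffer is direct and whether it exposes a backing array.
    static String describe(ByteBuffer b) {
        return "direct=" + b.isDirect() + " hasArray=" + b.hasArray();
    }

    // Reuses one buffer for a write/read round trip instead of allocating a new one.
    static int roundTrip(ByteBuffer reusable, int value) {
        reusable.clear();
        reusable.putInt(value);
        reusable.flip();
        return reusable.getInt();
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(1024);
        ByteBuffer direct = ByteBuffer.allocateDirect(1024);   // allocate once
        System.out.println(describe(heap));     // direct=false hasArray=true
        System.out.println(describe(direct));   // direct=true (hasArray is typically false)
        for (int round = 0; round < 3; round++) {
            System.out.println(roundTrip(direct, round));      // 0, 1, 2: same buffer each time
        }
    }
}
```

The read and write methods are the same for both kinds, so code that only uses the ByteBuffer API does not need to know which one it was given.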

Going further

ByteBuffer

ByteBuffer is the most important subclass of Buffer. If you want to know more about ByteBuffer, read on.

public abstract class ByteBuffer
    extends Buffer
    implements Comparable<ByteBuffer>
{
    final byte[] hb;                  // Non-null only for heap buffers
    final int offset;
    boolean isReadOnly;

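ByteBuffer is an abstract class; in the example it is instantiated through the allocate method. The source shows that a heap ByteBuffer is backed by a byte array (hb is non-null only for heap buffers). offset is the index in that array at which the buffer's data begins, so that operations can work on a specific region of the array.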

public static ByteBuffer allocate(int capacity) {
    if (capacity < 0)
        throw new IllegalArgumentException();
    return new HeapByteBuffer(capacity, capacity);
}

It returns a HeapByteBuffer.

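class HeapByteBuffer
    extends ByteBuffer
{

    // In the JDK source these two declarations are commented out: the fields
    // are actually declared in ByteBuffer and repeated here only as documentation
    /*
    protected final byte[] hb;
    protected final int offset;
    */

    HeapByteBuffer(int cap, int lim) {            // package-private

        super(-1, 0, lim, cap, new byte[cap], 0);

    }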

new HeapByteBuffer(capacity, capacity) creates a backing array of size capacity and sets limit to capacity.

public abstract ByteBuffer put(byte b);
public abstract byte get(int index);

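put writes a byte into the ByteBuffer.
get(int) returns the byte at the given index.
get(byte[] dst) copies readable bytes from the buffer into the array dst (it fills dst completely, so at least dst.length bytes must remain).
[Figure: list of the typed put and get methods of ByteBuffer; image not available]
Typed put and typed get. Underneath they still operate on bytes. Note that values must be read back with getXX in the same order in which they were written with putXX.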
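For example, an int, a long and a char can be written one after another and read back in the same order. A minimal sketch: TypedAccess and writeThenRead are hypothetical names and the values are arbitrary.

```java
import java.nio.ByteBuffer;

class TypedAccess {
    // Writes three typed values, then reads them back in the same order.
    static String writeThenRead() {
        ByteBuffer buffer = ByteBuffer.allocate(32);
        buffer.putInt(7);                    // 4 bytes
        buffer.putLong(9L);                  // 8 bytes
        buffer.putChar('x');                 // 2 bytes
        int written = buffer.position();     // 4 + 8 + 2 = 14
        buffer.flip();
        int i = buffer.getInt();             // must mirror the order of the puts
        long l = buffer.getLong();
        char c = buffer.getChar();
        return written + ":" + i + "," + l + "," + c;
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead());   // 14:7,9,x
    }
}
```

Reading in a different order, for instance getLong() first, would still succeed but would combine bytes that belong to different values, because the buffer stores no type information.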

import java.nio.ByteBuffer;

public class BufferTest {
    public static void main(String[] args) {
        ByteBuffer byteBuffer = ByteBuffer.allocate(10);
        for (int i = 0; i < byteBuffer.capacity(); i++) {
            byteBuffer.put((byte) i);
        }
        // Set position and limit so that the slice covers indices 2..5
        byteBuffer.position(2);
        byteBuffer.limit(6);

        ByteBuffer sliceBuffer = byteBuffer.slice();
        // Reset the original buffer: position = 0, limit = capacity
        byteBuffer.clear();
        // Double every byte through the slice; the change is visible in the original
        for (int i = 0; i < sliceBuffer.capacity(); i++) {
            sliceBuffer.put(i, (byte) (sliceBuffer.get(i) * 2));
        }
        while (byteBuffer.hasRemaining()) {
            System.out.print(byteBuffer.get() + "  ");
        }
    }
}

This example shows how slice() behaves. Output:

0  1  4  6  8  10  6  7  8  9

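slice() returns the part of the buffer in [position, limit). It behaves like a shallow copy: the two buffers share the same array, but their indices (position, limit, mark) are independent of each other.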
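The independence of the indices can be checked directly. In this sketch (SliceIndices and describe are made-up names) a relative put on the slice moves only the slice's position, while the written byte shows up in the original buffer:

```java
import java.nio.ByteBuffer;

class SliceIndices {
    static String describe() {
        ByteBuffer original = ByteBuffer.allocate(10);
        original.position(2);
        original.limit(6);
        ByteBuffer slice = original.slice();   // covers original indices 2..5
        slice.put((byte) 99);                  // slice index 0 is original index 2
        return "slice: pos=" + slice.position() + " cap=" + slice.capacity()
             + "; original: pos=" + original.position() + " byte2=" + original.get(2);
    }

    public static void main(String[] args) {
        // slice: pos=1 cap=4; original: pos=2 byte2=99
        System.out.println(describe());
    }
}
```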

Read-only buffers

ByteBuffer onlyReadBuffer = ByteBuffer.allocate(10).asReadOnlyBuffer();
System.out.println(onlyReadBuffer.getClass());
// class java.nio.HeapByteBufferR

A read-only buffer can be recognized by the R at the end of its class name. Such a buffer cannot be written to.

public ByteBuffer put(byte x) {
    throw new ReadOnlyBufferException();
}

Any attempt to write to a read-only buffer throws an exception immediately.
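A small sketch of this behavior, with the hypothetical names ReadOnlyDemo and putIsRejected. It also shows that the read-only buffer is a view of the original: a change made through the writable buffer is visible through the read-only one.

```java
import java.nio.ByteBuffer;
import java.nio.ReadOnlyBufferException;

class ReadOnlyDemo {
    // Returns true if writing to the given buffer is rejected.
    static boolean putIsRejected(ByteBuffer buffer) {
        try {
            buffer.put((byte) 1);
            return false;
        } catch (ReadOnlyBufferException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ByteBuffer writable = ByteBuffer.allocate(4);
        ByteBuffer readOnly = writable.asReadOnlyBuffer();
        System.out.println(readOnly.isReadOnly());     // true
        System.out.println(putIsRejected(readOnly));   // true
        writable.put(0, (byte) 5);                     // change through the original
        System.out.println(readOnly.get(0));           // 5: both buffers share the content
    }
}
```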