Netty Source Code: ByteBuf (4.1.44)

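Netty moved away from Java NIO's ByteBuffer because it is limited in functionality and awkward to use. Instead, Netty provides its own ByteBuffer-like data container, ByteBuf, with a number of useful features:

- Capacity grows on demand.
- Reads and writes use separate indexes, so you can switch between them freely without calling flip().
- Zero-copy through the built-in composite buffer type.
- Reference counting.
- Pooling.
- A rich set of types, covering both heap memory and direct (off-heap) memory.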
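To see what the flip() point is about, here is a minimal JDK-only sketch of the read-after-write dance that java.nio.ByteBuffer requires. The class and method names (FlipDemo, writeThenRead, readableAfterFlip) are made up for illustration; only the ByteBuffer calls are real API.

```java
import java.nio.ByteBuffer;

final class FlipDemo {
    /** Writes three bytes, then reads the first one back the java.nio way. */
    static byte writeThenRead() {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.put((byte) 1).put((byte) 2).put((byte) 3); // position = 3, limit = 8
        buf.flip();                                    // position = 0, limit = 3
        return buf.get();                              // reads the byte written first
    }

    /** Number of bytes available for reading once flip() has been called. */
    static int readableAfterFlip() {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.put((byte) 1).put((byte) 2).put((byte) 3);
        buf.flip();
        return buf.remaining();
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead());
        System.out.println(readableAfterFlip());
    }
}
```

A single position/limit pair serves both reading and writing, which is why the mode switch is needed. ByteBuf avoids it by keeping one index per direction.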

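Inheritance diagram

[Figure: ByteBuf API class hierarchy]

The classes boxed in the diagram are the ByteBuf types most commonly used in Netty. Their names indicate what they do.

ByteBuf implements two interfaces, ReferenceCounted and Comparable, which give it reference counting and the ability to compare two ByteBuf instances.
ByteBuf is an abstract class. It could just as well have been an interface, but according to the Netty project an abstract class performs slightly better than an interface here. Almost all of its methods are abstract; together they form the ByteBuf API, which is covered in detail below.
AbstractByteBuf is an important abstract class: the skeleton for implementing a buffer. It implements most of ByteBuf's abstract methods and encapsulates the logic common to all ByteBuf operations, such as validating an index before reading data. It defines two key index variables that make ByteBuf easier to use:
- readerIndex
- writerIndex
AbstractReferenceCountedByteBuf is another important abstract class. It implements the reference-counting logic and maintains a volatile int refCnt field internally, but every operation on refCnt goes through a ReferenceCountUpdater instance. AbstractReferenceCountedByteBuf has a large number of subclasses. The important ones are:
- PooledByteBuf: abstract class, the skeleton for pooled ByteBuf types. It defines many pooling-related classes and fields, which are covered in detail later.
- UnpooledHeapByteBuf: concrete class, an unpooled heap ByteBuf. It stores data in a byte[] array and operates on that array through the static methods of HeapByteBufUtil. For performance, multi-byte values are assembled with bit shifts, which is worth studying.
- CompositeByteBuf: concrete class, a composable ByteBuf. It wraps each ByteBuf in the inner class Component and stores the Component objects in a Component[] array, implementing the composite pattern.
- UnpooledDirectByteBuf: concrete class, an unpooled direct (off-heap) ByteBuf. It holds a reference to a java.nio.ByteBuffer.
- FixedCompositeByteBuf: concrete class, a fixed composite ByteBuf. It wraps an array of ByteBuf objects in read-only mode.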
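The bit-shift handling mentioned for UnpooledHeapByteBuf can be sketched roughly as follows. This is an illustration in the style of HeapByteBufUtil, not a copy of it: the class name ShiftCodec is made up, and only the general technique (assembling an int from four bytes with shifts and masks) is taken from the text.

```java
final class ShiftCodec {
    /** Reads a big-endian int: the byte at index is the most significant. */
    static int getInt(byte[] memory, int index) {
        return (memory[index] & 0xff) << 24
             | (memory[index + 1] & 0xff) << 16
             | (memory[index + 2] & 0xff) << 8
             | (memory[index + 3] & 0xff);
    }

    /** Reads a little-endian int: the byte at index is the least significant. */
    static int getIntLE(byte[] memory, int index) {
        return (memory[index] & 0xff)
             | (memory[index + 1] & 0xff) << 8
             | (memory[index + 2] & 0xff) << 16
             | (memory[index + 3] & 0xff) << 24;
    }

    /** Writes a big-endian int, one byte per unsigned shift. */
    static void setInt(byte[] memory, int index, int value) {
        memory[index]     = (byte) (value >>> 24);
        memory[index + 1] = (byte) (value >>> 16);
        memory[index + 2] = (byte) (value >>> 8);
        memory[index + 3] = (byte) value;
    }

    public static void main(String[] args) {
        byte[] memory = new byte[4];
        setInt(memory, 0, 0x12345678);
        System.out.println(Integer.toHexString(getInt(memory, 0)));
        System.out.println(Integer.toHexString(getIntLE(memory, 0)));
    }
}
```

The `& 0xff` mask matters because Java bytes are signed. Without it, a byte such as 0x80 would sign-extend and corrupt the higher bits of the result.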

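The above is only a brief sketch of the ByteBuf hierarchy. It deliberately skips most details so that the reader first gets the big picture of ByteBuf's design.
Remember: outline first, details later.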

ReferenceCounted

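This interface defines the reference-counting operations. The API is simple enough to understand at a glance. The methods are generally implemented in the abstract class io.netty.util.AbstractReferenceCounted. How exactly they are implemented is discussed further below.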

ByteBuf

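ByteBuf is an extremely important abstract class with a central place in the Netty ecosystem. Netty is a high-performance network framework for receiving and sending data, and the container that holds this data is ByteBuf, so it is a key part of how Netty achieves its performance. java.nio.ByteBuffer could be used directly, but programming against it is comparatively complex and its feature set is weak, whereas ByteBuf offers a rich API that is simple to use.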

From the Javadoc

A random and sequential accessible sequence of zero or more bytes (octets). This interface provides an abstract view for one or more primitive byte arrays (byte[]) and NIO buffers.

Creating a buffer

Prefer an allocator (or the helper methods in Unpooled) over calling an individual implementation's constructor to create a buffer.

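Random access indexing

Just like an ordinary array, a ByteBuf supports random access. The index of the first byte is 0 and the index of the last byte is capacity - 1.

Sequential access indexing

This capability is built on two indexes, readerIndex and writerIndex. They split the underlying bytes into three regions (capacity is the size of the buffer): discardable bytes in [0, readerIndex), readable bytes in [readerIndex, writerIndex), and writable bytes in [writerIndex, capacity).

Reading data

Any operation whose name starts with read or skip gets or skips the data at the current readerIndex and increases readerIndex by the number of bytes read. If the argument of the read operation is also a ByteBuf and no destination index is specified, that buffer's writerIndex is increased as well.
If there is not enough readable content left, IndexOutOfBoundsException is thrown.
The readerIndex of a newly allocated, wrapped or copied buffer is 0.

Writing data

Any operation whose name starts with write writes data at the current writerIndex and increases writerIndex by the number of bytes written. If the argument of the write operation is also a ByteBuf and no source index is specified, that buffer's readerIndex is increased as well.
If there are not enough writable bytes left, IndexOutOfBoundsException is thrown.
The writerIndex of a newly allocated buffer is 0.
The writerIndex of a wrapped or copied buffer is the capacity of that buffer.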

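Discarding data

Bytes that have already been read can be discarded.
Calling discardReadBytes() reclaims that region as new writable space.

Note

After discardReadBytes() there is no guarantee about the content of the writable bytes. In most cases the writable bytes are not moved, and they may even be filled with completely different data, depending on the underlying buffer implementation.

Clearing the buffer indexes

clear() sets both readerIndex and writerIndex to 0. It does not erase the buffer content; it only resets the two indexes.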
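The read, write, discard and clear rules above can be modeled with a toy buffer that carries two indexes over a fixed byte array. This is a sketch for intuition only: TwoIndexBuffer is a hypothetical class, it does not grow, and it is not how AbstractByteBuf is actually written.

```java
final class TwoIndexBuffer {
    private final byte[] data;
    private int readerIndex; // start of the readable region
    private int writerIndex; // start of the writable region

    TwoIndexBuffer(int capacity) {
        data = new byte[capacity];
    }

    int readerIndex()   { return readerIndex; }
    int writerIndex()   { return writerIndex; }
    int readableBytes() { return writerIndex - readerIndex; }
    int writableBytes() { return data.length - writerIndex; }

    /** write*: stores at writerIndex, then advances it. */
    TwoIndexBuffer writeByte(int value) {
        if (writableBytes() < 1) {
            throw new IndexOutOfBoundsException("no writable bytes left");
        }
        data[writerIndex++] = (byte) value;
        return this;
    }

    /** read*: loads from readerIndex, then advances it. */
    byte readByte() {
        if (readableBytes() < 1) {
            throw new IndexOutOfBoundsException("no readable bytes left");
        }
        return data[readerIndex++];
    }

    /** Moves the unread bytes to the front so the read region becomes writable again. */
    TwoIndexBuffer discardReadBytes() {
        System.arraycopy(data, readerIndex, data, 0, readableBytes());
        writerIndex -= readerIndex;
        readerIndex = 0;
        return this;
    }

    /** Resets both indexes; the bytes themselves are left untouched. */
    TwoIndexBuffer clear() {
        readerIndex = writerIndex = 0;
        return this;
    }
}
```

For example, after writing three bytes into a 4-byte buffer and reading one, readableBytes() is 2 and writableBytes() is 1; discardReadBytes() then raises writableBytes() to 2 at the cost of one array copy.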


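Search operations

For simple single-byte searches:
- indexOf(int, int, byte)
- bytesBefore(int, int, byte)
Especially useful for NUL-terminated strings:
- bytesBefore(byte)
For more complicated searches:
- forEachByte(int, int, ByteProcessor)

Mark and reset

Every buffer has two marker indexes: one stores readerIndex and the other stores writerIndex.
You can always reposition either index by calling the corresponding reset method.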

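Derived buffers

The following methods create a view of the current buffer:
- duplicate()
- slice()
- slice(int, int)
- readSlice(int)
- retainedDuplicate()
- retainedSlice()
- retainedSlice(int, int)
- readRetainedSlice(int)
A derived buffer has its own independent readerIndex, writerIndex and marker indexes, but it shares the underlying data with the original, just like an NIO buffer does.
If you need a completely separate buffer, call copy() instead.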
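Since the text compares derived buffers to NIO buffers, the sharing behavior can be tried out with the JDK alone. The sketch below uses java.nio.ByteBuffer.duplicate() as a stand-in for ByteBuf.duplicate(), and a manual copy as a stand-in for copy(). SharedViewDemo and its method names are made up for illustration.

```java
import java.nio.ByteBuffer;

final class SharedViewDemo {
    /** A write through the view is visible through the original. */
    static boolean duplicateSharesContent() {
        ByteBuffer original = ByteBuffer.allocate(4);
        ByteBuffer view = original.duplicate();
        view.put(0, (byte) 42);            // absolute put through the view
        return original.get(0) == 42;
    }

    /** Moving the view's position leaves the original's position alone. */
    static boolean indexesAreIndependent() {
        ByteBuffer original = ByteBuffer.allocate(4);
        ByteBuffer view = original.duplicate();
        view.position(2);
        return original.position() == 0;
    }

    /** A real copy has its own storage, so later writes do not leak back. */
    static boolean copyIsIndependent() {
        ByteBuffer original = ByteBuffer.allocate(4);
        ByteBuffer copy = ByteBuffer.allocate(4);
        copy.put(original.duplicate());    // bulk-copies the four bytes
        copy.put(0, (byte) 7);
        return original.get(0) == 0;
    }

    public static void main(String[] args) {
        System.out.println(duplicateSharesContent());
        System.out.println(indexesAreIndependent());
        System.out.println(copyIsIndependent());
    }
}
```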

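Non-retained and retained derived buffers

Buffers are reference counted. Methods such as duplicate(), slice(), slice(int, int) and readSlice(int) do not call retain() on the returned derived buffer, so the reference count is not increased (retain adds 1 to the reference count so that the buffer is not released). If you need to keep the buffer alive, consider retainedDuplicate(), retainedSlice(), retainedSlice(int, int) and readRetainedSlice(int), which may return a buffer implementation that produces less garbage.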

Byte array

If a buffer is backed by a byte array, you can access it directly through array().
hasArray() tells you whether the buffer has a backing array.

NIO buffers

If a ByteBuf can be converted into an NIO ByteBuffer that shares its content, you can obtain that ByteBuffer with nioBuffer().
Use nioBufferCount() to determine whether the buffer can be converted into an NIO ByteBuffer.

I/O streams

ByteBufInputStream
ByteBufOutputStream

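That is a rough tour of the ByteBuf API and, frankly, there is a lot of it. Its design philosophy differs from that of java.nio.Buffer.

Netty's ByteBuf gathers all operations into a single API, for example getBoolean(), getByte(), getLong() and so on.
java.nio.Buffer, by contrast, decouples them across several classes. For the byte type, java.nio.ByteBuffer extends Buffer and then declares the byte-related abstract methods for subclasses to implement. Which approach is better is a matter of taste. Java uses sound design patterns to separate the responsibilities of each Buffer type, at the cost of a large number of classes. Netty is more compact and uses ByteBuf to handle the other data types (int, long and so on) as well, presumably because they are all composed of bytes as the smallest unit. That is probably why the API was designed this way.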


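Notes on selected ByteBuf APIs

// Discards all read bytes immediately (requires a copy: unread content is moved to the front).
// The discard is unconditional, however few bytes it reclaims.
public abstract ByteBuf discardReadBytes();

// Checks whether readerIndex has reached half of capacity
// and only performs the discard if it has.
// Somewhat smarter than discardReadBytes().
public abstract ByteBuf discardSomeReadBytes();

// Ensures that minWritableBytes bytes are writable.
// If capacity is insufficient, the buffer is expanded.
// Growth policy: below 4 MiB the capacity doubles starting from 64 bytes;
// above 4 MiB it grows in 4 MiB steps, never beyond maxCapacity.
public abstract ByteBuf ensureWritable(int minWritableBytes);

// Returns an int status code:
// 0: the buffer has enough writable bytes; capacity is unchanged
// 1: the buffer does not have enough writable bytes; capacity is unchanged
// 2: the buffer has enough writable bytes; capacity was increased
// 3: the buffer does not have enough writable bytes, but capacity was increased to its maximum
public abstract int ensureWritable(int minWritableBytes, boolean force);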

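/**
 * With set/get you still treat the underlying data as an array of bytes;
 * the index advances by the width of the primitive type being accessed.
 * set/get never change readerIndex or writerIndex;
 * think of them as reading or modifying the bytes at a given position in place.
 * getInt reads in the buffer's byte order (big-endian by default), getIntLE in little-endian;
 * which one to use depends on the data format at hand.
 */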
public abstract int   getInt(int index);
public abstract int   getIntLE(int index);

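/**
 * Transfers this buffer's data to the given destination, starting at the given absolute index,
 * until the destination becomes non-writable.
 * getBytes(int, ByteBuf, int, int) can do the same job. The difference is that
 * this method increases the destination's writerIndex,
 * whereas getBytes(int, ByteBuf, int, int) does not.
 *
 * Effect on readerIndex / writerIndex:
 *   source:      neither is modified
 *   destination: writerIndex increases
 *
 * @param index  the absolute start index
 * @param dst    the destination buffer
 * @return       this (source) buffer
 */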
public abstract ByteBuf getBytes(int index, ByteBuf dst);

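/**
 * Transfers length bytes of this buffer's data to the given destination,
 * starting at the given absolute index.
 * getBytes(int, ByteBuf, int, int) can do the same job.
 *
 * Effect on readerIndex / writerIndex:
 *   source:      neither is modified
 *   destination: writerIndex increases
 * @param index  the absolute start index
 * @param dst    the destination buffer
 * @param length the number of bytes to copy
 * @return       this (source) buffer
 */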
public abstract ByteBuf getBytes(int index, ByteBuf dst, int length);

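/**
 * Copies data into the destination array.
 *
 * Effect on readerIndex / writerIndex:
 *   source:      neither is modified
 *   destination: n/a
 */
public abstract ByteBuf getBytes(int index, byte[] dst);

/**
 * Copies length bytes into the destination array, starting at dstIndex.
 *
 * Effect on readerIndex / writerIndex:
 *   source:      neither is modified
 *   destination: n/a
 */
public abstract ByteBuf getBytes(int index, byte[] dst, int dstIndex, int length);

/**
 * Copies data into the destination NIO buffer until its position reaches its limit.
 *
 * Effect on readerIndex / writerIndex:
 *   source:      neither is modified
 *   destination: position advances (a java.nio.ByteBuffer has no writerIndex)
 */
public abstract ByteBuf getBytes(int index, ByteBuffer dst);

/**
 * Copies data into the destination buffer.
 * The ByteBuf-targeted overloads above ultimately delegate to this method for the actual copy.
 *
 * Effect on readerIndex / writerIndex:
 *   source:      neither is modified
 *   destination: neither is modified
 */
public abstract ByteBuf getBytes(int index, ByteBuf dst, int dstIndex, int length);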

/**
 * Copies data into the destination stream.
 *
 * Effect on readerIndex / writerIndex:
 *   source:      neither is modified
 *   destination: n/a
 */
public abstract ByteBuf getBytes(int index, OutputStream out, int length) throws IOException;

/**
 * Copies data into the given channel.
 *
 * Effect on readerIndex / writerIndex:
 *   source:      neither is modified
 *   destination: n/a
 */
public abstract int getBytes(int index, GatheringByteChannel out, int length) throws IOException;

/**
 * Copies data into the given file channel without modifying the channel's position.
 *
 * Effect on readerIndex / writerIndex:
 *   source:      neither is modified
 *   destination: n/a
 */
public abstract int getBytes(int index, FileChannel out, long position, int length) throws IOException;

/**
 * Copies the readable bytes of src (writerIndex - readerIndex) into this buffer.
 * The other overloads that take a ByteBuf follow similar logic,
 * except setBytes(int index, ByteBuf src, int srcIndex, int length),
 * which modifies neither index of either buffer.
 *
 * Effect on readerIndex / writerIndex:
 *    src: readerIndex increases
 *   this: neither is modified
 */
public abstract ByteBuf setBytes(int index, ByteBuf src);
public abstract ByteBuf setBytes(int index, ByteBuf src, int length);
public abstract ByteBuf setBytes(int index, ByteBuf src, int srcIndex, int length);
public abstract ByteBuf setBytes(int index, byte[] src);
public abstract ByteBuf setBytes(int index, byte[] src, int srcIndex, int length);
public abstract ByteBuf setBytes(int index, ByteBuffer src);
public abstract int setBytes(int index, InputStream in, int length) throws IOException;
public abstract int setBytes(int index, ScatteringByteChannel in, int length) throws IOException;
public abstract int setBytes(int index, FileChannel in, long position, int length) throws IOException;
// Fills the range with NUL (0x00)
public abstract ByteBuf setZero(int index, int length);

/**
 * The read operations follow.
 * readerIndex advances by the width of the type being read,
 * e.g. readByte() advances readerIndex by 1 and readShort() by 2.
 */
public abstract byte  readByte();
public abstract short readShort();
public abstract short readShortLE();
public abstract int   readUnsignedShort();
public abstract int   readUnsignedShortLE();
public abstract int   readMedium();

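/**
 * Transfers this buffer's data to a newly created buffer, starting at the current readerIndex,
 * and increases readerIndex by the number of transferred bytes (length).
 * The returned buffer's readerIndex and writerIndex are 0 and length respectively.
 *
 * @return the newly created ByteBuf
 */
public abstract ByteBuf readBytes(int length);

/**
 * Returns a new ByteBuf that wraps a reference to the source buffer.
 * It is only a view: it shares the content but keeps its own indexes,
 * which start at readerIndex = 0 and writerIndex = length.
 * This buffer's readerIndex increases by length.
 * Note that this method does not call retain(), so the reference count is not increased.
 * @return the newly created ByteBuf view
 */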
public abstract ByteBuf readSlice(int length);
public abstract ByteBuf readRetainedSlice(int length);


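/**
 * Reads data into dst until dst becomes non-writable
 * (or, for the overload with length, exactly length bytes).
 *
 * Effect on readerIndex / writerIndex:
 *    dst: writerIndex increases
 *   this: readerIndex increases
 * @return this buffer
 */
public abstract ByteBuf readBytes(ByteBuf dst);
public abstract ByteBuf readBytes(ByteBuf dst, int length);

/**
 * Reads length bytes into dst, starting at dstIndex.
 *
 * Effect on readerIndex / writerIndex:
 *    dst: neither is modified
 *   this: readerIndex increases by length
 * @return this buffer
 */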
public abstract ByteBuf readBytes(ByteBuf dst, int dstIndex, int length);

public abstract CharSequence readCharSequence(int length, Charset charset);
public abstract int readBytes(FileChannel out, long position, int length) throws IOException;
public abstract ByteBuf skipBytes(int length);


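/**
 * Writes one byte at the position writerIndex points to.
 * If capacity is insufficient, the buffer tries to expand.
 *
 * Effect on readerIndex / writerIndex:
 *   this: writerIndex + 1
 * @return this buffer
 */
public abstract ByteBuf writeByte(int value);

/**
 * Writes the given source data starting at writerIndex.
 * If capacity is insufficient, the buffer tries to expand.
 *
 * Effect on readerIndex / writerIndex:
 *    src: readerIndex increases (only for the ByteBuf overloads without srcIndex)
 *   this: writerIndex increases by the number of bytes written
 * @return this buffer
 */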
public abstract ByteBuf writeBytes(ByteBuf src);
public abstract ByteBuf writeBytes(ByteBuf src, int length);
public abstract ByteBuf writeBytes(ByteBuf src, int srcIndex, int length);
public abstract ByteBuf writeBytes(byte[] src);
public abstract ByteBuf writeBytes(byte[] src, int srcIndex, int length);
public abstract ByteBuf writeBytes(ByteBuffer src);
public abstract int writeBytes(FileChannel in, long position, int length) throws IOException;
public abstract ByteBuf writeZero(int length);
public abstract int writeCharSequence(CharSequence sequence, Charset charset);

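/**
 * Searches for value from fromIndex to toIndex and returns its index.
 * @return the index of the first occurrence, or -1 if not found
 */
public abstract int indexOf(int fromIndex, int toIndex, byte value);

/**
 * Locates the first occurrence of the given value in this buffer.
 * Search range: [readerIndex, writerIndex).
 *
 * @return the number of bytes between readerIndex and the first occurrence, or -1 if not found
 */
public abstract int bytesBefore(byte value);

/**
 * Search range: [readerIndex, readerIndex + length)
 *
 * @return -1 if not found
 *
 * @throws IndexOutOfBoundsException
 */
public abstract int bytesBefore(int length, byte value);

/**
 * Search range: [index, index + length)
 *
 * @return -1 if not found
 *
 * @throws IndexOutOfBoundsException
 */
public abstract int bytesBefore(int index, int length, byte value);

/**
 * Iterates over the readable bytes of this buffer in ascending order with the given processor.
 *
 * @return -1 if the processor iterated to the end of the range;
 *         otherwise the last visited index, i.e. where ByteProcessor.process(byte) returned false
 */
public abstract int forEachByte(ByteProcessor processor);

/**
 * Iteration range: index, index + 1, ..., index + length - 1
 */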
public abstract int forEachByte(int index, int length, ByteProcessor processor);
public abstract int forEachByteDesc(ByteProcessor processor);
public abstract int forEachByteDesc(int index, int length, ByteProcessor processor);

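/**
 * Returns a copy of this buffer's readable bytes. The two ByteBufs have independent content.
 * Equivalent to buf.copy(buf.readerIndex(), buf.readableBytes()).
 * Neither index of the source ByteBuf is modified.
 */
public abstract ByteBuf copy();
public abstract ByteBuf copy(int index, int length);

/**
 * Returns a slice of this buffer's readable bytes.
 * Modifying the content of the returned buffer or of this buffer affects the other,
 * while each maintains its own indexes and marks.
 * Equivalent to buf.slice(buf.readerIndex(), buf.readableBytes()).
 * This method does not modify this buffer's readerIndex or writerIndex.
 */
public abstract ByteBuf slice();

/**
 * Behaves like slice().retain()
 */
public abstract ByteBuf retainedSlice();
public abstract ByteBuf slice(int index, int length);
public abstract ByteBuf retainedSlice(int index, int length);

/**
 * Shares content; each buffer maintains its own indexes and marks.
 * The readable content of the new ByteBuf is the same as what slice() returns, but because
 * the whole underlying ByteBuf is shared, all of the underlying content is visible.
 * The reader and writer marks are not duplicated. Note also that this method
 * does not call retain(), so the reference count is not increased.
 */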
public abstract ByteBuf duplicate();
public abstract ByteBuf retainedDuplicate();

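/**
 * Returns the maximum number of NIO ByteBuffers that make up this buffer.
 * Usually 1; for composite buffers it is the sum over the components.
 *
 * @return -1 if this buffer has no underlying ByteBuffer
 * @see #nioBuffers(int, int)
 */
public abstract int nioBufferCount();

/**
 * Exposes this buffer's readable bytes as an NIO ByteBuffer. The content is shared.
 * Same result as buf.nioBuffer(buf.readerIndex(), buf.readableBytes()).
 * Note that if this buffer is dynamic and its capacity is adjusted,
 * the returned NIO buffer will not see those changes.
 */
public abstract ByteBuffer nioBuffer();
public abstract ByteBuffer nioBuffer(int index, int length);

/**
 * Internal use only: exposes the internal NIO buffer.
 */
public abstract ByteBuffer internalNioBuffer(int index, int length);
public abstract ByteBuffer[] nioBuffers();
public abstract ByteBuffer[] nioBuffers(int index, int length);

/**
 * Returns true if this ByteBuf has a backing byte array.
 */
public abstract boolean hasArray();
public abstract byte[] array();

/**
 * Returns the offset of the first byte within this buffer's backing byte array.
 */
public abstract int arrayOffset();

/**
 * Returns true if and only if this buffer has a reference to the low-level
 * memory address that points to its backing data.
 */
public abstract boolean hasMemoryAddress();
public abstract long memoryAddress();

/**
 * Returns true if this ByteBuf is backed by a single memory region.
 * Composite buffers must return false, even if they contain only one ByteBuf.
 */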
public boolean isContiguous() {
    return false;
}

public abstract String toString(Charset charset);

public abstract String toString(int index, int length, Charset charset);

@Override
public abstract int hashCode();

@Override
public abstract boolean equals(Object obj);

@Override
public abstract int compareTo(ByteBuf buffer);

@Override
public abstract String toString();

@Override
public abstract ByteBuf retain(int increment);

@Override
public abstract ByteBuf retain();

@Override
public abstract ByteBuf touch();

@Override
public abstract ByteBuf touch(Object hint);

boolean isAccessible() {
    return refCnt() != 0;
}

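getXX() copies data from this buffer into the destination argument. It may modify the destination's writerIndex.
setXX() copies data from the source argument into this buffer. It may modify the source argument's readerIndex.
readXX() reads data from the buffer and advances this buffer's readerIndex by the width of the type read.
get and set are both relative to this: this.getXX() takes data from this buffer and copies it into the buffer passed as argument, while this.setXX() copies data from the buffer passed as argument into this buffer.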
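As a side note to ensureWritable in the listing above, the capacity growth policy can be sketched as below. This is written from memory of AbstractByteBufAllocator#calculateNewCapacity and may differ in detail from 4.1.44; the class name GrowthPolicy is made up, and the 4 MiB threshold and the starting size of 64 bytes are the assumptions that matter.

```java
final class GrowthPolicy {
    /** Assumed threshold: below it capacity doubles, above it capacity grows linearly. */
    static final int THRESHOLD = 4 * 1024 * 1024;

    static int calculateNewCapacity(int minNewCapacity, int maxCapacity) {
        if (minNewCapacity < 0 || minNewCapacity > maxCapacity) {
            throw new IllegalArgumentException("minNewCapacity: " + minNewCapacity);
        }
        if (minNewCapacity == THRESHOLD) {
            return THRESHOLD;
        }
        if (minNewCapacity > THRESHOLD) {
            // Round down to a multiple of the threshold, then add one more step if it fits.
            int newCapacity = minNewCapacity / THRESHOLD * THRESHOLD;
            if (newCapacity > maxCapacity - THRESHOLD) {
                newCapacity = maxCapacity;
            } else {
                newCapacity += THRESHOLD;
            }
            return newCapacity;
        }
        // Below the threshold: double from 64 until the request fits.
        int newCapacity = 64;
        while (newCapacity < minNewCapacity) {
            newCapacity <<= 1;
        }
        return Math.min(newCapacity, maxCapacity);
    }

    public static void main(String[] args) {
        System.out.println(calculateNewCapacity(100, Integer.MAX_VALUE));
        System.out.println(calculateNewCapacity(THRESHOLD + 1, Integer.MAX_VALUE));
    }
}
```

Doubling keeps small buffers from being reallocated too often, while the linear steps above the threshold avoid wasting large amounts of memory on big buffers.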

AbstractByteBuf

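This is the basic skeleton implementation of ByteBuf. It implements most of the abstract methods of io.netty.buffer.ByteBuf, so a subclass only needs to implement the abstract methods specific to its own storage. The abstract class io.netty.buffer.AbstractByteBuf does the following:

- It defines and maintains five index-related fields: readerIndex, writerIndex, markedReaderIndex, markedWriterIndex and maxCapacity. Maintaining these five fields is its main job, for example checking before a getXX() call that the index is valid.
- It initializes the ResourceLeakDetector<ByteBuf> used for leak detection. The detector records how the various ByteBuf objects in Netty are used and can monitor resource-holding objects whether they are pooled or not, on-heap or off-heap. Four levels are available: DISABLED, SIMPLE, ADVANCED and PARANOID, from lowest to highest. The higher the level, the more ByteBuf objects are monitored and the more information is collected, but the greater the performance impact. As a rule of thumb, use ADVANCED or PARANOID while debugging and SIMPLE in production. The implementation is fairly simple: the ByteBuf is wrapped, the necessary data is recorded when the relevant APIs are called, the data is analyzed to find where the leak occurred, and the user is told through the log what to investigate.

Selected API excerpt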

// io.netty.buffer.AbstractByteBuf
public abstract class AbstractByteBuf extends ByteBuf {
    static final ResourceLeakDetector<ByteBuf> leakDetector =
        ResourceLeakDetectorFactory.instance().newResourceLeakDetector(ByteBuf.class);
    int readerIndex;
    int writerIndex;
    private int markedReaderIndex;
    private int markedWriterIndex;
    private int maxCapacity;
    
    @Override
    public ByteBuf setByte(int index, int value) {
        checkIndex(index);
        _setByte(index, value);
        return this;
    }

    protected abstract void _setByte(int index, int value);

    @Override
    public byte getByte(int index) {
        checkIndex(index);
        return _getByte(index);
    }

    protected abstract byte _getByte(int index);

    @Override
    public byte readByte() {
        checkReadableBytes0(1);
        int i = readerIndex;
        byte b = _getByte(i);
        readerIndex = i + 1;
        return b;
    }
    // ...
}

AbstractReferenceCountedByteBuf

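An abstract class that implements the reference-counting interface. Internally it uses a ReferenceCountUpdater object to increment and decrement the refCnt field; the updater object is the only entry point for manipulating refCnt. The implementation itself is quite short, because every operation is delegated to the ReferenceCountUpdater.
The relevant source code: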

// io.netty.buffer.AbstractReferenceCountedByteBuf
public abstract class AbstractReferenceCountedByteBuf extends AbstractByteBuf {
    // Offset of the refCnt field; used to access the field through Unsafe
    private static final long REFCNT_FIELD_OFFSET =
            ReferenceCountUpdater.getUnsafeOffset(AbstractReferenceCountedByteBuf.class, "refCnt");
    private static final AtomicIntegerFieldUpdater<AbstractReferenceCountedByteBuf> AIF_UPDATER =
            AtomicIntegerFieldUpdater.newUpdater(AbstractReferenceCountedByteBuf.class, "refCnt");
	
    // The core object: the only entry point for manipulating refCnt
    private static final ReferenceCountUpdater<AbstractReferenceCountedByteBuf> updater =
            new ReferenceCountUpdater<AbstractReferenceCountedByteBuf>() {
        @Override
        protected AtomicIntegerFieldUpdater<AbstractReferenceCountedByteBuf> updater() {
            return AIF_UPDATER;
        }
        @Override
        protected long unsafeOffset() {
            return REFCNT_FIELD_OFFSET;
        }
    };

    // Value might not equal "real" reference count, all access should be via the updater
    @SuppressWarnings("unused")
    private volatile int refCnt = updater.initialValue();
    
    // ...
}

ReferenceCountUpdater


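ReferenceCountUpdater performs the reference-counting operations for ByteBuf implementations of the ReferenceCounted interface. Underneath, it relies on java.util.concurrent.atomic.AtomicIntegerFieldUpdater to increment and decrement the value. Why not simply use an AtomicInteger? For performance: a separate AtomicInteger object per buffer would cost more memory than a plain volatile int field (this is the author's own guess).
A few observations about the logic in ReferenceCountUpdater:

- Every newly created ByteBuf starts with refCnt = 2. Each logical increment adds 2 to the raw refCnt, and each logical decrement subtracts 2. As long as references exist, the raw value is therefore even, so (refCnt & 1) == 0 tells you whether the object is still referenced.
- Bit operations are used in several other places as well. There are not many of them, but they show the same attention to performance.
- The logical reference count is obtained with realCount = value >>> 1.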
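The even/odd encoding can be tried out with a small self-contained counter. This is a simplified sketch, not Netty's ReferenceCountUpdater: EvenOddRefCount is a hypothetical class, it supports only steps of one, it skips overflow handling, and it throws IllegalStateException where Netty would throw IllegalReferenceCountException.

```java
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

final class EvenOddRefCount {
    private static final AtomicIntegerFieldUpdater<EvenOddRefCount> UPDATER =
            AtomicIntegerFieldUpdater.newUpdater(EvenOddRefCount.class, "rawCnt");

    // Raw value: even means live (logical count = raw >>> 1), odd means released.
    private volatile int rawCnt = 2;

    static int realRefCnt(int raw) {
        return (raw & 1) != 0 ? 0 : raw >>> 1;
    }

    int refCnt() {
        return realRefCnt(rawCnt);
    }

    /** Logical +1 is raw +2. An odd old value means the object was already released. */
    EvenOddRefCount retain() {
        int old = UPDATER.getAndAdd(this, 2);
        if ((old & 1) != 0) {
            // odd + 2 is still odd, so the released state is preserved
            throw new IllegalStateException("already released");
        }
        return this;
    }

    /** Logical -1 is raw -2. Returns true if this call dropped the last reference. */
    boolean release() {
        for (;;) {
            int raw = rawCnt;
            if ((raw & 1) != 0) {
                throw new IllegalStateException("already released");
            }
            if (raw == 2) {
                // Last reference: flip to an odd value to mark the object as released.
                if (UPDATER.compareAndSet(this, 2, 1)) {
                    return true;
                }
            } else if (UPDATER.compareAndSet(this, raw, raw - 2)) {
                return false;
            }
        }
    }
}
```

One reading of the design is that the odd "released" state lets retain() use a single getAndAdd and detect a dead object afterwards, instead of needing a compare-and-set loop on the hot path.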

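/**
 * Releases the ByteBuf: logical count - 1, raw refCnt - 2.
 *
 * @param instance the object to release
 * @return true if this call released the last reference
 */
public final boolean release(T instance) {

    // #1 Read the raw refCnt of the object with a plain (non-volatile) read via Unsafe
    int rawCnt = nonVolatileRawCnt(instance);

    // #2 rawCnt == 2: this is the last reference, so refCnt is set straight to 1.
    //      tryFinalRelease0 makes a single CAS attempt to set refCnt to 1;
    //      if that fails, retryRelease0 loops in for(;;) until the update succeeds.
    //    rawCnt != 2: this release is not the final one.
    return rawCnt == 2 ? tryFinalRelease0(instance, 2) || retryRelease0(instance, 1)
            : nonFinalRelease0(instance, 1, rawCnt, toLiveRealRefCnt(rawCnt, 1));
}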

// io.netty.util.internal.ReferenceCountUpdater#tryFinalRelease0
private boolean tryFinalRelease0(T instance, int expectRawCnt) {
    // CAS refCnt from the expected value expectRawCnt to 1
    return updater().compareAndSet(instance, expectRawCnt, 1); // any odd number will work
}

// io.netty.util.internal.ReferenceCountUpdater#retryRelease0
/**
 * Retries the release: decreases the logical count of instance by decrement
 * (raw refCnt decreases by 2 * decrement).
 */
private boolean retryRelease0(T instance, int decrement) {
    for (;;) {
        // #1 Read the raw refCnt.
        // toLiveRealRefCnt below throws if it is odd: the ByteBuf is no longer
        // referenced, so there is nothing to release.
        int rawCnt = updater().get(instance);
        
        // #2 Convert to the logical count
        int realCnt = toLiveRealRefCnt(rawCnt, decrement);

        // #3 decrement == realCnt means the ByteBuf must be released, i.e. refCnt = 1
        if (decrement == realCnt) {
            if (tryFinalRelease0(instance, rawCnt)) {
                return true;
            }
        } else if (decrement < realCnt) {
            // decrement is smaller than the logical count: CAS to rawCnt - 2 * decrement
            if (updater().compareAndSet(instance, rawCnt, rawCnt - (decrement << 1))) {
                return false;
            }
        } else {
            // Otherwise the caller is releasing more than it holds
            throw new IllegalReferenceCountException(realCnt, -decrement);
        }
        // Under high contention this helps throughput.
        // Yielding moves the current thread from running back to runnable, so that it does not
        // hog the CPU and competes with the other threads again.
        Thread.yield();
    }
}

public final boolean release(T instance, int decrement) {
    int rawCnt = nonVolatileRawCnt(instance);
    int realCnt = toLiveRealRefCnt(rawCnt, checkPositive(decrement, "decrement"));
    return decrement == realCnt ? tryFinalRelease0(instance, rawCnt) || retryRelease0(instance, decrement)
            : nonFinalRelease0(instance, decrement, rawCnt, realCnt);
}


// io.netty.util.internal.ReferenceCountUpdater#realRefCnt
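// Converts the raw value into the real (logical) count
private static int realRefCnt(int rawCnt) {
    // 2 and 4 are fast paths. (rawCnt & 1) != 0 means the raw value is odd, i.e. already released;
    // only an even raw value represents live references.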
    return rawCnt != 2 && rawCnt != 4 && (rawCnt & 1) != 0 ? 0 : rawCnt >>> 1;
}

// io.netty.util.internal.ReferenceCountUpdater#retain0
/**
 * Retains a reference to the ByteBuf.
 * Logical count + increment, raw refCnt + rawIncrement.
 * @param instance       the object to retain
 * @param increment      the logical increment
 * @param rawIncrement   the raw increment (increment << 1)
 * @return the instance
 */
private T retain0(T instance, final int increment, final int rawIncrement) {
    
    // #1 Add the raw increment and fetch the old raw value
    int oldRef = updater().getAndAdd(instance, rawIncrement);
    
    // #2 Validate the old value.
    // An odd old value means the ByteBuf has already been released;
    // a released ByteBuf cannot be retained.
    if (oldRef != 2 && oldRef != 4 && (oldRef & 1) != 0) {
        throw new IllegalReferenceCountException(0, increment);
    }

    // #3 Overflow handling, for cases such as
    // oldRef =-1173741824, increment=1003741824 rawIncrement=2007483648
    // oldRef =2,           increment=1103741824 rawIncrement=-2087483648
    if ((oldRef <= 0 && oldRef + rawIncrement >= 0)
            || (oldRef >= 0 && oldRef + rawIncrement < oldRef)) {
        // Roll the addition back
        updater().getAndAdd(instance, -rawIncrement);
        // and report the error
        throw new IllegalReferenceCountException(realRefCnt(oldRef), increment);
    }
    
    // #4 Done
    return instance;
}

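ReferenceCountUpdater is an abstract class that holds the logic for manipulating refCnt. The actual reads and writes of the volatile int refCnt field are carried out by an AtomicIntegerFieldUpdater. ReferenceCountUpdater does not hold that updater itself, so it declares abstract methods through which the owning class supplies it:

// io.netty.util.internal.ReferenceCountUpdater
// Returns the AtomicIntegerFieldUpdater, the JDK class for atomically updating a field under concurrency
protected abstract AtomicIntegerFieldUpdater<T> updater();

// Offset of the refCnt field
protected abstract long unsafeOffset();

io.netty.util.AbstractReferenceCounted implements them as follows (io.netty.buffer.AbstractReferenceCountedByteBuf does the same, as shown earlier):

// io.netty.util.AbstractReferenceCounted
public abstract class AbstractReferenceCounted implements ReferenceCounted {
    // Offset of the refCnt field, used for access through Unsafe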
    private static final long REFCNT_FIELD_OFFSET =
            ReferenceCountUpdater.getUnsafeOffset(AbstractReferenceCounted.class, "refCnt");
    private static final AtomicIntegerFieldUpdater<AbstractReferenceCounted> AIF_UPDATER =
            AtomicIntegerFieldUpdater.newUpdater(AbstractReferenceCounted.class, "refCnt");

    private static final ReferenceCountUpdater<AbstractReferenceCounted> updater =
            new ReferenceCountUpdater<AbstractReferenceCounted>() {
        // Returns the AtomicIntegerFieldUpdater through which the field is manipulated
        @Override
        protected AtomicIntegerFieldUpdater<AbstractReferenceCounted> updater() {
            return AIF_UPDATER;
        }
        
        // Returns the offset of the refCnt field
        @Override
        protected long unsafeOffset() {
            return REFCNT_FIELD_OFFSET;
        }
    };
    // ...
}

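Having read the source, you should now have a clear picture of how the abstract class AbstractReferenceCountedByteBuf manipulates the volatile int refCnt field: in the end it all goes through AtomicIntegerFieldUpdater.

AtomicIntegerFieldUpdater

Written by concurrency expert Doug Lea, this is a reflection-based utility that enables atomic updates to a designated volatile int field of a designated class. It is designed for atomic data structures in which several fields of the same node are independently subject to atomic updates.
Note that the guarantees of compareAndSet in this class are weaker than in the other atomic classes. Because this class cannot ensure that all uses of the field are appropriate for atomic access, it guarantees atomicity only with respect to other invocations of compareAndSet and set on the same updater.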
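A minimal use of AtomicIntegerFieldUpdater looks like this. The Node class and its state field are hypothetical; the point is only the pattern of one static updater shared by all instances plus one volatile int per instance.

```java
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

final class Node {
    // One updater per class, created reflectively from the field name.
    private static final AtomicIntegerFieldUpdater<Node> STATE =
            AtomicIntegerFieldUpdater.newUpdater(Node.class, "state");

    // The field must be a volatile int that is accessible to the code creating the updater.
    private volatile int state;

    /** Atomically moves state from 0 to 1; only one caller can win. */
    boolean tryClaim() {
        return STATE.compareAndSet(this, 0, 1);
    }

    int state() {
        return STATE.get(this);
    }
}
```

Compared with giving every Node its own AtomicInteger, this keeps the per-instance cost at a single int field, which is the motivation the text gives for refCnt.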

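About Unsafe

Unsafe's field-access methods abstract over the object layout. sun.misc.Unsafe#objectFieldOffset returns the offset of a field relative to the "start address" of a Java object, and methods such as getInt, getLong and getObject read a field's value given its offset.
More details: TODO.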

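Subclasses of AbstractReferenceCountedByteBuf

[Figure: subclasses of AbstractReferenceCountedByteBuf]

The subclasses of AbstractReferenceCountedByteBuf cover essentially all the ByteBuf types most commonly seen in Netty.

Unpooled ByteBuf in detail

First, the inheritance hierarchy of the unpooled ByteBuf types:
[Figure: inheritance hierarchy of the unpooled ByteBuf classes]

Unpooled ByteBufs fall into two categories:

UnpooledHeapByteBuf
UnpooledDirectByteBuf

We compare the two, looking at where they agree and where they differ, through the following APIs:

getByte(int);
getBytes(int, ByteBuf, int, int);
setByte(int, int);
setBytes(int, ByteBuf, int, int);
deallocate();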

UnpooledHeapByteBuf

// io.netty.buffer.UnpooledHeapByteBuf
/**
 * Unpooled heap memory; data is stored in a byte[] array.
 */
public class UnpooledHeapByteBuf extends AbstractReferenceCountedByteBuf {

    // Every ByteBuf holds a reference to the ByteBufAllocator that created it.
    // alloc() returns the allocator of the current ByteBuf,
    // so an allocator can be obtained from any ByteBuf, e.g. to allocate memory or make checks.
    private final ByteBufAllocator alloc;
    
    // Where the data lives
    byte[] array;
    // A Java NIO ByteBuffer, used when converting Netty's ByteBuf into a ByteBuffer
    private ByteBuffer tmpNioBuf;
    
    /**
     * Returns the value at index.
     * This method overrides the one in AbstractByteBuf. What is the difference?
     * AbstractByteBuf uses checkIndex(index), which covers both the refCnt check and the index check,
     * whereas this class only performs the ensureAccessible() check.
     */
    @Override
    public byte getByte(int index) {
        
        // #1 Check that the buffer is still accessible (not released)
        ensureAccessible();
        
        // #2 Read the value at index
        return _getByte(index);
    }

    @Override
    protected byte _getByte(int index) {
        // Done through the HeapByteBufUtil helper (the storage is an array, so this is just array[index])
        return HeapByteBufUtil.getByte(array, index);
    }

    /**
     * Transfers length bytes to the given destination, starting at the given absolute index.
     * This method does not modify the readerIndex or writerIndex of the source ByteBuf.
     * Subclasses are responsible for implementing this API.
     * @param index     absolute index in the source ByteBuf
     * @param dst       the destination ByteBuf
     * @param dstIndex  absolute index in the destination ByteBuf
     * @param length    number of bytes to transfer
     * @return this buffer
     */
    @Override
    public ByteBuf getBytes(int index, ByteBuf dst, int dstIndex, int length) {
        
        // #1 Validate index, length, capacity and refCnt
        checkDstIndex(index, length, dstIndex, dst.capacity());
        
        // #2 Pick a copy strategy based on the type of the destination ByteBuf
        if (dst.hasMemoryAddress()) {
            // #2-1 The destination exposes a memoryAddress, so it is Unsafe-backed:
            // copy through Unsafe
            PlatformDependent.copyMemory(array, index, dst.memoryAddress() + dstIndex, length);
        } else if (dst.hasArray()) {
            // #2-2 The destination has a backing array, so it is byte[]-backed:
            // copy through the native method System.arraycopy()
            getBytes(index, dst.array(), dst.arrayOffset() + dstIndex, length);
        } else {
            // #2-3 Neither applies: call ByteBuf#setBytes on the destination,
            // which picks a suitable copy method for its own implementation
            dst.setBytes(dstIndex, array, index, length);
        }
        return this;
    }
    
    /**
     * Sets a single byte.
     */
    @Override
    public ByteBuf setByte(int index, int value) {
        ensureAccessible();
        _setByte(index, value);
        return this;
    }
    
    @Override
    protected void _setByte(int index, int value) {
        // The storage is an array, so this is simply array[index] = value
        HeapByteBufUtil.setByte(array, index, value);
    }
    
    /**
     * Largely the same as getBytes(int index, ByteBuf dst, int dstIndex, int length),
     * only with the copy direction reversed.
     */
    @Override
    public ByteBuf setBytes(int index, ByteBuf src, int srcIndex, int length) {
        
        // #1 Validate index, length, capacity and refCnt
        checkSrcIndex(index, length, srcIndex, src.capacity());

        // #2 Pick a copy strategy based on the type of the source ByteBuf
        if (src.hasMemoryAddress()) {
            // #2-1 Copy through Unsafe
            PlatformDependent.copyMemory(src.memoryAddress() + srcIndex, array, index, length);
        } else  if (src.hasArray()) {
            // #2-2 Copy through System.arraycopy()
            setBytes(index, src.array(), src.arrayOffset() + srcIndex, length);
        } else {
            // #2-3 Copy through ByteBuf#getBytes on the source
            src.getBytes(srcIndex, array, index, length);
        }
        return this;
    }
    
    /**
     * Releases the ByteBuf.
     */
    @Override
    protected void deallocate() {
        freeArray(array);
        array = EmptyArrays.EMPTY_BYTES;
    }
    
    // Nothing to do here: the array is left to the garbage collector
    protected void freeArray(byte[] array) {
        // NOOP
    }
}

// io.netty.util.internal.MathUtil#isOutOfBounds
public static boolean isOutOfBounds(int index, int length, int capacity) {
    return (index | length | (index + length) | (capacity - (index + length))) < 0;
}

// io.netty.util.internal.PlatformDependent#copyMemory(byte[], int, long, long)
public static void copyMemory(byte[] src, int srcIndex, long dstAddr, long length) {
    PlatformDependent0.copyMemory(src, BYTE_ARRAY_BASE_OFFSET + srcIndex, null, dstAddr, length);
}

// io.netty.util.internal.PlatformDependent0#copyMemory(java.lang.Object, long, java.lang.Object, long, long)
static void copyMemory(Object src, long srcOffset, Object dst, long dstOffset, long length) {
    // Manual safe-point polling is only needed prior Java9:
    // See https://bugs.openjdk.java.net/browse/JDK-8149596
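    // Both branches end up in UNSAFE.copyMemory(). Before Java 9, a single large copyMemory call
    // does not poll for safepoints, so it can keep other threads waiting for a long time.
    // Netty therefore copies in chunks of at most UNSAFE_COPY_THRESHOLD bytes in a while loop.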
    if (javaVersion() <= 8) {
        copyMemoryWithSafePointPolling(src, srcOffset, dst, dstOffset, length);
    } else {
        UNSAFE.copyMemory(src, srcOffset, dst, dstOffset, length);
    }
}

// io.netty.util.internal.PlatformDependent0#copyMemoryWithSafePointPolling
private static void copyMemoryWithSafePointPolling(
    Object src, long srcOffset, Object dst, long dstOffset, long length) {
    while (length > 0) {
        long size = Math.min(length, UNSAFE_COPY_THRESHOLD);
        UNSAFE.copyMemory(src, srcOffset, dst, dstOffset, size);
        length -= size;
        srcOffset += size;
        dstOffset += size;
    }
}
 
// io.netty.buffer.HeapByteBufUtil
static void setByte(byte[] memory, int index, int value) {
    memory[index] = (byte) value;
}

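UnpooledHeapByteBuf is backed by an array and chooses a different strategy for reads and writes depending on the type of the other ByteBuf involved. Because the data is stored in a plain array, the implementation is generally straightforward.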
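The isOutOfBounds helper quoted in the listing above packs four range checks into one comparison. The sketch below puts it next to a spelled-out version so the two can be compared; BoundsCheck and isOutOfBoundsNaive are names made up for this illustration, and a non-negative capacity is assumed.

```java
final class BoundsCheck {
    /** The bit trick: the OR of the four values is negative if any one of them is negative. */
    static boolean isOutOfBounds(int index, int length, int capacity) {
        return (index | length | (index + length) | (capacity - (index + length))) < 0;
    }

    /** The same checks written out one by one. */
    static boolean isOutOfBoundsNaive(int index, int length, int capacity) {
        return index < 0                   // negative start
            || length < 0                  // negative length
            || index + length < 0          // int overflow of the end position
            || index + length > capacity;  // end position past the capacity
    }

    public static void main(String[] args) {
        System.out.println(isOutOfBounds(0, 4, 4));
        System.out.println(isOutOfBounds(1, 4, 4));
    }
}
```

The sign bit of an int survives a bitwise OR, so a single `< 0` test replaces four branches.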

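Subclasses

[Figure: subclasses of UnpooledHeapByteBuf]

UnpooledHeapByteBuf has two subclasses:

UnpooledUnsafeHeapByteBuf: manages the byte[] array through UnsafeByteBufUtil. It has one subclass of its own:
- InstrumentedUnpooledUnsafeHeapByteBuf: a private class of UnpooledByteBufAllocator.
InstrumentedUnpooledHeapByteBuf: "Instrumented" means that some measuring equipment has been added. In practice it just updates the allocated-memory total after memory is allocated or released, for memory monitoring. There are five classes with the Instrumented prefix, all of them private classes of the UnpooledByteBufAllocator allocator. ByteBuf objects are normally created through an allocator, and one benefit of that single entry point is that it allows bookkeeping such as the amount of allocated memory: the requested size is added on allocation and the returned size is subtracted on release. The five classes are:
- InstrumentedUnpooledUnsafeHeapByteBuf
- InstrumentedUnpooledHeapByteBuf
- InstrumentedUnpooledUnsafeNoCleanerDirectByteBuf
- InstrumentedUnpooledUnsafeDirectByteBuf
- InstrumentedUnpooledDirectByteBuf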
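The bookkeeping idea behind the Instrumented classes can be sketched with a toy allocator that adds on allocation and subtracts on release. Everything here (CountingAllocator, TrackedArray, usedHeapMemory) is hypothetical and only illustrates the accounting pattern described above, not Netty's actual classes.

```java
import java.util.concurrent.atomic.LongAdder;

final class CountingAllocator {
    // Running total of bytes currently handed out by this allocator.
    private final LongAdder usedHeapMemory = new LongAdder();

    /** A byte[] whose lifetime is reported back to the allocator. */
    final class TrackedArray {
        private byte[] array;

        TrackedArray(int size) {
            array = new byte[size];
            usedHeapMemory.add(size);            // allocation: add the requested size
        }

        void free() {
            if (array != null) {
                usedHeapMemory.add(-array.length); // release: subtract what was handed out
                array = null;                      // guard against double counting
            }
        }
    }

    TrackedArray allocate(int size) {
        return new TrackedArray(size);
    }

    long usedHeapMemory() {
        return usedHeapMemory.sum();
    }
}
```

Because every buffer is created through the allocator, the total stays accurate without the calling code having to know that any counting is going on.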

UnpooledDirectByteBuf

// io.netty.buffer.UnpooledDirectByteBuf
/**
 * Unpooled direct (off-heap) memory, implemented on top of java.nio.ByteBuffer.
 * Prefer UnpooledByteBufAllocator.directBuffer(int, int),
 *        Unpooled.directBuffer(int) or
 *        Unpooled.wrappedBuffer(ByteBuffer) to allocate a ByteBuf explicitly,
 * rather than calling the constructor directly.
 */
public class UnpooledDirectByteBuf extends AbstractReferenceCountedByteBuf {

    private final ByteBufAllocator alloc;

    ByteBuffer buffer; // accessed by UnpooledUnsafeNoCleanerDirectByteBuf.reallocateDirect()
    private ByteBuffer tmpNioBuf;
    private int capacity;
    private boolean doNotFree;
    
    @Override
    public byte getByte(int index) {
        ensureAccessible();
        return _getByte(index);
    }

    @Override
    protected byte _getByte(int index) {
        return buffer.get(index);
    }
    
    @Override
    public short getShort(int index) {
        ensureAccessible();
        return _getShort(index);
    }

    @Override
    protected short _getShort(int index) {
        return buffer.getShort(index);
    }
    
    /**
     * Copies data from this buffer into the destination dst.
     */
    @Override
    public ByteBuf getBytes(int index, ByteBuf dst, int dstIndex, int length) {
        
        // #1 As usual, validate strictly first
        checkDstIndex(index, length, dstIndex, dst.capacity());
        
        // #2 Pick a copy strategy based on the destination's implementation
        if (dst.hasArray()) {
            // #2-1 The destination has a backing array: this ends up in
            // ByteBuffer#get(byte[], int, int), which copies into that array
            getBytes(index, dst.array(), dst.arrayOffset() + dstIndex, length);
        } else if (dst.nioBufferCount() > 0) {
            // #2-2 The destination is made up of ByteBuffers: copy into them one by one
            for (ByteBuffer bb: dst.nioBuffers(dstIndex, length)) {
                int bbLen = bb.remaining();
                getBytes(index, bb);
                index += bbLen;
            }
        } else {
            // #2-3 Otherwise turn it around and let dst#setBytes do the copy
            dst.setBytes(dstIndex, this, index, length);
        }
        return this;
    }
    
    @Override
    public ByteBuf setByte(int index, int value) {
        ensureAccessible();
        _setByte(index, value);
        return this;
    }

    @Override
    protected void _setByte(int index, int value) {
        // Delegate to the ByteBuffer
        buffer.put(index, (byte) value);
    }
    
    @Override
    public ByteBuf setShort(int index, int value) {
        ensureAccessible();
        _setShort(index, value);
        return this;
    }

    @Override
    protected void _setShort(int index, int value) {
        buffer.putShort(index, (short) value);
    }
    
    /**
     * Tries to clean up the underlying ByteBuffer through its Cleaner
     */
    @Override
    protected void deallocate() {
        ByteBuffer buffer = this.buffer;
        if (buffer == null) {
            return;
        }

        this.buffer = null;

        if (!doNotFree) {
            freeDirect(buffer);
        }
    }
    
    protected void freeDirect(ByteBuffer buffer) {
        PlatformDependent.freeDirectBuffer(buffer);
    }   
}

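In fact, comparing UnpooledHeapByteBuf with UnpooledDirectByteBuf, one wraps a byte[] in a set of APIs and the other wraps a java.nio.ByteBuffer in a set of APIs.
A direct java.nio.ByteBuffer is fast for low-level I/O because it saves one memory copy, but for in-memory computation (searching and other byte-level processing) it tends to perform worse than a ByteBuf backed by a byte[]. So the ByteBuf type should be chosen to fit the actual workload when chasing maximum performance.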
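The difference between the two backing stores can be seen with the JDK alone. The sketch below is not Netty code: DirectCopySketch is a made-up name, and copying through a duplicate so that the source position stays untouched is my reading of how an absolute-index read on top of a ByteBuffer can be done.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper, not Netty's code: absolute-index bulk read from a ByteBuffer.
final class DirectCopySketch {
    /**
     * Copies length bytes starting at absolute index from src into dst
     * without changing src's own position or limit.
     */
    static void getBytes(ByteBuffer src, int index, byte[] dst, int dstIndex, int length) {
        ByteBuffer tmp = src.duplicate();              // shares content, has its own position/limit
        tmp.clear().position(index).limit(index + length);
        tmp.get(dst, dstIndex, length);                // relative bulk get on the duplicate
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(8);
        for (int i = 0; i < 8; i++) {
            direct.put(i, (byte) (i * 10));            // absolute put, position stays 0
        }

        byte[] out = new byte[4];
        getBytes(direct, 2, out, 0, 4);
        System.out.println(Arrays.toString(out));      // [20, 30, 40, 50]
        System.out.println(direct.position());         // 0

        // A direct buffer exposes no backing array, a heap buffer does.
        System.out.println(direct.hasArray());                 // false
        System.out.println(ByteBuffer.allocate(8).hasArray()); // true
    }
}
```

The missing backing array is why byte-level processing on a direct buffer has to go through accessor calls or an extra copy, while a heap buffer can hand out its byte[] directly.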

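Subclass implementations

[Figure: class hierarchy of the UnpooledDirectByteBuf subclasses]

As the figure above shows, the unpooled direct-memory ByteBuf is further divided into a number of subclasses, for example: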

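UnpooledUnsafeDirectByteBuf: an unpooled, off-heap ByteBuf that uses the "magic" Unsafe class. It has a key field, memoryAddress, which is a telltale sign of Unsafe. Accordingly, allocating and releasing this object are also closely tied to Unsafe.
- UnpooledUnsafeNoCleanerDirectByteBuf: has no Cleaner. When ByteBuf#release() frees the memory, the off-heap memory is released through Unsafe#freeMemory(long address).
  - InstrumentedUnpooledUnsafeNoCleanerDirectByteBuf: as described earlier, mainly used by UnpooledByteBufAllocator when it allocates ByteBuf objects.
- ThreadLocalUnsafeDirectByteBuf: an inner class of ByteBufUtil. It uses a lightweight object pool to make ByteBuf allocation cheaper.
InstrumentedUnpooledDirectByteBuf: an inner class of UnpooledByteBufAllocator. It adds used-memory accounting to the allocation logic.
ThreadLocalDirectByteBuf: an inner class of ByteBufUtil. It uses a lightweight object pool to make ByteBuf allocation cheaper.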

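UnpooledDirectByteBuf wraps a java.nio.ByteBuffer, but for a framework that chases performance as hard as Netty does, that alone is not enough. So two direct buffers built on different ideas were derived from it: UnpooledUnsafeDirectByteBuf and ThreadLocalDirectByteBuf. The former has a special field, long memoryAddress, intended for Unsafe, and reads and writes data through Unsafe. The latter uses a lightweight object pool to make ByteBuf allocation cheaper, while its read and write operations are inherited from the parent class.
After going through this many unpooled buffers, you should have a reasonable picture of unpooled ByteBufs. The class name already tells you what a given ByteBuf offers. The main points to hold on to are: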

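Which data type or object stores the data: heap memory uses a byte[], off-heap memory uses a java.nio.ByteBuffer object.
Off-heap memory can be reclaimed in two ways, through Cleaner or through Unsafe (Cleaner also ends up freeing the memory through Unsafe, but it is a separate reclamation mechanism provided by the JDK).
A lightweight object pool makes allocation cheaper. This is covered in detail later. With it, allocation and reclamation are handled by the pool.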

PooledByteBuf

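The unpooled ByteBufs analysed above are not complicated. The pooled ByteBufs covered next are considerably more involved. Let us start with PooledByteBuf. PooledByteBuf is an abstract class, the skeleton on which subclasses build their pooling ability, and it defines the pooling-related fields. Netty versions up to and including 4.1.44 follow the ideas of the jemalloc 3.x algorithm, while later versions were refactored along the lines of jemalloc 4.x. This article uses the Netty 4.1.44 source and does not go into the details of the memory allocation code. If you are interested, see the follow-up article (TODO).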

// io.netty.buffer.PooledByteBuf
/**

 */
abstract class PooledByteBuf<T> extends AbstractReferenceCountedByteBuf {
    
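    // The allocator this ByteBuf belongs to
    private ByteBufAllocator allocator;

    // Handle used to recycle this object into the object pool
    private final Handle<PooledByteBuf<T>> recyclerHandle;

    // T is generic: it is either byte[] or ByteBuffer
    protected T memory;
    
    // The PoolChunk this ByteBuf instance belongs to
    protected PoolChunk<T> chunk;
    
    // Memory handle: a 64-bit value whose upper and lower 32 bits carry different meanings.
    // Its main job is to locate the memory region held by this ByteBuf
    protected long handle;
    
    // Offset in bytes, starting from 0; used when the data container is a byte[].
    // First, note that for non-huge allocations Netty requests memory from the JVM
    // in chunks of 16 MiB (16777216 bytes) at a time.
    // Such a chunk can be addressed in two ways: with an offset for a byte[], or with
    // memoryAddress + index for direct memory.
    // This offset is the variable used for the byte[] case. For example, when a PooledHeapByteBuf
    // is created, Netty picks a suitably sized segment of a 16777216-byte array and assigns it to
    // this ByteBuf. The offset is relative to array index 0.
    // Later writes only need offset + index to locate the region assigned to this ByteBuf.
    // In short, the underlying byte[] (16 MiB) is shared, and the offset marks where each buffer starts.
    protected int offset;
    // Requested memory size
    protected int length;
    // Maximum length
    int maxLength;
    
    // Thread-local cache
    PoolThreadCache cache;
    
    // Temporary ByteBuffer
    ByteBuffer tmpNioBuf;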
    
    /**
     * This is a very important piece of memory release code
     */
    @Override
    protected final void deallocate() {
        // #1 Check that the handle is valid (>= 0)
        if (handle >= 0) {
            final long handle = this.handle;
            this.handle = -1;
            memory = null;
            // Release the memory through the Arena (memory pool)
            chunk.arena.free(chunk, tmpNioBuf, handle, maxLength, cache);
            tmpNioBuf = null;
            chunk = null;
            // Recycle the ByteBuf object itself (object pool)
            recycle();
        }
    }

    private void recycle() {
        recyclerHandle.recycle(this);
    }
    
    // ...

}
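The offset and handle comments above can be illustrated with a small JDK-only model. It is a sketch under assumptions, not Netty's implementation: PooledSliceSketch is a made-up class, the chunk is tiny instead of 16 MiB, and pack/high32/low32 only demonstrate how two 32-bit values fit into one long, not the exact bit layout PoolChunk uses.

```java
// Hypothetical model, not Netty's code: several buffers sharing one backing array.
final class PooledSliceSketch {
    private final byte[] chunk;  // shared backing array standing in for a pooled chunk
    private final int offset;    // where this buffer starts inside the chunk
    private final int length;    // how many bytes belong to this buffer

    PooledSliceSketch(byte[] chunk, int offset, int length) {
        if (offset < 0 || length < 0 || offset + length > chunk.length) {
            throw new IndexOutOfBoundsException("slice outside chunk");
        }
        this.chunk = chunk;
        this.offset = offset;
        this.length = length;
    }

    // Translate a buffer-relative index into an index of the shared array.
    private int idx(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return offset + index;
    }

    void setByte(int index, int value) {
        chunk[idx(index)] = (byte) value;
    }

    byte getByte(int index) {
        return chunk[idx(index)];
    }

    // Pack two 32-bit values into one 64-bit handle and read them back.
    static long pack(int high32, int low32) {
        return ((long) high32 << 32) | (low32 & 0xFFFFFFFFL);
    }

    static int high32(long handle) {
        return (int) (handle >>> 32);
    }

    static int low32(long handle) {
        return (int) handle;
    }

    public static void main(String[] args) {
        byte[] chunk = new byte[64];
        PooledSliceSketch a = new PooledSliceSketch(chunk, 0, 16);
        PooledSliceSketch b = new PooledSliceSketch(chunk, 16, 16);
        a.setByte(0, 1);
        b.setByte(0, 2);
        System.out.println(chunk[0] + " " + chunk[16]);   // 1 2

        long handle = pack(3, 2049);
        System.out.println(high32(handle) + " " + low32(handle)); // 3 2049
    }
}
```

Both buffers write at their own index 0, yet the bytes land at positions 0 and 16 of the shared array, which is all the offset field has to achieve.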

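A short summary:

Netty recommends creating ByteBuf objects through an allocator rather than a constructor. In general, each ByteBuf holds a reference to the ByteBufAllocator it belongs to.
Each pooled ByteBuf holds a reference to a PoolChunk, which in turn holds a reference to its Arena. The Arena manages memory allocation. More on this later.
A ByteBuf is the class-level representation of a region of physical memory. The handle field identifies exactly where that region is.
Each ByteBuf holds a reference to its recycler handle, recyclerHandle. Once the memory assigned to the ByteBuf has been reclaimed, recyclerHandle recycles the ByteBuf object itself so that it can be reused.
The underlying storage comes in two forms, byte[] and java.nio.ByteBuffer. For convenience a generic type parameter (T) covers both, instead of a separate class per storage type.
The thread-local cache PoolThreadCache further speeds up memory allocation.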
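The object-recycling idea from the list above can be sketched as a small per-thread pool. This is a deliberately simplified model and not Netty's Recycler: RecyclerSketch, its maxPerThread parameter and the ArrayDeque-per-thread design are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Hypothetical model, not Netty's Recycler: a per-thread stack of reusable objects.
final class RecyclerSketch<T> {
    private final Supplier<T> factory;
    private final int maxPerThread;
    private final ThreadLocal<ArrayDeque<T>> stack = ThreadLocal.withInitial(ArrayDeque::new);

    RecyclerSketch(Supplier<T> factory, int maxPerThread) {
        this.factory = factory;
        this.maxPerThread = maxPerThread;
    }

    /** Returns a pooled instance if this thread has one, otherwise creates a new object. */
    T get() {
        T pooled = stack.get().pollFirst();
        return pooled != null ? pooled : factory.get();
    }

    /** Hands an object back to the current thread's pool; drops it when the pool is full. */
    void recycle(T obj) {
        ArrayDeque<T> s = stack.get();
        if (s.size() < maxPerThread) {
            s.addFirst(obj);
        }
    }

    public static void main(String[] args) {
        RecyclerSketch<StringBuilder> pool = new RecyclerSketch<>(StringBuilder::new, 4);
        StringBuilder first = pool.get();
        pool.recycle(first);
        StringBuilder second = pool.get();
        System.out.println(first == second); // true: the same instance was reused
    }
}
```

Because each thread only touches its own stack, no locking is needed on the fast path. The caller is responsible for resetting the state of an object before reusing it.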

PooledByteBuf class hierarchy

[Figure: PooledByteBuf class hierarchy]

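PooledHeapByteBuf: a pooled heap-memory ByteBuf.
- PooledUnsafeHeapByteBuf: uses Unsafe for reads and writes.
PooledDirectByteBuf: a pooled off-heap ByteBuf.
PooledUnsafeDirectByteBuf: a pooled off-heap ByteBuf that uses Unsafe.

The differences between these ByteBufs are exactly what their names suggest. We have not looked at each PooledByteBuf subclass in detail, but they simply operate on the data through Unsafe or java.nio.ByteBuffer.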

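ByteBuf summary

We only followed one main line: ReferenceCounted -> ByteBuf -> AbstractByteBuf -> AbstractReferenceCountedByteBuf -> PooledByteBuf/UnpooledDirectByteBuf/UnpooledHeapByteBuf. If you are interested in the other ByteBuf types, have a look yourself; the principles are not complicated.
ByteBuf is a class-level abstraction over a physical storage region. It is classified along two dimensions: Unpooled versus Pooled, and Heap versus Direct. The two dimensions combine into:
- UnpooledHeapByteBuf
- UnpooledDirectByteBuf
- PooledHeapByteBuf
- PooledDirectByteBuf
- There are also some other fairly important ByteBufs, such as CompositeByteBuf, which makes programming more convenient.
Netty recommends creating ByteBuf objects through an allocator, which is also why the ByteBufs with the Instrumented prefix exist. Their main purpose is to track the lifecycle of allocated ByteBufs, so that users get more detail about memory allocation and can manage memory better.
Reference counting is one of the key means of reducing memory leaks.
Choose a ByteBuf that fits the situation, and know the strengths and weaknesses of each kind of ByteBuf instance:
- Off-heap memory
  - Advantages
    - Saves one memory copy
    - Lowers GC pressure
    - Allows data to be shared between processes and between JVM instances
    - Suits large allocations
  - Disadvantages
    - Must be released manually; a small mistake leaks off-heap memory, and such problems are hard to track down
- Heap memory
  - Advantages
    - Allocates and frees memory quickly even without pooling
    - Memory release is managed by the JVM, so the user does not have to worry about it
    - Suits small allocations
  - Disadvantages
    - For network I/O and file reads and writes, heap memory has to be copied into off-heap memory before it can be exchanged with the underlying device
- Pooled
  - Advantages
    - Faster allocation and better memory utilisation
  - Disadvantages
    - Before reference counting had matured, Netty allocated unpooled ByteBufs by default; as the monitoring tooling matured, Netty 4.1 switched to pooled ByteBufs by default
    - Managing the memory has a cost of its own
    - May cause memory leaks
- Unpooled
  - Advantages
    - Suits small allocations that are quickly allocated and quickly released
  - Disadvantages
    - No caching, so nothing is reused between allocations

The summary above is only the tip of the ByteBuf iceberg.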

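ByteBuf allocation

On-demand allocation: ByteBufAllocator

ByteBufAllocator class hierarchy
[Figure: ByteBufAllocator class hierarchy]

In fact there are only two kinds of ByteBufAllocator, pooled and unpooled. The allocators whose names start with Prefer exist only for convenience: they wrap part of the API of an AbstractByteBufAllocator subclass. So the focus here is on the two subclasses PooledByteBufAllocator and UnpooledByteBufAllocator.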

ByteBufAllocator API

[Figure: ByteBufAllocator API]

AbstractByteBufAllocator

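The abstract class AbstractByteBufAllocator implements most of the ByteBufAllocator interface and serves as its skeleton. An allocator exists to manage the allocation of ByteBuf objects better: it can check whether a requested capacity exceeds the limit, and it can track allocated ByteBufs to detect memory leaks. For that reason AbstractByteBufAllocator contains two methods that wrap ByteBuf and CompositeByteBuf objects respectively for leak detection. The abstract class does not define many fields, but one important boolean field is directByDefault, which controls whether the buffer() API returns heap memory or off-heap memory. The relevant source is: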

public abstract class AbstractByteBufAllocator implements ByteBufAllocator {
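    // Default initial capacity
    static final int DEFAULT_INITIAL_CAPACITY = 256;
    // Default maximum capacity
    static final int DEFAULT_MAX_CAPACITY = Integer.MAX_VALUE;
    // Default maximum number of components in a composite ByteBuf
    static final int DEFAULT_MAX_COMPONENTS = 16;
    // When a buffer has to grow, a new capacity is calculated. Above this threshold the
    // capacity grows in steps of CALCULATE_THRESHOLD instead of simply doubling
    static final int CALCULATE_THRESHOLD = 1048576 * 4; // 4 MiB page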

    static {
        ResourceLeakDetector.addExclusions(AbstractByteBufAllocator.class, "toLeakAwareBuffer");
    }
	
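    /**
     * Tracks a ByteBuf object to determine whether a memory leak occurs.
     * At the SIMPLE level the ByteBuf is wrapped in a SimpleLeakAwareByteBuf.
     * At the ADVANCED and PARANOID levels it is wrapped in an AdvancedLeakAwareByteBuf.
     * Through the wrapper, calls to the ByteBuf API are recorded depending on the operation.
     * For example, release() calls leak.record(), which can be understood as recording
     * how the ByteBuf is currently being used.
     * By going back through these records, leaking ByteBuf objects can be identified.
     */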
    protected static ByteBuf toLeakAwareBuffer(ByteBuf buf) {
        ResourceLeakTracker<ByteBuf> leak;
        switch (ResourceLeakDetector.getLevel()) {
            case SIMPLE:
                leak = AbstractByteBuf.leakDetector.track(buf);
                if (leak != null) {
                    buf = new SimpleLeakAwareByteBuf(buf, leak);
                }
                break;
            case ADVANCED:
            case PARANOID:
                leak = AbstractByteBuf.leakDetector.track(buf);
                if (leak != null) {
                    buf = new AdvancedLeakAwareByteBuf(buf, leak);
                }
                break;
            default:
                break;
        }
        return buf;
    }
    private final boolean directByDefault;
    private final ByteBuf emptyBuf;
    
    protected AbstractByteBufAllocator(boolean preferDirect) {
        // Determined by the preferDirect argument and whether the platform supports Unsafe
        directByDefault = preferDirect && PlatformDependent.hasUnsafe();
        emptyBuf = new EmptyByteBuf(this);
    }
    
    @Override
    public ByteBuf buffer() {
        if (directByDefault) {
            return directBuffer();
        }
        return heapBuffer();
    }
    
    // Abstract methods that subclasses must implement; they return heap and off-heap ByteBufs respectively
    protected abstract ByteBuf newHeapBuffer(int initialCapacity, int maxCapacity);
    protected abstract ByteBuf newDirectBuffer(int initialCapacity, int maxCapacity);
 
    // ...
}
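The growth rule hinted at by the CALCULATE_THRESHOLD comment can be written out as a standalone function. The sketch below follows my understanding of how the 4.1.x calculateNewCapacity behaves (double from 64 up to the threshold, then grow in 4 MiB steps). CapacitySketch is a made-up name, and details such as the exception type are assumptions.

```java
// Hypothetical standalone version of the growth rule; not copied from Netty.
final class CapacitySketch {
    static final int THRESHOLD = 1048576 * 4; // 4 MiB, same value as CALCULATE_THRESHOLD

    static int calculateNewCapacity(int minNewCapacity, int maxCapacity) {
        if (minNewCapacity < 0 || minNewCapacity > maxCapacity) {
            throw new IllegalArgumentException(
                    "minNewCapacity: " + minNewCapacity + " (expected: 0.." + maxCapacity + ")");
        }
        if (minNewCapacity == THRESHOLD) {
            return THRESHOLD;
        }
        if (minNewCapacity > THRESHOLD) {
            // Above the threshold: round down to a multiple of 4 MiB, then add one step.
            int newCapacity = minNewCapacity / THRESHOLD * THRESHOLD;
            if (newCapacity > maxCapacity - THRESHOLD) {
                newCapacity = maxCapacity;      // one more step would exceed the limit
            } else {
                newCapacity += THRESHOLD;
            }
            return newCapacity;
        }
        // At or below the threshold: start at 64 and double until large enough.
        int newCapacity = 64;
        while (newCapacity < minNewCapacity) {
            newCapacity <<= 1;
        }
        return Math.min(newCapacity, maxCapacity);
    }

    public static void main(String[] args) {
        System.out.println(calculateNewCapacity(100, Integer.MAX_VALUE));           // 128
        System.out.println(calculateNewCapacity(THRESHOLD + 1, Integer.MAX_VALUE)); // 8388608
        System.out.println(calculateNewCapacity(5_000_000, 6_000_000));             // 6000000
    }
}
```

Doubling keeps small buffers from being resized too often, while the fixed step avoids wasting large amounts of memory once a buffer is already several megabytes in size.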

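PooledByteBufAllocator (4.1.44)

Initialization phase

The pooled allocator PooledByteBufAllocator is initialized in two stages: the static initializer block and the constructor. The static block initializes the default parameters that will be needed, mainly:
[Figure: default parameters initialized in the static block]

The PooledByteBufAllocator constructor is not complicated. Its main jobs are:

Validating values: checking that configured values are in range, that the state is sane, and so on.
Initializing PoolThreadLocalCache.
Initializing the Arenas, both DirectArena and HeapArena. The Arena is a central concept of the jemalloc design. One PooledByteBufAllocator owns several Arenas in order to reduce contention and speed up allocation in multithreaded environments. For a detailed discussion of Arena, see TODO.
Configuring monitoring.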

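Interface implementation

PooledByteBufAllocator implements two abstract methods, both closely tied to creating ByteBuf objects:
[Figure: the two abstract methods implemented by PooledByteBufAllocator]

Summary (PooledByteBufAllocator)

PooledByteBufAllocator itself does not do very much. Mainly:

Initialization
- Static parameter initialization (including validation), such as DEFAULT_PAGE_SIZE (page size) and DEFAULT_MAX_ORDER (tree height). This is done in the static block.
- Instantiating the two arrays heapArenas and directArenas. Arenas handle memory allocation; the allocator delegates allocation to the appropriate Arena.
- Various sizes, such as smallCacheSize, normalCacheSize and chunkSize.
- Allocator metrics: List<PoolArenaMetric>.
- The thread-local cache PoolThreadLocalCache.
Memory allocation is delegated to Arena objects.
There is an important inner class, PoolThreadLocalCache, a thread-local cache that speeds up memory allocation.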
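How several Arenas reduce contention can be sketched as follows. This is a model under assumptions and not Netty's code: ArenaSelectionSketch and its methods are made-up names, and binding each thread to the Arena with the fewest bound threads is my understanding of what the thread-local cache does on first use. The chunkSize helper assumes the usual relation chunk size = page size << max order, which gives the 16 MiB mentioned earlier for a page size of 8192 and a max order of 11.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model, not Netty's code: bind each thread to the least used arena.
final class ArenaSelectionSketch {
    static final class Arena {
        final int id;
        final AtomicInteger threadCaches = new AtomicInteger(); // threads bound to this arena
        Arena(int id) {
            this.id = id;
        }
    }

    private final Arena[] arenas;
    private final ThreadLocal<Arena> bound;

    ArenaSelectionSketch(int nArenas) {
        arenas = new Arena[nArenas];
        for (int i = 0; i < nArenas; i++) {
            arenas[i] = new Arena(i);
        }
        // Evaluated once per thread, the first time that thread asks for its arena.
        bound = ThreadLocal.withInitial(this::leastUsedArena);
    }

    private synchronized Arena leastUsedArena() {
        Arena min = arenas[0];
        for (Arena a : arenas) {
            if (a.threadCaches.get() < min.threadCaches.get()) {
                min = a;
            }
        }
        min.threadCaches.incrementAndGet();
        return min;
    }

    Arena arenaForCurrentThread() {
        return bound.get();
    }

    // Helper for the demo: which arena does a brand-new thread get?
    int arenaIdOnNewThread() {
        int[] id = new int[1];
        Thread t = new Thread(() -> id[0] = arenaForCurrentThread().id);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return id[0];
    }

    static int chunkSize(int pageSize, int maxOrder) {
        return pageSize << maxOrder;
    }

    public static void main(String[] args) {
        ArenaSelectionSketch pool = new ArenaSelectionSketch(2);
        System.out.println(pool.arenaForCurrentThread().id); // 0
        System.out.println(pool.arenaIdOnNewThread());       // 1
        System.out.println(chunkSize(8192, 11));             // 16777216
    }
}
```

With threads spread over the Arenas this way, two threads only compete for the same lock when they happen to share an Arena.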

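UnpooledByteBufAllocator (4.1.44)

UnpooledByteBufAllocator is simpler than PooledByteBufAllocator: it has no complex memory-management fields or logic. It contains five inner classes whose names start with Instrumented. They were analysed earlier and are not repeated here. Which ByteBuf UnpooledByteBufAllocator chooses depends on:

Whether the platform supports Unsafe.
Whether a Cleaner is used.

Let us go straight to the source: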

// io.netty.buffer.UnpooledByteBufAllocator
public final class UnpooledByteBufAllocator 
    extends AbstractByteBufAllocator implements ByteBufAllocatorMetricProvider {

    private final UnpooledByteBufAllocatorMetric metric = new UnpooledByteBufAllocatorMetric();
    private final boolean disableLeakDetector;

    private final boolean noCleaner;
    
    public UnpooledByteBufAllocator(boolean preferDirect, boolean disableLeakDetector, boolean tryNoCleaner) {
        super(preferDirect);
        this.disableLeakDetector = disableLeakDetector;
        // Initialize the noCleaner flag
        noCleaner = tryNoCleaner && PlatformDependent.hasUnsafe()
                && PlatformDependent.hasDirectBufferNoCleanerConstructor();
    }

    /**
     * Returns an unpooled heap-memory ByteBuf instance
     */
    @Override
    protected ByteBuf newHeapBuffer(int initialCapacity, int maxCapacity) {
        // Create a different type of ByteBuf depending on whether the platform supports Unsafe
        return PlatformDependent.hasUnsafe() ?
                new InstrumentedUnpooledUnsafeHeapByteBuf(this, initialCapacity, maxCapacity) :
                new InstrumentedUnpooledHeapByteBuf(this, initialCapacity, maxCapacity);
    }

    /**
     * Returns an unpooled off-heap ByteBuf instance
     */
    @Override
    protected ByteBuf newDirectBuffer(int initialCapacity, int maxCapacity) {
        final ByteBuf buf;
        // For direct memory, besides Unsafe support, also check whether a Cleaner is used
        if (PlatformDependent.hasUnsafe()) {
            buf = noCleaner ? new InstrumentedUnpooledUnsafeNoCleanerDirectByteBuf(this, initialCapacity, maxCapacity) :
                    new InstrumentedUnpooledUnsafeDirectByteBuf(this, initialCapacity, maxCapacity);
        } else {
            buf = new InstrumentedUnpooledDirectByteBuf(this, initialCapacity, maxCapacity);
        }
        return disableLeakDetector ? buf : toLeakAwareBuffer(buf);
    }
}

As the source shows, two conditions steer the allocation strategy of UnpooledByteBufAllocator: Unsafe support and noCleaner. Each combination returns a different ByteBuf implementation.

Unpooled

Unpooled makes it easy to create an unpooled ByteBuf instance and can be thought of as a utility class. Internally it holds an UnpooledByteBufAllocator that it uses to allocate memory.


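One of Netty's zero-copy ideas is implemented by creating views. A view only manages its own independent indices and does not copy the underlying data, which reduces the number of in-memory copies. A view can therefore be created by wrapping. Note, however, that changes made to the underlying data through the wrapper are also visible to the source ByteBuf. To avoid that, make a copy instead.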
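The difference between a view and a copy can be demonstrated with java.nio.ByteBuffer, whose duplicate() has the same sharing semantics. The sketch uses only JDK calls. ViewVsCopySketch and its two helper methods are made-up names, and it illustrates the idea rather than Netty's own wrapping API.

```java
import java.nio.ByteBuffer;

// JDK-only illustration: a view shares the bytes, a copy does not.
final class ViewVsCopySketch {
    /** A view: shares the content of src but has its own position and limit. */
    static ByteBuffer view(ByteBuffer src) {
        return src.duplicate();
    }

    /** A copy: a new buffer holding its own bytes. */
    static ByteBuffer copy(ByteBuffer src) {
        ByteBuffer all = src.duplicate();
        all.clear();                                   // position = 0, limit = capacity
        ByteBuffer independent = ByteBuffer.allocate(all.remaining());
        independent.put(all);
        independent.flip();
        return independent;
    }

    public static void main(String[] args) {
        ByteBuffer source = ByteBuffer.wrap(new byte[] {1, 2, 3, 4});
        ByteBuffer shared = view(source);
        ByteBuffer independent = copy(source);

        shared.position(2);            // moves only the view's position
        shared.put(0, (byte) 9);       // visible through source
        independent.put(1, (byte) 7);  // not visible through source

        System.out.println(source.get(0));     // 9
        System.out.println(source.get(1));     // 2
        System.out.println(source.position()); // 0
    }
}
```

The view costs no data copy, at the price of shared mutations. The copy is isolated but pays for allocating and filling a second buffer.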

ByteBufUtil

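ByteBufUtil provides static helper methods for working with ByteBuf.

Method	Description
hexDump()	Returns the contents of a ByteBuf as a hexadecimal string
equals(ByteBuf, ByteBuf)	Tests whether two ByteBuf instances are equal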
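To make the hexadecimal representation concrete, here is a minimal JDK-only sketch of a hex dump over a byte[]. It is not ByteBufUtil's implementation: HexDumpSketch is a made-up class, and the lowercase output without separators is an assumption about the format.

```java
// Hypothetical helper, not ByteBufUtil: two lowercase hex digits per byte.
final class HexDumpSketch {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    static String hexDump(byte[] bytes, int fromIndex, int length) {
        if (fromIndex < 0 || length < 0 || fromIndex > bytes.length - length) {
            throw new IndexOutOfBoundsException(
                    "fromIndex: " + fromIndex + ", length: " + length);
        }
        char[] out = new char[length * 2];
        for (int i = 0; i < length; i++) {
            int b = bytes[fromIndex + i] & 0xFF;  // treat the byte as unsigned
            out[2 * i] = HEX[b >>> 4];            // high nibble
            out[2 * i + 1] = HEX[b & 0x0F];       // low nibble
        }
        return new String(out);
    }

    public static void main(String[] args) {
        byte[] data = {0x00, 0x0a, (byte) 0xff, 0x4e};
        System.out.println(hexDump(data, 0, data.length)); // 000aff4e
        System.out.println(hexDump(data, 1, 2));           // 0aff
    }
}
```

Such a dump is mostly useful in logs and in tests that compare buffer contents.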

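Reference counting revisited

The idea behind reference counting is not complicated: it keeps track of the number of active references to a particular object. An instance of a ReferenceCounted implementation normally starts with an active reference count of 1. As long as the reference count is greater than 0, the object is guaranteed not to be released. When the number of active references drops to 0, the instance is released.

Note: the exact meaning of "released" may be implementation-specific. The memory might be reclaimed immediately or some time later, for example. The semantics are clear on one point, though: an object that has been released must no longer be used.

Reference counting is essential for pooling, because it lowers the cost of memory allocation.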

Channel channel = ...;
ByteBufAllocator allocator = channel.alloc();
// ...
ByteBuf buffer = allocator.directBuffer();
int count = buffer.refCnt();

// Release the object
boolean released = buffer.release();
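The retain/release protocol described above can be modelled with an AtomicIntegerFieldUpdater over a volatile int. This is a simplified sketch and not Netty's implementation: RefCountedSketch is a made-up class, it throws IllegalStateException instead of a Netty-specific exception, and it stores the count directly rather than in whatever encoding the real code uses.

```java
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

// Hypothetical model, not Netty's code: lock-free reference counting with CAS loops.
class RefCountedSketch {
    private static final AtomicIntegerFieldUpdater<RefCountedSketch> UPDATER =
            AtomicIntegerFieldUpdater.newUpdater(RefCountedSketch.class, "refCnt");

    private volatile int refCnt = 1;        // a new object starts with one reference
    private volatile boolean deallocated;

    int refCnt() {
        return refCnt;
    }

    boolean isDeallocated() {
        return deallocated;
    }

    RefCountedSketch retain() {
        for (;;) {
            int current = refCnt;
            if (current <= 0) {
                throw new IllegalStateException("already released, refCnt: " + current);
            }
            if (current == Integer.MAX_VALUE) {
                throw new IllegalStateException("refCnt overflow");
            }
            if (UPDATER.compareAndSet(this, current, current + 1)) {
                return this;
            }
        }
    }

    /** Returns true only for the call that drops the count to 0 and frees the object. */
    boolean release() {
        for (;;) {
            int current = refCnt;
            if (current <= 0) {
                throw new IllegalStateException("already released, refCnt: " + current);
            }
            if (UPDATER.compareAndSet(this, current, current - 1)) {
                if (current == 1) {
                    deallocate();
                    return true;
                }
                return false;
            }
        }
    }

    protected void deallocate() {
        deallocated = true;                 // a real buffer would return its memory here
    }

    public static void main(String[] args) {
        RefCountedSketch buf = new RefCountedSketch();
        buf.retain();
        System.out.println(buf.refCnt());   // 2
        System.out.println(buf.release());  // false
        System.out.println(buf.release());  // true
    }
}
```

Each retain() must be matched by exactly one release(). Calling either of them after the count has reached 0 fails fast, which matches the rule that a released object must not be used again.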

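Summary

That is as far as this walkthrough of Netty's ByteBuf hierarchy goes. Many parts were left out, CompositeByteBuf for example, though the underlying principles are not complicated. I hope this article gives you a picture of ByteBuf as a whole system, which helps a great deal when building high-performance applications on Netty.
Netty introduces reference counting to optimise memory usage and performance, and reading the source shows how far Netty goes in its pursuit of performance. The idea of reference counting is not complicated and Netty's implementation is very efficient. What I still need to strengthen is my understanding of the concurrency involved. Once I understand concurrency more deeply, coming back to this article should bring more insight. For now I only know from the source that the authors wrote it this way, not how the design evolved, much like a textbook that states a conclusion without giving the derivation.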
