Okio read/write stream source code walkthrough (Part 2: the buffered BufferedSource read path in detail)

飞雨的夏天

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/xiatiandefeiyu/article/details/78004051

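This article takes a close look at the design patterns used in the Okio library and at how its core component RealBufferedSource works, covering the data read path, the buffering mechanism, and the use of the flyweight pattern.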

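The main design patterns in Okio are the decorator pattern, the flyweight pattern, and the template method pattern. The template method pattern in particular runs through practically all code, whoever wrote it.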

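// filePath is the path of the text file to read
Source source = Okio.source(new File(filePath));
BufferedSource buffer = Okio.buffer(source);
// First line, without its line terminator (null once the source is exhausted)
String h1 = buffer.readUtf8Line();
// Next line; throws EOFException if no line break is found within the 60-byte limit
String w = buffer.readUtf8LineStrict(60);
System.out.println(w);
System.out.println(h1);
// Everything that remains, decoded as UTF-8
String h = buffer.readString(Charset.forName("UTF-8"));
System.out.println(h);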

Step into Okio.buffer:

  public static BufferedSource buffer(Source source) {
    return new RealBufferedSource(source);
  }

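The class that actually implements the buffered stream is RealBufferedSource. It holds a field public final Buffer buffer = new Buffer(); and this Buffer is what really does the caching. Buffer keeps a reference to the head of a linked list, Segment head; the Segment objects in that list are what actually store the bytes. Each Segment holds at most static final int SIZE = 8192 bytes. When the tail Segment is full, Buffer appends another one, taking it from SegmentPool if the pool has one to spare and creating a new one otherwise.

SegmentPool exists only to hold spare Segments, so that large numbers of objects are not created and collected within a short time. That means fewer GC runs and a faster program. This is the flyweight pattern at work. It is also the payoff of reading source code: you see how experienced authors optimise, you imitate it, and eventually it becomes your own knowledge.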

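A Segment ultimately stores its data in a byte[] array, much as array-backed collections such as ArrayList do. A full Segment is never grown; writing simply continues in the next Segment, which avoids the array copying and resizing that a single growing buffer would need. Very neat.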
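To make the pooling idea concrete, here is a minimal sketch of a pool of fixed-size byte blocks. It is not Okio's SegmentPool: the class name BlockPoolSketch, the ArrayDeque storage, and the cap of 8 pooled blocks are all assumptions chosen for illustration. Only the 8192-byte block size comes from the text above.

```java
import java.util.ArrayDeque;

/** Hypothetical, simplified pool of fixed-size byte blocks; not Okio's SegmentPool. */
final class BlockPoolSketch {
    static final int BLOCK_SIZE = 8192; // same figure as Segment.SIZE in the article
    static final int MAX_POOLED = 8;    // assumed cap, chosen for illustration

    private final ArrayDeque<byte[]> free = new ArrayDeque<>();

    /** Hands out a pooled block if one is available, otherwise allocates a new one. */
    synchronized byte[] take() {
        byte[] block = free.poll();
        return block != null ? block : new byte[BLOCK_SIZE];
    }

    /** Returns a block to the pool; the block is simply dropped when the pool is full. */
    synchronized void recycle(byte[] block) {
        if (block.length == BLOCK_SIZE && free.size() < MAX_POOLED) {
            free.push(block);
        }
    }

    /** Number of blocks currently waiting in the pool. */
    synchronized int pooled() {
        return free.size();
    }

    public static void main(String[] args) {
        BlockPoolSketch pool = new BlockPoolSketch();
        byte[] first = pool.take();   // pool is empty, so this allocates
        pool.recycle(first);          // give it back
        byte[] second = pool.take();  // reuses the same array, no new allocation
        System.out.println(first == second);
    }
}
```

The cap matters: without it, a burst of traffic would leave the pool holding memory indefinitely. That is why recycle silently drops blocks once the pool is full.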

Next, look at how RealBufferedSource reads:

  @Override public String readString(Charset charset) throws IOException {
    if (charset == null) throw new IllegalArgumentException("charset == null");

    buffer.writeAll(source);
    return buffer.readString(charset);
  }

A Charset must be supplied when reading a string.

Step into buffer.writeAll(source), which hands the wrapped source to the Buffer class:

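/**
 * Reads from source until it is exhausted, appending the bytes to the segment
 * list, and returns the total number of bytes read.
 */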
   public long writeAll(Source source) throws IOException {
    if (source == null) throw new IllegalArgumentException("source == null");
    long totalBytesRead = 0;
    for (long readCount; (readCount = source.read(this, Segment.SIZE)) != -1; ) {
      totalBytesRead += readCount;
    }
    return totalBytesRead;
  }

This method reads every byte of the file into the Buffer's Segment list. The call source.read(this, Segment.SIZE) is implemented in the Okio class:

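  // Enclosing method as in Okio 1.x; its null checks on in and timeout are omitted here
  private static Source source(final InputStream in, final Timeout timeout) {
    return new Source() {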
      @Override public long read(Buffer sink, long byteCount) throws IOException {
        if (byteCount < 0) throw new IllegalArgumentException("byteCount < 0: " + byteCount);
        if (byteCount == 0) return 0;
        try {
          timeout.throwIfReached();
          Segment tail = sink.writableSegment(1);
          // Never copy more than the tail Segment can still hold
          int maxToCopy = (int) Math.min(byteCount, Segment.SIZE - tail.limit);
          int bytesRead = in.read(tail.data, tail.limit, maxToCopy);
          if (bytesRead == -1) return -1;
          tail.limit += bytesRead;
          sink.size += bytesRead;
          return bytesRead;
        } catch (AssertionError e) {
          if (isAndroidGetsocknameError(e)) throw new IOException(e);
          throw e;
        }
      }

      @Override public void close() throws IOException {
        in.close();
      }

      @Override public Timeout timeout() {
        return timeout;
      }

      @Override public String toString() {
        return "source(" + in + ")";
      }
    };
  }

Here in is the FileInputStream opened for the file.
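The writeAll loop and the capped read can be tried without Okio on the classpath. The following is a minimal sketch under stated assumptions: ChunkedReadSketch, Chunk, and readInto are hypothetical names, and an ArrayList of chunks stands in for the Segment list. Only the 8192-byte chunk size and the "read until -1" shape are taken from the code above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for Buffer plus the Source from Okio.source; not the library's code. */
final class ChunkedReadSketch {
    static final int CHUNK_SIZE = 8192; // mirrors Segment.SIZE

    static final class Chunk {
        final byte[] data = new byte[CHUNK_SIZE];
        int limit; // index of the first unwritten byte
    }

    final List<Chunk> chunks = new ArrayList<>();
    long size; // total bytes currently stored

    /** Reads at most byteCount bytes into the tail chunk; returns -1 at end of stream. */
    long readInto(InputStream in, long byteCount) throws IOException {
        if (byteCount < 1) throw new IllegalArgumentException("byteCount < 1: " + byteCount);
        Chunk tail = chunks.isEmpty() ? null : chunks.get(chunks.size() - 1);
        boolean added = false;
        if (tail == null || tail.limit == CHUNK_SIZE) {
            tail = new Chunk(); // the tail is full (or missing), so start a new chunk
            chunks.add(tail);
            added = true;
        }
        // Never copy more than the tail chunk can still hold
        int maxToCopy = (int) Math.min(byteCount, CHUNK_SIZE - tail.limit);
        int bytesRead = in.read(tail.data, tail.limit, maxToCopy);
        if (bytesRead == -1) {
            if (added) chunks.remove(chunks.size() - 1); // do not keep an empty chunk around
            return -1;
        }
        tail.limit += bytesRead;
        size += bytesRead;
        return bytesRead;
    }

    /** Same loop shape as writeAll: keep reading until the stream reports -1. */
    long writeAll(InputStream in) throws IOException {
        long total = 0;
        for (long n; (n = readInto(in, CHUNK_SIZE)) != -1; ) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        ChunkedReadSketch sketch = new ChunkedReadSketch();
        long total = sketch.writeAll(new ByteArrayInputStream(new byte[20000]));
        System.out.println(total + " bytes in " + sketch.chunks.size() + " chunks");
    }
}
```

Because each read is capped at the free space in the tail chunk, a single call never spills across two chunks. The loop in writeAll is what moves on to the next chunk.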

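Once all the data has been read, buffer.readString(charset) starts pulling it back out of the buffer. It has the same shape as readUtf8() shown here; both delegate to readString(size, charset):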

 public String readUtf8() {
    try {
      return readString(size, Util.UTF_8);
    } catch (EOFException e) {
      throw new AssertionError(e);
    }
  }

size records how many bytes the buffer currently holds.

  public String readString(long byteCount, Charset charset) throws EOFException {
    // Validate byteCount against size
    checkOffsetAndCount(size, 0, byteCount);
    if (charset == null) throw new IllegalArgumentException("charset == null");
    if (byteCount > Integer.MAX_VALUE) {
      throw new IllegalArgumentException("byteCount > Integer.MAX_VALUE: " + byteCount);
    }
    if (byteCount == 0) return "";

    Segment s = head;
    // If head alone holds all the requested bytes, decode straight from it; otherwise loop over the segments
    if (s.pos + byteCount > s.limit) {
      // readByteArray gathers the bytes across segments
      return new String(readByteArray(byteCount), charset);
    }
    // Decode directly from head
    String result = new String(s.data, s.pos, (int) byteCount, charset);
    // Advance pos by the number of bytes consumed
    s.pos += byteCount;
    // size now equals the number of bytes still buffered
    size -= byteCount;
    // Once a Segment is used up, unlink it from the list and return it to SegmentPool (if the pool still has room)
    if (s.pos == s.limit) {
      head = s.pop();
      // Hand it back to the pool
      SegmentPool.recycle(s);
    }

    return result;
  }

Step into readByteArray to see how the data is read out across the whole list:

 public byte[] readByteArray(long byteCount) throws EOFException {
    checkOffsetAndCount(size, 0, byteCount);
    if (byteCount > Integer.MAX_VALUE) {
      throw new IllegalArgumentException("byteCount > Integer.MAX_VALUE: " + byteCount);
    }
    // Allocate a byte array of exactly byteCount bytes
    byte[] result = new byte[(int) byteCount];
    // The real work happens in readFully
    readFully(result);
    return result;
  }

The method that finally reads all the data is readFully(result):

  /**
   * Reads from the segment list until sink is full.
   */
   public void readFully(byte[] sink) throws EOFException {
    int offset = 0;
    while (offset < sink.length) {
      int read = read(sink, offset, sink.length - offset);
      if (read == -1) throw new EOFException();
      offset += read;
    }
  }

It loops until sink has been filled from the list; each pass calls read:

/**
 * Copies up to byteCount bytes from the head Segment into sink.
 */
   public int read(byte[] sink, int offset, int byteCount) {
    checkOffsetAndCount(sink.length, offset, byteCount);

    Segment s = head;
    if (s == null) return -1;
    int toCopy = Math.min(byteCount, s.limit - s.pos);
    System.arraycopy(s.data, s.pos, sink, offset, toCopy);

    s.pos += toCopy;
    size -= toCopy;

    if (s.pos == s.limit) {
      head = s.pop();
      SegmentPool.recycle(s);
    }

    return toCopy;
  }

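Each Segment is recycled as soon as it has been read completely. Once the loop finishes, the byte array is turned into a String and returned, which completes the readString flow.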
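The readByteArray, readFully, and read trio can be sketched without Okio as follows. This is an illustration under assumptions, not the library's code: DrainSketch and Chunk are hypothetical names, an ArrayDeque replaces the circular Segment list, and the write helper takes an artificially small chunk size so that a read has to cross chunk boundaries.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Arrays;

/** Hypothetical chunked buffer showing the read loop; not Okio's Buffer. */
final class DrainSketch {
    static final class Chunk {
        final byte[] data;
        int pos;         // next byte to read
        final int limit; // one past the last readable byte
        Chunk(byte[] data) { this.data = data; this.limit = data.length; }
    }

    private final ArrayDeque<Chunk> chunks = new ArrayDeque<>();
    long size; // bytes still buffered

    /** Splits bytes into chunks of chunkSize so that reads must cross chunk boundaries. */
    void write(byte[] bytes, int chunkSize) {
        for (int start = 0; start < bytes.length; start += chunkSize) {
            int end = Math.min(bytes.length, start + chunkSize);
            chunks.addLast(new Chunk(Arrays.copyOfRange(bytes, start, end)));
        }
        size += bytes.length;
    }

    /** Copies up to byteCount bytes from the head chunk only; -1 when nothing is left. */
    int read(byte[] sink, int offset, int byteCount) {
        Chunk head = chunks.peekFirst();
        if (head == null) return -1;
        int toCopy = Math.min(byteCount, head.limit - head.pos);
        System.arraycopy(head.data, head.pos, sink, offset, toCopy);
        head.pos += toCopy;
        size -= toCopy;
        if (head.pos == head.limit) chunks.pollFirst(); // head is used up: unlink it
        return toCopy;
    }

    /** Loops over read until exactly byteCount bytes have been collected. */
    byte[] readByteArray(int byteCount) {
        if (byteCount < 0 || byteCount > size) {
            throw new IllegalArgumentException("size=" + size + " byteCount=" + byteCount);
        }
        byte[] result = new byte[byteCount];
        int offset = 0;
        while (offset < result.length) {
            int n = read(result, offset, result.length - offset);
            if (n == -1) throw new IllegalStateException("ran out of data");
            offset += n;
        }
        return result;
    }

    int chunkCount() { return chunks.size(); }

    public static void main(String[] args) {
        DrainSketch buffer = new DrainSketch();
        buffer.write("hello world".getBytes(StandardCharsets.UTF_8), 4);
        System.out.println(new String(buffer.readByteArray(7), StandardCharsets.UTF_8));
    }
}
```

With a chunk size of 4, the seven requested bytes come from two chunks: the first is emptied and unlinked, the second is left with its pos advanced. That is the state Buffer.read leaves behind after a partial read.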

Next comes the logic for reading one line. It is a little more involved and worth a look, since it touches several fundamentals. Start from buffer.readUtf8Line():

 @Override public  String readUtf8Line() throws IOException {
    // Number of bytes before the first '\n'
    long newline = indexOf((byte) '\n');
    // No line break: return everything that is left, or null if nothing is left
    if (newline == -1) {
      return buffer.size != 0 ? readUtf8(buffer.size) : null;
    }

    return buffer.readUtf8Line(newline);
  }

The key call here is indexOf((byte) '\n'), which reports how many bytes precede the first \n in the list. readUtf8Line(newline) then simply takes that many bytes, which is the same flow as above.
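The "find the newline, then cut" behaviour can be illustrated on a plain byte array. This is a minimal sketch with hypothetical names (LineReadSketch, a single backing array instead of Segments). It follows the documented contract of readUtf8Line: the returned line excludes a trailing \n or \r\n, unterminated trailing data is returned as the last line, and null signals that nothing is left.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical line reader over one byte array; not Okio's implementation. */
final class LineReadSketch {
    private final byte[] data;
    private int pos; // next unread byte

    LineReadSketch(byte[] data) {
        this.data = data;
    }

    /** Number of bytes between pos and the first occurrence of b, or -1 if absent. */
    long indexOf(byte b) {
        for (int i = pos; i < data.length; i++) {
            if (data[i] == b) return i - pos;
        }
        return -1L;
    }

    /** Returns the next line without its terminator, or null when nothing is left. */
    String readUtf8Line() {
        long newline = indexOf((byte) '\n');
        if (newline == -1) {
            if (pos == data.length) return null; // exhausted
            String rest = new String(data, pos, data.length - pos, StandardCharsets.UTF_8);
            pos = data.length;                   // the tail without a line break is the last line
            return rest;
        }
        int end = pos + (int) newline;           // index of the '\n'
        int textEnd = (end > pos && data[end - 1] == '\r') ? end - 1 : end; // drop "\r" of "\r\n"
        String line = new String(data, pos, textEnd - pos, StandardCharsets.UTF_8);
        pos = end + 1;                           // consume the line break as well
        return line;
    }

    public static void main(String[] args) {
        LineReadSketch reader =
            new LineReadSketch("a\r\nbb\n\nlast".getBytes(StandardCharsets.UTF_8));
        for (String line; (line = reader.readUtf8Line()) != null; ) {
            System.out.println("[" + line + "]");
        }
    }
}
```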

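/**
 * Finds the index of the first occurrence of b between fromIndex and toIndex,
 * reading more from the source whenever the buffer does not contain it yet.
 */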
  @Override public long indexOf(byte b, long fromIndex, long toIndex) throws IOException {
    if (closed) throw new IllegalStateException("closed");
    // Validate the index range
    if (fromIndex < 0 || toIndex < fromIndex) {
      throw new IllegalArgumentException(
          String.format("fromIndex=%s toIndex=%s", fromIndex, toIndex));
    }

    while (fromIndex < toIndex) {
      long result = buffer.indexOf(b, fromIndex, toIndex);
      if (result != -1L) return result;

      // The byte wasn't in the buffer. Give up if we've already reached our target size or if the
      // underlying stream is exhausted.
      long lastBufferSize = buffer.size;
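      // Stop if the buffer already reaches toIndex, or if the source has no more data.
      // Otherwise source.read has just pulled more bytes into the buffer (on the first
      // pass the buffer is still empty, so this is where the first read happens).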
      if (lastBufferSize >= toIndex || source.read(buffer, Segment.SIZE) == -1) return -1L;

      // Continue the search from where we left off.
      // Resume at whichever is larger: fromIndex or the previous buffer size
      fromIndex = Math.max(fromIndex, lastBufferSize);
    }
    return -1L;
  }
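The pattern in this method is "search what is buffered, and on a miss pull in one more chunk and resume where the search stopped". A minimal sketch follows, assuming hypothetical names (RefillSearchSketch, refill) and, for brevity, a single growing byte array in place of the Segment list. That is precisely the copying Okio avoids, so treat it purely as an illustration of the control flow.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

/** Hypothetical lazy search over a stream; the growing array is a simplification. */
final class RefillSearchSketch {
    private final InputStream source;
    private final int chunkSize;
    private byte[] buffered = new byte[0];

    RefillSearchSketch(InputStream source, int chunkSize) {
        if (chunkSize < 1) throw new IllegalArgumentException("chunkSize < 1");
        this.source = source;
        this.chunkSize = chunkSize;
    }

    /** Index of the first b, reading more only when the buffered bytes do not contain it. */
    long indexOf(byte b) throws IOException {
        int fromIndex = 0;
        while (true) {
            for (int i = fromIndex; i < buffered.length; i++) {
                if (buffered[i] == b) return i;
            }
            // Not buffered yet: remember how far we got, then try to read one more chunk
            int lastBufferSize = buffered.length;
            if (!refill()) return -1L; // source exhausted
            // Continue from where the previous scan stopped instead of rescanning
            fromIndex = Math.max(fromIndex, lastBufferSize);
        }
    }

    /** Appends up to chunkSize bytes from the source; false at end of stream. */
    private boolean refill() throws IOException {
        byte[] chunk = new byte[chunkSize];
        int n = source.read(chunk, 0, chunkSize);
        if (n == -1) return false;
        int old = buffered.length;
        buffered = Arrays.copyOf(buffered, old + n);
        System.arraycopy(chunk, 0, buffered, old, n);
        return true;
    }

    int bufferedSize() {
        return buffered.length;
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = "abcdefg\nhi".getBytes();
        RefillSearchSketch sketch = new RefillSearchSketch(new ByteArrayInputStream(bytes), 4);
        System.out.println(sketch.indexOf((byte) '\n') + " found after buffering "
            + sketch.bufferedSize() + " bytes");
    }
}
```

The search is lazy: it stops reading as soon as the byte turns up, so the bytes after the newline stay in the stream until something asks for them.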

The actual search for \n is done inside Buffer, in buffer.indexOf(b, fromIndex, toIndex):

 public long indexOf(byte b, long fromIndex, long toIndex) {
    if (fromIndex < 0 || toIndex < fromIndex) {
      throw new IllegalArgumentException(
          String.format("size=%s fromIndex=%s toIndex=%s", size, fromIndex, toIndex));
    }
    // toIndex cannot exceed the number of bytes currently in the buffer
    if (toIndex > size) toIndex = size;
    if (fromIndex == toIndex) return -1L;

    Segment s;
    long offset;

    // TODO(jwilson): extract this to a shared helper method when can do so without allocating.
    findSegmentAndOffset: {
      // Pick the first segment to scan. This is the first segment with offset <= fromIndex.
      s = head;
      if (s == null) {
        // No segments to scan!
        return -1L;
      }
      // size - fromIndex < fromIndex means fromIndex lies in the back half of the buffer,
      // so walk backwards from the tail to save time
      else if (size - fromIndex < fromIndex) {
        // We're scanning in the back half of this buffer. Find the segment starting at the back.
        offset = size;
        while (offset > fromIndex) {
          // The list is circular: head.prev is the tail, so this walks from back to front
          s = s.prev;
          offset -= (s.limit - s.pos);
        }
      }
      else {
        // We're scanning in the front half of this buffer. Find the segment starting at the front.
        offset = 0L;
        // Advance, accumulating the offset, until the Segment whose range contains fromIndex is reached
        for (long nextOffset; (nextOffset = offset + (s.limit - s.pos)) < fromIndex; ) {
          s = s.next;
          offset = nextOffset;
        }
      }
    }

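This method is clever. It first checks whether the start position falls in the first or the second half of size: in the second half it walks the list from the tail, in the first half from the head. The excerpt ends once the starting Segment has been located. One more optimisation, and one more reminder that reading source code really does teach a lot.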
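The choice of direction can be isolated in a few lines. The sketch below uses hypothetical names (StartSegmentSketch, Node, ring, locate) and nodes that carry only a length, which is enough to reproduce the two loops from the listing and to count how many hops each direction costs.

```java
/** Hypothetical circular list of segment lengths; only the start-segment lookup is modelled. */
final class StartSegmentSketch {
    static final class Node {
        final int length; // bytes held by this segment, i.e. limit - pos
        Node prev;
        Node next;
        Node(int length) { this.length = length; }
    }

    /** Builds a circular doubly linked list from at least one length and returns its head. */
    static Node ring(int... lengths) {
        int n = lengths.length;
        Node[] nodes = new Node[n];
        for (int i = 0; i < n; i++) nodes[i] = new Node(lengths[i]);
        for (int i = 0; i < n; i++) {
            nodes[i].next = nodes[(i + 1) % n];
            nodes[i].prev = nodes[(i - 1 + n) % n]; // head.prev is the tail
        }
        return nodes[0];
    }

    /** Returns { start offset of the segment to scan first, number of hops taken }. */
    static long[] locate(Node head, long size, long fromIndex) {
        Node s = head;
        long offset;
        long hops = 0;
        if (size - fromIndex < fromIndex) {
            // Back half: start at the end and step backwards through prev
            offset = size;
            while (offset > fromIndex) {
                s = s.prev;
                offset -= s.length;
                hops++;
            }
        } else {
            // Front half: start at the head and step forwards through next
            offset = 0L;
            for (long nextOffset; (nextOffset = offset + s.length) < fromIndex; ) {
                s = s.next;
                offset = nextOffset;
                hops++;
            }
        }
        return new long[] { offset, hops };
    }

    public static void main(String[] args) {
        Node head = ring(10, 10, 10, 10); // 40 bytes spread over four segments
        long[] back = locate(head, 40, 35);
        long[] front = locate(head, 40, 5);
        System.out.println("from 35: offset " + back[0] + ", hops " + back[1]);
        System.out.println("from 5: offset " + front[0] + ", hops " + front[1]);
    }
}
```

Whichever half fromIndex falls in, the walk visits at most about half of the segments. That is the saving the paragraph above describes.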