BufferedReader Source Code Notes

License: CC 4.0 BY-SA

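This article walks through the source code of BufferedReader, focusing on how mark and reset are implemented and how the mark and the rollback length limit are handled. The analysis shows how BufferedReader uses an internal buffer to make reading more efficient, and how it reads data in different situations.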

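How to put it: while reading other people's source code I noticed that BufferedReader's mark(int readAheadLimit) and reset() methods come up a lot, yet whenever I used them myself I kept running into one problem or another, so I decided to just read the source. Guessing the intent of uncommented source is a real chore. Maybe I have not read enough JDK 1.1-era source, or maybe my own ability is limited, but it took me a long time to fully understand BufferedReader. Along the way it was also a good refresher on the Decorator pattern.

This article was prompted by the post "BufferedReader之mark与reset初探" (a first look at mark and reset in BufferedReader), which is what sent me to the source in the first place. Recommended.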

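The JDK documentation does not list many methods for BufferedReader. In detail:

1. mark(int readAheadLimit): marks the position of the next character to be read. Note that readAheadLimit is not the index of the marked character; it is the limit on how many characters can be read while still being able to roll back.

2. markSupported(): hard-coded in the source to return true.

3. read(): reads a single character.

4. read(char[] cbuf, int off, int len): reads characters into a portion of an array.

5. readLine(): reads a line of text.

6. reset(): resets the stream to the mark; used together with mark(int readAheadLimit).

7. skip(long n): skips characters.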
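As a quick illustration of items 1 and 6, here is a minimal sketch of mark and reset working together. The class name MarkResetDemo and the helper readTwice are made up for this example; only the BufferedReader and StringReader calls come from the JDK.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class MarkResetDemo {
    /** Reads three characters, rolls back to the mark, and reads three again. */
    static String readTwice(String text) throws IOException {
        StringBuilder out = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            br.mark(10);                 // allow up to 10 characters of read-ahead
            for (int i = 0; i < 3; i++) {
                out.append((char) br.read());
            }
            br.reset();                  // nextChar goes back to the marked position
            for (int i = 0; i < 3; i++) {
                out.append((char) br.read());
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readTwice("abcdef")); // prints "abcabc"
    }
}
```

The argument 10 is not an index into the text; it only bounds how far reading may go before the mark can be dropped.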

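What BufferedReader is for:

It reads text from a character input stream and buffers the characters, so that characters, arrays, and lines can be read efficiently.

The buffer size can be specified, or the default size can be used. In most cases the default is large enough.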
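To make the effect of buffering visible, the sketch below counts how many bulk reads actually reach the underlying reader while the caller reads one character at a time. CountingReader and countUnderlyingReads are hypothetical helpers written for this example, not part of the JDK.

```java
import java.io.BufferedReader;
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class BufferingDemo {
    /** Hypothetical helper: counts bulk read calls that reach the wrapped reader. */
    static class CountingReader extends FilterReader {
        int bulkReads = 0;

        CountingReader(Reader in) {
            super(in);
        }

        @Override
        public int read(char[] cbuf, int off, int len) throws IOException {
            bulkReads++;
            return super.read(cbuf, off, len);
        }
    }

    /** Reads text one character at a time through a buffer of bufferSize chars. */
    static int countUnderlyingReads(String text, int bufferSize) throws IOException {
        CountingReader counting = new CountingReader(new StringReader(text));
        try (BufferedReader br = new BufferedReader(counting, bufferSize)) {
            while (br.read() != -1) {
                // discard the character; only the number of underlying reads matters
            }
        }
        return counting.bulkReads;
    }

    public static void main(String[] args) throws IOException {
        // 10 characters, buffer of 4: refills fetch 4, 4, 2, then hit end of stream
        System.out.println(countUnderlyingReads("abcdefghij", 4)); // prints 4
    }
}
```

Eleven calls to read() on the BufferedReader turn into four calls on the wrapped reader; with a larger buffer the count drops further.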

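To buffer characters, the BufferedReader constructor allocates a char[] cb of the specified length. Characters to be read are first stored in this cb[] buffer so that they can be served efficiently. So how does BufferedReader maintain cb[]? For that we need to study one of its private methods, fill(). The code follows, with my comments added.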

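    /**
     * Fill the input buffer, taking the mark into account if it is valid.
     * Reads from the underlying stream (in) according to the length of cb[].
     * If in holds more than cb.length characters, it is read in chunks: with
     * no mark set (markedChar <= UNMARKED) each call reads up to cb.length
     * characters until the stream is exhausted; once a mark is set, each call
     * reads up to cb.length minus the number of preserved marked characters
     * (delta = nextChar - markedChar).
     */
    private void fill() throws IOException {
        /* dst: the index in cb at which data read from in starts being stored */
        int dst;
        if (markedChar <= UNMARKED) {
            /* No mark */
            /* No mark: store from index 0 */
            dst = 0;
        } else {
            /* Marked */
            /* With a mark set, fill() must preserve the buffer contents from the
             * marked character onward so that reset() can roll back. Otherwise the
             * next chunk would overwrite the old data in cb[], and although reset()
             * could restore nextChar to the marked index, cb[nextChar] would no
             * longer hold the original character. */
            /* delta: the actual rollback distance, i.e. the number of characters
             * that must be preserved */
            int delta = nextChar - markedChar;
            /* readAheadLimit: the configured maximum rollback distance */
            if (delta >= readAheadLimit) {
                /* Gone past read-ahead limit: Invalidate mark */
                /* The actual rollback distance has reached the configured limit:
                 * the mark becomes invalid and cb[] is refilled from index 0 with
                 * the next chunk */
                markedChar = INVALIDATED;
                readAheadLimit = 0;
                dst = 0;
            } else {
                if (readAheadLimit <= cb.length) {
                    /* Shuffle in the current buffer */
                    /* The limit fits in cb[]: move the delta preserved characters
                     * to the front of cb[]; the rest of cb[] is filled with the
                     * next chunk from in */
                    System.arraycopy(cb, markedChar, cb, 0, delta);
                    markedChar = 0;
                    dst = delta;
                } else {
                    /* Reallocate buffer to accommodate read-ahead limit */
                    /* The limit is larger than cb.length: allocate a new char[] of
                     * length readAheadLimit, copy the preserved characters to its
                     * front, and fill the rest with the next chunk from in */
                    char ncb[] = new char[readAheadLimit];
                    System.arraycopy(cb, markedChar, ncb, 0, delta);
                    cb = ncb;
                    markedChar = 0;
                    dst = delta;
                }
                /* The marked characters are preserved, but until reset() is called
                 * reading continues at the newly fetched data, i.e. at index delta.
                 * Setting nextChar = nChars = delta means that, before any new data
                 * arrives, the read position is already at the end of the valid
                 * data. */
                nextChar = nChars = delta;
            }
        }

        int n;
        do {
            // dst == 0: all of cb[] is refreshed; dst > 0: dst is the number of
            // marked characters preserved at the front
            n = in.read(cb, dst, cb.length - dst);
        } while (n == 0);
        if (n > 0) {
            nChars = dst + n;
            nextChar = dst;
        }
    }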

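The flow in pseudocode:

1. Check whether the data in the BufferedReader's buffer has been marked.

2. If it has not been marked, go to step 8.

3. If it has been marked, compute delta, the difference between the index of the next character to read and the index of the marked character. This difference is how far reset() would roll back.

4. Check whether delta has reached readAheadLimit, the maximum rollback length set by mark(int readAheadLimit). If it has (delta >= readAheadLimit), invalidate the mark and go to step 8.

5. Check whether readAheadLimit exceeds the buffer length cb.length. If not, go to step 6; if so, go to step 7.

6. Keep the delta characters starting at the marked index and move them to the very front of cb, then go to step 8.

7. Allocate a buffer of length readAheadLimit, keep the delta characters starting at the marked index by copying them to the very front of the new buffer, and assign the new buffer to cb.

8. If there is no mark, or the mark is invalid, nothing in cb needs to be kept, so data read from in is stored from the start of cb. If there is a valid mark, step 6 or step 7 has already placed the data that must be kept at the front of cb, so data read from in can only go after it. dst is the variable that controls where in cb the new data is stored.

9. Look at the length of the new data read from in. If it is -1, the source is exhausted; otherwise set nChars (the length of the valid data in cb) to dst + n and set nextChar to dst.

10. Done.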
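The steps above can be observed from outside the class. The sketch below is only a probe, not a re-implementation: MarkLimitDemo and readThenReset are hypothetical names, and the 16-character input and 5-character buffer are arbitrary choices for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class MarkLimitDemo {
    /**
     * Marks the start of a 16-character stream behind a 5-char buffer, reads
     * readCount characters, then tries reset(). Returns the character read
     * after a successful reset, or "invalid" if reset() throws.
     */
    static String readThenReset(int readAheadLimit, int readCount) throws IOException {
        try (BufferedReader br =
                 new BufferedReader(new StringReader("abcdefghijklmnop"), 5)) {
            br.mark(readAheadLimit);
            for (int i = 0; i < readCount; i++) {
                br.read();
            }
            try {
                br.reset();
            } catch (IOException e) {
                return "invalid";        // the mark was invalidated inside fill()
            }
            return String.valueOf((char) br.read());
        }
    }

    public static void main(String[] args) throws IOException {
        // Step 4: the 6th read triggers a refill with delta = 5 >= 3.
        System.out.println(readThenReset(3, 6));   // prints "invalid"
        // No refill happens within the first 5 reads, so the limit is not checked.
        System.out.println(readThenReset(3, 5));   // prints "a"
        // Step 7: limit 10 > buffer 5, so the buffer is reallocated to 10 chars.
        System.out.println(readThenReset(10, 8));  // prints "a"
    }
}
```

The second case is worth noticing: the limit is only enforced when fill() runs, so a reset can still succeed after reading more than readAheadLimit characters, as long as no refill happened in between. That matches the documentation's wording that reset "may fail" past the limit.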

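PS: While writing this post, a thought occurred to me: what if the readAheadLimit parameter of mark did not limit the rollback length, that is, what if fill() did not contain this check?

    if (delta >= readAheadLimit) {
        /* Gone past read-ahead limit: Invalidate mark */
        /* The actual rollback distance has reached the configured limit: the mark
         * becomes invalid and cb[] is refilled from index 0 with the next chunk */
        markedChar = INVALIDATED;
        readAheadLimit = 0;
        dst = 0;
    }

(You can copy the source out, comment out this check, and then run the statements below.) Consider this scenario:

    // Needs java.io.BufferedReader and java.io.StringReader; read() throws IOException.
    String testStr = "abcdefghijklmnob";
    BufferedReader br = new BufferedReader(new StringReader(testStr), 5);
    br.mark(5);
    int c;
    while ((c = br.read()) != -1) {
        System.out.println((char) c);
    }

With the check removed, these statements loop forever. The mark sits at the very front of the buffer and the rollback length equals the buffer length, so in step 6 of the flow above cb has no free space left for new data. n = in.read(cb, dst, cb.length - dst) is then called with a length of 0 and returns 0 every time, and the do-while never exits. So in my view the most important job of readAheadLimit is to guarantee that whenever fill() is called, the buffer has room left for new content. If you really need to roll back as far as the buffer is long, pass a readAheadLimit larger than the buffer: fill() then allocates a bigger buffer so that there is space for new data. Otherwise, even without the infinite loop, you would not be able to read anything new.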
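The root cause can be reproduced without patching the JDK. The following sketch (class and method names are made up for illustration) asks a StringReader for zero characters, which is what fill() would do with a full buffer, and then runs the same scenario on the unmodified BufferedReader to confirm that it terminates.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ZeroLengthReadDemo {
    /** Asks a StringReader for zero characters, as fill() would with a full buffer. */
    static int readIntoFullBuffer() throws IOException {
        char[] cb = new char[5];
        Reader in = new StringReader("abcdefghijklmnob");
        int dst = in.read(cb, 0, cb.length);       // buffer is now full: dst == 5
        return in.read(cb, dst, cb.length - dst);  // length 0: returns 0, not -1
    }

    /** The same scenario on the unmodified JDK class reads everything and stops. */
    static int countCharsWithRealClass() throws IOException {
        int count = 0;
        try (BufferedReader br =
                 new BufferedReader(new StringReader("abcdefghijklmnob"), 5)) {
            br.mark(5);
            while (br.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readIntoFullBuffer());      // prints 0
        System.out.println(countCharsWithRealClass()); // prints 16
    }
}
```

Because a zero-length read returns 0 rather than -1, a do-while that retries on n == 0 would spin forever if dst were ever allowed to equal cb.length.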

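Also, as for

    do {
        // dst == 0: all of cb[] is refreshed; dst > 0: dst is the number of
        // marked characters preserved at the front
        n = in.read(cb, dst, cb.length - dst);
    } while (n == 0);

I do not fully understand this loop yet. At least when BufferedReader decorates a StringReader I never saw it take effect, since StringReader only returns 0 for a zero-length request. My guess is that it guards against underlying readers that return 0 without being at the end of the stream, in which case fill() simply tries again.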
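One way to see the loop do something is to feed BufferedReader a reader that returns 0 once before delivering data. ZeroFirstReader below is a hypothetical class written only for this experiment; whether real readers behave this way is an assumption, not something the source confirms.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class RetryOnZeroDemo {
    /** Hypothetical reader whose first bulk read returns 0 characters. */
    static class ZeroFirstReader extends Reader {
        private final Reader delegate;
        int calls = 0;

        ZeroFirstReader(String text) {
            delegate = new StringReader(text);
        }

        @Override
        public int read(char[] cbuf, int off, int len) throws IOException {
            calls++;
            if (calls == 1) {
                return 0;                // nothing available yet, but not end of stream
            }
            return delegate.read(cbuf, off, len);
        }

        @Override
        public void close() throws IOException {
            delegate.close();
        }
    }

    /** Returns the first character read and the number of underlying read calls. */
    static int[] firstCharAndCalls() throws IOException {
        ZeroFirstReader source = new ZeroFirstReader("hi");
        try (BufferedReader br = new BufferedReader(source, 4)) {
            int c = br.read();
            return new int[] { c, source.calls };
        }
    }

    public static void main(String[] args) throws IOException {
        int[] result = firstCharAndCalls();
        System.out.println((char) result[0] + " after " + result[1] + " calls");
        // prints "h after 2 calls"
    }
}
```

The single read() on the BufferedReader costs two calls on the source: the first returns 0, the do-while retries, and the second delivers the data.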

The full BufferedReader source with all of my added comments is below (these are my notes):

    package think.in.java.io.file;

    import java.io.IOException;
    import java.io.Reader;

    /*
     * @(#)BufferedReader.java 1.33 04/01/12
     *
     * Copyright 2004 Sun Microsystems, Inc. All rights reserved.
     * SUN PROPRIETARY/CONFIDENTIAL. Use is subject to license terms.
     */

    /**
     * Read text from a character-input stream, buffering characters so as to
     * provide for the efficient reading of characters, arrays, and lines.
     *
     * The buffer size may be specified, or the default size may be used. The
     * default is large enough for most purposes.
     *
     * In general, each read request made of a Reader causes a corresponding read
     * request to be made of the underlying character or byte stream. It is
     * therefore advisable to wrap a BufferedReader around any Reader whose read()
     * operations may be costly, such as FileReaders and InputStreamReaders. For
     * example,
     *
     * <pre>
     * BufferedReader in = new BufferedReader(new FileReader("foo.in"));
     * </pre>
     *
     * will buffer the input from the specified file. Without buffering, each
     * invocation of read() or readLine() could cause bytes to be read from the
     * file, converted into characters, and then returned, which can be very
     * inefficient.
     *
     * Programs that use DataInputStreams for textual input can be localized by
     * replacing each DataInputStream with an appropriate BufferedReader.
     *
     * @see FileReader
     * @see InputStreamReader
     *
     * @version 1.33, 04/01/12
     * @author Mark Reinhold
     * @since JDK1.1
     */
    public class BufferedReader extends Reader {

        private Reader in;

        private char cb[];
        private int nChars, nextChar;

        private static final int INVALIDATED = -2;
        private static final int UNMARKED = -1;
        private int markedChar = UNMARKED;
        private int readAheadLimit = 0; /* Valid only when markedChar > 0 */

        /** If the next character is a line feed, skip it */
        private boolean skipLF = false;

        /** The skipLF flag when the mark was set */
        private boolean markedSkipLF = false;

        private static int defaultCharBufferSize = 8192;
        private static int defaultExpectedLineLength = 80;

        /**
         * Create a buffering character-input stream that uses an input buffer of
         * the specified size.
         *
         * @param in
         *            A Reader
         * @param sz
         *            Input-buffer size
         *
         * @exception IllegalArgumentException
         *                If sz is <= 0
         */
        public BufferedReader(Reader in, int sz) {
            super(in);
            if (sz <= 0)
                throw new IllegalArgumentException("Buffer size <= 0");
            this.in = in;
            cb = new char[sz];
            nextChar = nChars = 0;
        }

        /**
         * Create a buffering character-input stream that uses a default-sized input
         * buffer.
         *
         * @param in
         *            A Reader
         */
        public BufferedReader(Reader in) {
            this(in, defaultCharBufferSize);
        }

        /** Check to make sure that the stream has not been closed */
        private void ensureOpen() throws IOException {
            if (in == null)
                throw new IOException("Stream closed");
        }

        /**
         * Fill the input buffer, taking the mark into account if it is valid.
         * Reads from the underlying stream (in) according to the length of cb[].
         * If in holds more than cb.length characters, it is read in chunks: with
         * no mark set (markedChar <= UNMARKED) each call reads up to cb.length
         * characters until the stream is exhausted; once a mark is set, each call
         * reads up to cb.length minus the number of preserved marked characters
         * (delta = nextChar - markedChar).
         */
        private void fill() throws IOException {
            /* dst: the index in cb at which data read from in starts being stored */
            int dst;
            if (markedChar <= UNMARKED) {
                /* No mark */
                /* No mark: store from index 0 */
                dst = 0;
            } else {
                /* Marked */
                /* With a mark set, fill() must preserve the buffer contents from
                 * the marked character onward so that reset() can roll back.
                 * Otherwise the next chunk would overwrite the old data in cb[],
                 * and although reset() could restore nextChar to the marked index,
                 * cb[nextChar] would no longer hold the original character. */
                /* delta: the actual rollback distance, i.e. the number of
                 * characters that must be preserved */
                int delta = nextChar - markedChar;
                /* readAheadLimit: the configured maximum rollback distance */
                if (delta >= readAheadLimit) {
                    /* Gone past read-ahead limit: Invalidate mark */
                    /* The actual rollback distance has reached the configured
                     * limit: the mark becomes invalid and cb[] is refilled from
                     * index 0 with the next chunk */
                    markedChar = INVALIDATED;
                    readAheadLimit = 0;
                    dst = 0;
                } else {
                    if (readAheadLimit <= cb.length) {
                        /* Shuffle in the current buffer */
                        /* The limit fits in cb[]: move the delta preserved
                         * characters to the front of cb[]; the rest of cb[] is
                         * filled with the next chunk from in */
                        System.arraycopy(cb, markedChar, cb, 0, delta);
                        markedChar = 0;
                        dst = delta;
                    } else {
                        /* Reallocate buffer to accommodate read-ahead limit */
                        /* The limit is larger than cb.length: allocate a new
                         * char[] of length readAheadLimit, copy the preserved
                         * characters to its front, and fill the rest with the next
                         * chunk from in */
                        char ncb[] = new char[readAheadLimit];
                        System.arraycopy(cb, markedChar, ncb, 0, delta);
                        cb = ncb;
                        markedChar = 0;
                        dst = delta;
                    }
                    /* The marked characters are preserved, but until reset() is
                     * called reading continues at the newly fetched data, i.e. at
                     * index delta. Setting nextChar = nChars = delta means that,
                     * before any new data arrives, the read position is already at
                     * the end of the valid data. */
                    nextChar = nChars = delta;
                }
            }

            int n;
            do {
                // dst == 0: all of cb[] is refreshed; dst > 0: dst is the number
                // of marked characters preserved at the front
                n = in.read(cb, dst, cb.length - dst);
            } while (n == 0);
            if (n > 0) {
                nChars = dst + n;
                nextChar = dst;
            }
        }

        /**
         * Read a single character.
         *
         * @return The character read, as an integer in the range 0 to 65535 (
         *         <tt>0x00-0xffff</tt>), or -1 if the end of the stream has been
         *         reached
         * @exception IOException
         *                If an I/O error occurs
         */
        public int read() throws IOException {
            synchronized (lock) {
                ensureOpen();
                for (;;) {
                    if (nextChar >= nChars) {
                        fill();
                        if (nextChar >= nChars)
                            return -1;
                    }
                    if (skipLF) {
                        skipLF = false;
                        if (cb[nextChar] == '\n') {
                            nextChar++;
                            continue;
                        }
                    }
                    return cb[nextChar++];
                }
            }
        }

        /**
         * Read characters into a portion of an array, reading from the underlying
         * stream if necessary.
         */
        private int read1(char[] cbuf, int off, int len) throws IOException {
            if (nextChar >= nChars) {
                /*
                 * If the requested length is at least as large as the buffer, and
                 * if there is no mark/reset activity, and if line feeds are not
                 * being skipped, do not bother to copy the characters into the
                 * local buffer. In this way buffered streams will cascade
                 * harmlessly.
                 */
                if (len >= cb.length && markedChar <= UNMARKED && !skipLF) {
                    return in.read(cbuf, off, len);
                }
                fill();
            }
            if (nextChar >= nChars)
                return -1;
            if (skipLF) {
                skipLF = false;
                if (cb[nextChar] == '\n') {
                    nextChar++;
                    if (nextChar >= nChars)
                        fill();
                    if (nextChar >= nChars)
                        return -1;
                }
            }
            int n = Math.min(len, nChars - nextChar);
            System.arraycopy(cb, nextChar, cbuf, off, n);
            nextChar += n;
            return n;
        }

        /**
         * Read characters into a portion of an array.
         *
         * This method implements the general contract of the corresponding
         * <code>{@link Reader#read(char[], int, int) read}</code> method of the
         * <code>{@link Reader}</code> class. As an additional convenience, it
         * attempts to read as many characters as possible by repeatedly invoking
         * the <code>read</code> method of the underlying stream. This iterated
         * <code>read</code> continues until one of the following conditions becomes
         * true:
         * <ul>
         *
         * <li>The specified number of characters have been read,
         *
         * <li>The <code>read</code> method of the underlying stream returns
         * <code>-1</code>, indicating end-of-file, or
         *
         * <li>The <code>ready</code> method of the underlying stream returns
         * <code>false</code>, indicating that further input requests would block.
         *
         * </ul>
         * If the first <code>read</code> on the underlying stream returns
         * <code>-1</code> to indicate end-of-file then this method returns
         * <code>-1</code>. Otherwise this method returns the number of characters
         * actually read.
         *
         * Subclasses of this class are encouraged, but not required, to attempt to
         * read as many characters as possible in the same fashion.
         *
         * Ordinarily this method takes characters from this stream's character
         * buffer, filling it from the underlying stream as necessary. If, however,
         * the buffer is empty, the mark is not valid, and the requested length is
         * at least as large as the buffer, then this method will read characters
         * directly from the underlying stream into the given array. Thus redundant
         * <code>BufferedReader</code>s will not copy data unnecessarily.
         *
         * @param cbuf
         *            Destination buffer
         * @param off
         *            Offset at which to start storing characters
         * @param len
         *            Maximum number of characters to read
         *
         * @return The number of characters read, or -1 if the end of the stream has
         *         been reached
         *
         * @exception IOException
         *                If an I/O error occurs
         */
        public int read(char cbuf[], int off, int len) throws IOException {
            synchronized (lock) {
                ensureOpen();
                if ((off < 0) || (off > cbuf.length) || (len < 0)
                        || ((off + len) > cbuf.length) || ((off + len) < 0)) {
                    throw new IndexOutOfBoundsException();
                } else if (len == 0) {
                    return 0;
                }

                int n = read1(cbuf, off, len);
                if (n <= 0)
                    return n;
                /* If fewer characters were read than requested, the end of the
                 * current buffer was reached, which is not necessarily the end of
                 * the stream: the stream may simply be longer than the buffer.
                 * read1() calls fill() again to load the next chunk, and reading
                 * continues. */
                while ((n < len) && in.ready()) {
                    int n1 = read1(cbuf, off + n, len - n);
                    if (n1 <= 0)
                        break;
                    n += n1;
                }
                return n;
            }
        }

        /**
         * Read a line of text. A line is considered to be terminated by any one of
         * a line feed ('\n'), a carriage return ('\r'), or a carriage return
         * followed immediately by a linefeed.
         *
         * @param ignoreLF
         *            If true, the next '\n' will be skipped
         *
         * @return A String containing the contents of the line, not including any
         *         line-termination characters, or null if the end of the stream has
         *         been reached
         *
         * @see java.io.LineNumberReader#readLine()
         *
         * @exception IOException
         *                If an I/O error occurs
         */
        String readLine(boolean ignoreLF) throws IOException {
            StringBuffer s = null;
            int startChar;
            boolean omitLF = ignoreLF || skipLF;

            synchronized (lock) {
                ensureOpen();

                bufferLoop: for (;;) {

                    if (nextChar >= nChars)
                        fill();
                    if (nextChar >= nChars) { /* EOF (end of file) */
                        if (s != null && s.length() > 0)
                            return s.toString();
                        else
                            return null;
                    }
                    /* eol: set once an end-of-line character is found */
                    boolean eol = false;
                    char c = 0;
                    int i;

                    /* Skip a leftover '\n', if necessary */
                    /* For a \r\n pair (carriage return + line feed), drop the
                     * leftover \n */
                    if (omitLF && (cb[nextChar] == '\n'))
                        nextChar++;
                    skipLF = false;
                    omitLF = false;

                    charLoop: for (i = nextChar; i < nChars; i++) {
                        c = cb[i];
                        if ((c == '\n') || (c == '\r')) {
                            eol = true;
                            break charLoop;
                        }
                    }

                    startChar = nextChar;
                    nextChar = i;

                    if (eol) {
                        String str;
                        if (s == null) {
                            str = new String(cb, startChar, i - startChar);
                        } else {
                            s.append(cb, startChar, i - startChar);
                            str = s.toString();
                        }
                        nextChar++;
                        if (c == '\r') {
                            skipLF = true;
                        }
                        return str;
                    }

                    if (s == null)
                        s = new StringBuffer(defaultExpectedLineLength);
                    s.append(cb, startChar, i - startChar);
                }
            }
        }

        /**
         * Read a line of text. A line is considered to be terminated by any one of
         * a line feed ('\n'), a carriage return ('\r'), or a carriage return
         * followed immediately by a linefeed.
         *
         * @return A String containing the contents of the line, not including any
         *         line-termination characters, or null if the end of the stream has
         *         been reached
         *
         * @exception IOException
         *                If an I/O error occurs
         */
        public String readLine() throws IOException {
            // The JDK passes false here. Passing true would make every call skip
            // a leading '\n', so empty lines would be swallowed.
            return readLine(false);
        }

        /**
         * Skip characters.
         *
         * @param n
         *            The number of characters to skip
         *
         * @return The number of characters actually skipped
         *
         * @exception IllegalArgumentException
         *                If <code>n</code> is negative.
         * @exception IOException
         *                If an I/O error occurs
         */
        public long skip(long n) throws IOException {
            if (n < 0L) {
                throw new IllegalArgumentException("skip value is negative");
            }
            synchronized (lock) {
                ensureOpen();
                long r = n;
                while (r > 0) {
                    if (nextChar >= nChars)
                        fill();
                    if (nextChar >= nChars) /* EOF */
                        break;
                    if (skipLF) {
                        skipLF = false;
                        if (cb[nextChar] == '\n') {
                            nextChar++;
                        }
                    }
                    long d = nChars - nextChar;
                    if (r <= d) {
                        nextChar += r;
                        r = 0;
                        break;
                    } else {
                        r -= d;
                        nextChar = nChars;
                    }
                }
                return n - r;
            }
        }

        /**
         * Tell whether this stream is ready to be read. A buffered character stream
         * is ready if the buffer is not empty, or if the underlying character
         * stream is ready.
         *
         * @exception IOException
         *                If an I/O error occurs
         */
        public boolean ready() throws IOException {
            synchronized (lock) {
                ensureOpen();

                /*
                 * If newline needs to be skipped and the next char to be read is a
                 * newline character, then just skip it right away.
                 */
                if (skipLF) {
                    /*
                     * Note that in.ready() will return true if and only if the next
                     * read on the stream will not block.
                     */
                    if (nextChar >= nChars && in.ready()) {
                        fill();
                    }
                    if (nextChar < nChars) {
                        if (cb[nextChar] == '\n')
                            nextChar++;
                        skipLF = false;
                    }
                }
                return (nextChar < nChars) || in.ready();
            }
        }

        /**
         * Tell whether this stream supports the mark() operation, which it does.
         */
        public boolean markSupported() {
            return true;
        }

        /**
         * Mark the present position in the stream. Subsequent calls to reset() will
         * attempt to reposition the stream to this point.
         *
         * @param readAheadLimit
         *            Limit on the number of characters that may be read while still
         *            preserving the mark. After reading this many characters,
         *            attempting to reset the stream may fail. A limit value larger
         *            than the size of the input buffer will cause a new buffer to
         *            be allocated whose size is no smaller than limit. Therefore
         *            large values should be used with care.
         *
         * @exception IllegalArgumentException
         *                If readAheadLimit is < 0
         * @exception IOException
         *                If an I/O error occurs
         */
        public void mark(int readAheadLimit) throws IOException {
            /* readAheadLimit: limit on the number of characters that may be read
             * while still preserving the mark */
            if (readAheadLimit < 0) {
                throw new IllegalArgumentException("Read-ahead limit < 0");
            }
            synchronized (lock) {
                ensureOpen();
                this.readAheadLimit = readAheadLimit;
                /* Mark the current position */
                markedChar = nextChar;
                /* Remember whether a line feed is pending to be skipped */
                markedSkipLF = skipLF;
            }
        }

        /**
         * Reset the stream to the most recent mark.
         *
         * @exception IOException
         *                If the stream has never been marked, or if the mark has
         *                been invalidated
         */
        public void reset() throws IOException {
            synchronized (lock) {
                ensureOpen();
                if (markedChar < 0)
                    throw new IOException(
                            (markedChar == INVALIDATED) ? "Mark invalid"
                                    : "Stream not marked");
                nextChar = markedChar;
                skipLF = markedSkipLF;
            }
        }

        /**
         * Close the stream.
         *
         * @exception IOException
         *                If an I/O error occurs
         */
        public void close() throws IOException {
            synchronized (lock) {
                if (in == null)
                    return;
                in.close();
                in = null;
                cb = null;
            }
        }
    }