[Source Code, JDK 1.8] A Walkthrough of the String Source Code

Original article
Published 2024-01-11 22:18:48
License: CC 4.0 BY-SA
Tags: #java
/**
 * The {@code String} class represents character strings.
 * Note: this class models a string of characters.
 * All string literals in Java programs, such as {@code "abc"}, are implemented as instances of this class.
 * Note: every String literal in a Java program, such as "abc", is an instance of this class.
 * 
 * Strings are constant; their values cannot be changed after they
 * are created. String buffers support mutable strings.
 * Note: a String is constant; once created, its value cannot change. When a mutable string is needed, use a string buffer (StringBuffer or StringBuilder).
 * 
 * Because String objects are immutable they can be shared. For example:
 *     String str = "abc";
 * Note: because Strings are immutable, they can be shared. In the line above, str and the literal "abc" refer to the same String object.
 *
 * is equivalent to:
 *     char data[] = {'a', 'b', 'c'};
 *     String str = new String(data);
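 * Note: "equivalent" here refers only to the string's value. The str object created with new is a distinct object on the heap; it is not taken from the string constant pool.
 * Because this is easy to get wrong, most coding standards require comparing String values with equals rather than ==.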
 * 
 * Here are some more examples of how strings can be used:
 * <blockquote><pre>
 *     System.out.println("abc");
 *     String cde = "cde";
 *     System.out.println("abc" + cde);
 *     String c = "abc".substring(2,3);
 *     String d = cde.substring(1, 2);
 * </pre></blockquote>
 * <p>
 * 
 * The class {@code String} includes methods 
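 * Note: the String class provides the following basic operations.
 * for examining individual characters of the sequence,
 * 1. Examining individual characters: for example, charAt(int index) returns the char at the given index.
 *
 * for comparing strings,
 * 2. Comparing strings: for example, equals(Object anObject) checks whether two strings have the same content.
 *
 * for searching strings,
 * 3. Searching strings: for example, indexOf(int ch) and indexOf(String str) find the position of a character or a substring.
 *
 * for extracting substrings,
 * 4. Extracting substrings: for example, substring(int beginIndex) and substring(int beginIndex, int endIndex) return part of the original string.
 *
 * and for creating a copy of a string with all characters translated to uppercase or to lowercase.
 * 5. Converting case: for example, toUpperCase() and toLowerCase() return a copy in which every character is converted to uppercase or lowercase.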
 * 
 * Case mapping is based on the Unicode Standard version
 * specified by the {@link java.lang.Character Character} class.
 * Note: case conversion follows the Unicode Standard, in the version specified by the Character class.
 * 
 * <p>
 * The Java language provides special support for the string concatenation operator (&nbsp;+&nbsp;), and for conversion of other objects to strings. 
 * Note: the Java language supports string concatenation with the + operator. The "&nbsp;" entities above simply render as spaces in HTML.
 * 
 * String concatenation is implemented through the {@code StringBuilder}(or {@code StringBuffer}) class and its {@code append} method.
 * Note: + concatenation is implemented through the append() method of StringBuilder (or StringBuffer). When building a string in a loop, prefer using one of these classes explicitly.
 * 
 * String conversions are implemented through the method {@code toString}, defined by {@code Object} and inherited by all classes in Java. 
 * Note: conversion to a String goes through toString(), which is defined in Object and inherited by every Java class.
 * 
 * For additional information on string concatenation and conversion, see Gosling, Joy, and Steele, The Java Language Specification.
 * Note: see The Java Language Specification for more details.
 * 
 * Unless otherwise noted, passing a <tt>null</tt> argument to a constructor or method in this class will cause a {@link NullPointerException} to be
 * thrown.
 * Note: unless stated otherwise, passing null to a constructor or method of this class throws a NullPointerException.
 *
 * A {@code String} represents a string in the UTF-16 format in which 
 * <em>supplementary characters</em> are represented by <em>surrogate pairs</em> 
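 * Note: a Java String is encoded as UTF-16. Encoding is covered in detail in a separate article; the point here is that UTF-16 is a historical choice, even though UTF-8 is now more widely used for storage and interchange.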
 * 
 * (see the section <a href="Character.html#unicode">Unicode Character Representations</a> in the {@code Character} class for more information).
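 * Note: see the section on Unicode character representations in the Character class for more information.
 * Note: what does charAt(int index) return when a String starts with a supplementary character? Index 0 returns its high surrogate, and index 1 returns its low surrogate.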
 * 
 * Index values refer to {@code char} code units, so a supplementary character uses two positions in a {@code String}.
 * Note: an index refers to a char code unit. A supplementary character is represented by two code units, so it occupies two index positions.
 * 
 * 
 * The {@code String} class provides methods for dealing with Unicode code points (i.e., characters), 
 * in addition to those for dealing with Unicode code units (i.e., {@code char} values).
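 * Note: besides the methods that work on Unicode code units (char values), String also provides methods that work on Unicode code points (characters).
 * For example, when a String starts with a supplementary character, charAt(0) returns only its high surrogate, while codePointAt(0) returns the complete code point.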
 * 
 * @author  Lee Boynton
 * @author  Arthur van Hoff
 * @author  Martin Buchholz
 * @author  Ulf Zibis
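 * @see     java.lang.Object#toString() Note: any object can be converted to a String through toString(); the result depends on how the class overrides it.
 * @see     java.lang.StringBuffer Note: a mutable string, typically used when building a string in a loop; thread-safe.
 * @see     java.lang.StringBuilder Note: a mutable string, typically used when building a string in a loop; not thread-safe.
 * @see     java.nio.charset.Charset Note: the core Java class that represents a character set.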
 * @since   JDK1.0
 */

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    /** 
     * The value is used for character storage. 
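     * Note: this array holds the string's characters. In the array declaration style
     * usually recommended for Java, it would be written as:
     * char[] value;
     *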
     */
    private final char value[];

    /** 
     * Cache the hash code for the string 
     * Note: caches the string's hash code. hashCode() returns the cached value directly when hash is non-zero.
     */
    private int hash; // Default to 0
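The caching idea behind the hash field can be sketched with a small immutable class. This is an illustrative sketch rather than the JDK code: the class name CachedHashText is hypothetical, and the loop assumes the polynomial formula documented for String.hashCode(), s[0]*31^(n-1) + ... + s[n-1].

```java
import java.util.Arrays;

// Hypothetical immutable wrapper that caches its hash code lazily,
// in the same spirit as String's `hash` field.
final class CachedHashText {
    private final char[] value;
    private int hash; // 0 means "not computed yet" (or the hash really is 0)

    CachedHashText(String s) {
        this.value = s.toCharArray();
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            // Computed on first use; later calls return the cached value.
            for (char c : value) {
                h = 31 * h + c;
            }
            hash = h;
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CachedHashText
                && Arrays.equals(value, ((CachedHashText) o).value);
    }
}
```

Because the object is immutable, recomputing the hash always gives the same result, which is why a plain non-volatile field is enough for this pattern.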

    /** 
     * use serialVersionUID from JDK 1.0.2 for interoperability
     */
    private static final long serialVersionUID = -6849794470754667710L;

    /**
     * Class String is special cased within the Serialization Stream Protocol.
     * Note: the String class is special-cased in the serialization stream protocol.
     *
     * A String instance is written into an ObjectOutputStream according to
     * <a href="{@docRoot}/../platform/serialization/spec/output.html">
     * Object Serialization Specification, Section 6.2, "Stream Elements"</a>
     * 
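     * Note: this declares an empty ObjectStreamField array, which means the class declares no serializable fields for default serialization. A String is instead written to the stream in the dedicated string format described in the specification referenced above.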
     */
    private static final ObjectStreamField[] serialPersistentFields =
        new ObjectStreamField[0];

    /**
     * Initializes a newly created {@code String} object so that it represents
     * an empty character sequence.  Note that use of this constructor is
     * unnecessary since Strings are immutable.
     * 
     * Note: creates a new String object that represents "". Calling this constructor is unnecessary because Strings are immutable.
     */
    public String() {
        this.value = "".value;
    }

    /**
     * Initializes a newly created {@code String} object so that it represents
     * the same sequence of characters as the argument; in other words, the
     * newly created string is a copy of the argument string. Unless an
     * explicit copy of {@code original} is needed, use of this constructor is
     * unnecessary since Strings are immutable.
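     * Note: creates a new String object with the same content as the argument. This constructor is also unnecessary in most cases because Strings are immutable.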
     *
     * @param  original
     *         A {@code String}
     */
    public String(String original) {
        this.value = original.value;
        this.hash = original.hash;
    }
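The difference between a literal and a String created with new can be checked directly. The following is a minimal sketch; the class name StringIdentityDemo and the helper sameObject are hypothetical names used only for illustration.

```java
class StringIdentityDemo {
    // True only when both references point to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "abc";
        String copy = new String(literal); // a new object with equal content

        System.out.println(sameObject(literal, copy));          // false
        System.out.println(literal.equals(copy));               // true
        System.out.println(sameObject(literal, copy.intern())); // true
    }
}
```

This is why value comparisons should go through equals: == only answers whether two references are the same object.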

    /**
     * Allocates a new {@code String} so that it represents the sequence of
     * characters currently contained in the character array argument. The
     * contents of the character array are copied; subsequent modification of
     * the character array does not affect the newly created string.
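     * Note: creates a new String that represents the character sequence held in the char array.
     * The new String is independent of the array, so modifying the array afterwards does not affect it.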
     * 
     * @param  value
     *         The initial value of the string
     */
    public String(char value[]) {
        // The array is copied here, so later changes to the caller's char array do not affect the newly created String.
        this.value = Arrays.copyOf(value, value.length);
    }
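The effect of the defensive copy is easy to observe from the outside. A small sketch, with DefensiveCopyDemo and buildThenMutate as hypothetical names:

```java
class DefensiveCopyDemo {
    // Builds a String from the array, then mutates the array.
    static String buildThenMutate(char[] data) {
        String s = new String(data); // copies the array contents
        data[0] = 'X';               // later change to the caller's array
        return s;                    // still holds the original characters
    }

    public static void main(String[] args) {
        char[] data = {'a', 'b', 'c'};
        System.out.println(buildThenMutate(data)); // abc
        System.out.println(data[0]);               // X
    }
}
```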

    /**
     * Allocates a new {@code String} that contains characters from a subarray
     * of the character array argument. The {@code offset} argument is the
     * index of the first character of the subarray and the {@code count}
     * argument specifies the length of the subarray. The contents of the
     * subarray are copied; subsequent modification of the character array does
     * not affect the newly created string.
     * Note: creates a new String that represents a subrange of the char array.
     * The new String is independent of the array, so modifying the array afterwards does not affect it.
     * 
     * 
     * @param  value
     *         Array that is the source of characters
     *
     * @param  offset
     *         The initial offset
     *
     * @param  count
     *         The length
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code count} arguments index
     *          characters outside the bounds of the {@code value} array
     */
    public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= value.length) {
                this.value = "".value;
                return;
            }
        }
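        // Note: offset or count might be near -1>>>1.
        /**
         * -1>>>1 equals Integer.MAX_VALUE. The comment above explains why the check below is written the way it is.
         * Logically, the check can be written in two ways:
         * 1. offset + count > value.length
         * 2. offset > value.length - count
         * The first form is more intuitive (the end of the range must not exceed the array length), but the addition can overflow int.
         * The second form avoids the overflow: count has already been checked to be non-negative, so value.length - count cannot overflow.
         * The same idea applies when computing the midpoint of two ints:
         * Option 1: mid = (l + r) / 2        (the sum can overflow)
         * Option 2: mid = l + (r - l) / 2    (safe when 0 <= l <= r)
         */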
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }
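The overflow issue discussed in the bounds check can be reproduced in isolation. The sketch below is illustrative only: BoundsCheckDemo and its method names are hypothetical, and the safe variants assume non-negative arguments (and lo <= hi for the midpoint), as in the constructor after its earlier checks.

```java
class BoundsCheckDemo {
    // Intuitive form: offset + count can wrap around to a negative number.
    static boolean naiveOutOfBounds(int offset, int count, int length) {
        return offset + count > length;
    }

    // Form used by the constructor: no addition, so no overflow
    // as long as count and length are non-negative.
    static boolean safeOutOfBounds(int offset, int count, int length) {
        return offset > length - count;
    }

    // Overflow-safe midpoint of two non-negative ints with lo <= hi.
    static int midpoint(int lo, int hi) {
        return lo + (hi - lo) / 2;
    }

    public static void main(String[] args) {
        int offset = Integer.MAX_VALUE; // same value as -1 >>> 1
        System.out.println(naiveOutOfBounds(offset, 2, 10)); // false (wrong: the sum overflowed)
        System.out.println(safeOutOfBounds(offset, 2, 10));  // true
        System.out.println(midpoint(2_000_000_000, 2_100_000_000)); // 2050000000
    }
}
```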

    /**
     * Allocates a new {@code String} that contains characters from a subarray
     * of the <a href="Character.html#unicode">Unicode code point</a> array
     * argument.  The {@code offset} argument is the index of the first code
     * point of the subarray and the {@code count} argument specifies the
     * length of the subarray.  The contents of the subarray are converted to
     * {@code char}s; subsequent modification of the {@code int} array does not
     * affect the newly created string.
     * Note: builds a String from an array of Unicode code points. This constructor is not used very often.
     *
     * @param  codePoints
     *         Array that is the source of Unicode code points
     *
     * @param  offset
     *         The initial offset
     *
     * @param  count
     *         The length
     *
     * @throws  IllegalArgumentException
     *          If any invalid Unicode code point is found in {@code
     *          codePoints}
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code count} arguments index
     *          characters outside the bounds of the {@code codePoints} array
     *
     * @since  1.5
     */
    public String(int[] codePoints, int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= codePoints.length) {
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > codePoints.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }

        final int end = offset + count;

        // Pass 1: Compute precise size of char[]
        // 1. Count the chars needed. A BMP code point maps to one char, while a supplementary
        //    code point needs a high surrogate plus a low surrogate, so n is incremented for it.
        int n = count;
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                continue;
            else if (Character.isValidCodePoint(c))
                // A valid supplementary (non-BMP) code point
                n++;
            // An invalid code point was found, so fail immediately
            else throw new IllegalArgumentException(Integer.toString(c));
        }

        // Pass 2: Allocate and fill in char[]
        // 2. Allocate the array and fill it in
        final char[] v = new char[n];

        for (int i = offset, j = 0; i < end; i++, j++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                v[j] = (char)c;
            else
                // Write the surrogate pair for a supplementary code point
                Character.toSurrogates(c, v, j++);
        }

        this.value = v;
    }
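The two-pass logic above can be observed through the public constructor. In this sketch, CodePointCtorDemo and fromCodePoints are hypothetical names; U+1F600 is used as an example of a supplementary code point.

```java
class CodePointCtorDemo {
    // Builds a String from the whole code point array.
    static String fromCodePoints(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args) {
        // 'A' is in the BMP (one char); U+1F600 is supplementary (two chars).
        String s = fromCodePoints(0x41, 0x1F600);
        System.out.println(s.length());                         // 3
        System.out.println(s.codePointCount(0, s.length()));    // 2
        System.out.println(Integer.toHexString(s.charAt(1)));   // d83d (high surrogate)
        System.out.println(Integer.toHexString(s.charAt(2)));   // de00 (low surrogate)
    }
}
```

Passing a value outside the Unicode range, such as 0x110000, makes the constructor throw an IllegalArgumentException during the first pass.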

    /**
     * Allocates a new {@code String} constructed from a subarray of an array
     * of 8-bit integer values.
     *
     * <p> The {@code offset} argument is the index of the first byte of the
     * subarray, and the {@code count} argument specifies the length of the
     * subarray.
     *
     * <p> Each {@code byte} in the subarray is converted to a {@code char} as
     * specified in the method above.
     *
     * @deprecated This method does not properly convert bytes into characters.
     * As of JDK&nbsp;1.1, the preferred way to do this is via the
     * {@code String} constructors that take a {@link
     * java.nio.charset.Charset}, charset name, or that use the platform's
     * default charset.
     *
     * @param  ascii
     *         The bytes to be converted to characters
     *
     * @param  hibyte
     *         The top 8 bits of each 16-bit Unicode code unit
     *
     * @param  offset
     *         The initial offset
     * @param  count
     *         The length
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} or {@code count} argument is invalid
     *
     * @see  #String(byte[], int)
     * @see  #String(byte[], int, int, java.lang.String)
     * @see  #String(byte[], int, int, java.nio.charset.Charset)
     * @see  #String(byte[], int, int)
     * @see  #String(byte[], java.lang.String)
     * @see  #String(byte[], java.nio.charset.Charset)
     * @see  #String(byte[])
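     * Note: this constructor is deprecated because it does not decode the bytes through a charset. It only combines hibyte with each byte, so it yields correct text only in simple cases, such as ASCII or ISO-8859-1 data with hibyte set to 0.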
     */
    @Deprecated
    public String(byte ascii[], int hibyte, int offset, int count) {
        checkBounds(ascii, offset, count);
        char value[] = new char[count];

        if (hibyte == 0) {
            for (int i = count; i-- > 0;) {
                value[i] = (char)(ascii[i + offset] & 0xff);
            }
        } else {
            hibyte <<= 8;
            for (int i = count; i-- > 0;) {
                value[i] = (char)(hibyte | (ascii[i + offset] & 0xff));
            }
        }
        this.value = value;
    }

    /**
     * Allocates a new {@code String} containing characters constructed from
     * an array of 8-bit integer values. Each character <i>c</i>in the
     * resulting string is constructed from the corresponding component
     * <i>b</i> in the byte array such that:
     *
     * <blockquote><pre>
     *     <b><i>c</i></b> == (char)(((hibyte &amp; 0xff) &lt;&lt; 8)
     *                         | (<b><i>b</i></b> &amp; 0xff))
     * </pre></blockquote>
     *
     * @deprecated  This method does not properly convert bytes into
     * characters.  As of JDK&nbsp;1.1, the preferred way to do this is via the
     * {@code String} constructors that take a {@link
     * java.nio.charset.Charset}, charset name, or that use the platform's
     * default charset.
     *
     * @param  ascii
     *         The bytes to be converted to characters
     *
     * @param  hibyte
     *         The top 8 bits of each 16-bit Unicode code unit
     *
     * @see  #String(byte[], int, int, java.lang.String)
     * @see  #String(byte[], int, int, java.nio.charset.Charset)
     * @see  #String(byte[], int, int)
     * @see  #String(byte[], java.lang.String)
     * @see  #String(byte[], java.nio.charset.Charset)
     * @see  #String(byte[])
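     * Note: this constructor is deprecated because it does not decode the bytes through a charset. It only combines hibyte with each byte, so it yields correct text only in simple cases, such as ASCII or ISO-8859-1 data with hibyte set to 0.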
     */
    @Deprecated
    public String(byte ascii[], int hibyte) {
        this(ascii, hibyte, 0, ascii.length);
    }
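To see what the deprecated constructor actually does, it can be compared with an explicit charset. This is a sketch under one assumption worth stating: with hibyte set to 0, each byte becomes the char with the same numeric value, which matches ISO-8859-1 decoding. The names HibyteDemo, viaHibyte and viaLatin1 are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class HibyteDemo {
    // Uses the deprecated constructor with hibyte = 0.
    @SuppressWarnings("deprecation")
    static String viaHibyte(byte[] bytes) {
        return new String(bytes, 0);
    }

    // Preferred replacement: name the charset explicitly.
    static String viaLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] bytes = {0x48, (byte) 0xE9}; // 'H' and U+00E9
        System.out.println(viaHibyte(bytes).equals(viaLatin1(bytes))); // true
    }
}
```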

    /* Common private utility method used to bounds check the byte array
     * and requested offset & length values used by the String(byte[],..)
     * constructors.
     */
    private static void checkBounds(byte[] bytes, int offset, int length) {
        if (length < 0)
            throw new StringIndexOutOfBoundsException(length);
        if (offset < 0)
            throw new StringIndexOutOfBoundsException(offset);
        if (offset > bytes.length - length)
            throw new StringIndexOutOfBoundsException(offset + length);
    }

    /**
     * Constructs a new {@code String} by decoding the specified subarray of
     * bytes using the specified charset.  The length of the new {@code String}
     * is a function of the charset, and hence may not be equal to the length
     * of the subarray.
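     * Note: decodes a byte subarray into a String using the named charset.
     * The length of the new String can differ from the length of the byte subarray, depending on the charset. In UTF-8, for example, one character may be encoded as several bytes.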
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the given charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
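     * Note: if the given bytes are not valid in the charset, the behavior of this constructor is unspecified. It may not throw an exception, and the resulting String is not predictable.
     * When more control over decoding is needed, use java.nio.charset.CharsetDecoder, which offers finer-grained handling of decoding errors.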
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  offset
     *         The index of the first byte to decode
     *
     * @param  length
     *         The number of bytes to decode
     *
     * @param  charsetName
     *         The name of a supported {@linkplain java.nio.charset.Charset
     *         charset}
     *
     * @throws  UnsupportedEncodingException
     *          If the named charset is not supported
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code length} arguments index
     *          characters outside the bounds of the {@code bytes} array
     *
     * @since  JDK1.1
     */
    public String(byte bytes[], int offset, int length, String charsetName)
            throws UnsupportedEncodingException {
        // 1. Reject a null charset name with a NullPointerException
        if (charsetName == null)
            throw new NullPointerException("charsetName");

        // 2. Check that offset and length fit within the byte array; throws if they do not
        checkBounds(bytes, offset, length);

        // 3. StringCoding.decode decodes the bytes with the named charset, and the result is stored in the value field
        this.value = StringCoding.decode(charsetName, bytes, offset, length);
    }

    /**
     * Constructs a new {@code String} by decoding the specified subarray of
     * bytes using the specified {@linkplain java.nio.charset.Charset charset}.
     * The length of the new {@code String} is a function of the charset, and
     * hence may not be equal to the length of the subarray.
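     * Note: decodes a byte subarray into a String using the given Charset.
     * The length of the new String can differ from the length of the byte subarray, depending on the charset. In UTF-8, for example, one character may be encoded as several bytes.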
     *
     * <p> This method always replaces malformed-input and unmappable-character
     * sequences with this charset's default replacement string.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     * Note: malformed input and unmappable character sequences are replaced with the charset's default replacement string.
     * When finer control over decoding is needed, use java.nio.charset.CharsetDecoder.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  offset
     *         The index of the first byte to decode
     *
     * @param  length
     *         The number of bytes to decode
     *
     * @param  charset
     *         The {@linkplain java.nio.charset.Charset charset} to be used to
     *         decode the {@code bytes}
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code length} arguments index
     *          characters outside the bounds of the {@code bytes} array
     *
     * @since  1.6
     */
    public String(byte bytes[], int offset, int length, Charset charset) {
        // 1. Reject a null Charset with a NullPointerException whose message is "charset"
        if (charset == null)
            throw new NullPointerException("charset");

        // 2. Check that offset and length fit within the byte array; throws if they do not
        checkBounds(bytes, offset, length);

        // 3. StringCoding.decode decodes the bytes with the given Charset, and the result is stored in the value field
        this.value =  StringCoding.decode(charset, bytes, offset, length);
    }
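The Javadoc points to CharsetDecoder for cases that need more control. One possible use, sketched here, is strict validation: configuring the decoder to report errors instead of replacing them. The class name StrictDecodeDemo and the method isValidUtf8 are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictDecodeDemo {
    // Returns true only if the bytes are well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes)); // throws on bad input
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidUtf8(new byte[] {(byte) 0xC3, (byte) 0xA9})); // true
        System.out.println(isValidUtf8(new byte[] {(byte) 0xFF}));              // false
    }
}
```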

    /**
     * Constructs a new {@code String} by decoding the specified array of bytes
     * using the specified {@linkplain java.nio.charset.Charset charset}.  The
     * length of the new {@code String} is a function of the charset, and hence
     * may not be equal to the length of the byte array.
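     * Note: decodes the whole byte array into a String using the named charset.
     * The length of the new String can differ from the length of the byte array, depending on the charset. In UTF-8, for example, one character may be encoded as several bytes.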
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the given charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
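     * Note: if the given bytes are not valid in the charset, the behavior of this constructor is unspecified. It may not throw an exception, and the resulting String is not predictable.
     * When more control over decoding is needed, use java.nio.charset.CharsetDecoder, which offers finer-grained handling of decoding errors.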
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  charsetName
     *         The name of a supported {@linkplain java.nio.charset.Charset
     *         charset}
     *
     * @throws  UnsupportedEncodingException
     *          If the named charset is not supported
     *
     * @since  JDK1.1
     */
    public String(byte bytes[], String charsetName)
            throws UnsupportedEncodingException {
        this(bytes, 0, bytes.length, charsetName);
    }
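Because the charset is given by name here, callers have to handle the checked UnsupportedEncodingException. A minimal sketch of one way to do that follows; CharsetNameDemo and decodeOrUtf8 are hypothetical names, and falling back to UTF-8 is an assumption made for the example, not a recommendation from the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

class CharsetNameDemo {
    // Decodes with the named charset, falling back to UTF-8
    // when the name is not supported on this JVM.
    static String decodeOrUtf8(byte[] bytes, String charsetName) {
        try {
            return new String(bytes, charsetName);
        } catch (UnsupportedEncodingException e) {
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        byte[] hi = {0x68, 0x69};
        System.out.println(decodeOrUtf8(hi, "UTF-8"));             // hi
        System.out.println(decodeOrUtf8(hi, "x-no-such-charset")); // hi
    }
}
```

When the charset is known at compile time, the overloads that take a Charset (for example a StandardCharsets constant) avoid the checked exception altogether.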

    /**
     * Constructs a new {@code String} by decoding the specified array of
     * bytes using the specified {@linkplain java.nio.charset.Charset charset}.
     * The length of the new {@code String} is a function of the charset, and
     * hence may not be equal to the length of the byte array.
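     * Note: decodes the whole byte array into a String using the given Charset.
     * The length of the new String can differ from the length of the byte array, depending on the charset. In UTF-8, for example, one character may be encoded as several bytes.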
     *
     * <p> This method always replaces malformed-input and unmappable-character
     * sequences with this charset's default replacement string.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     * Note: malformed input and unmappable character sequences are replaced with the charset's default replacement string.
     * When finer control over decoding is needed, use java.nio.charset.CharsetDecoder.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  charset
     *         The {@linkplain java.nio.charset.Charset charset} to be used to
     *         decode the {@code bytes}
     *
     * @since  1.6
     */
    public String(byte bytes[], Charset charset) {
        this(bytes, 0, bytes.length, charset);
    }
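Two points from the Javadoc can be checked with UTF-8: the resulting length is not the byte count, and malformed input is replaced rather than rejected. In this sketch, ReplacementDemo and decodeUtf8 are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

class ReplacementDemo {
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 'h' is one byte in UTF-8, U+00E9 is two bytes: 3 bytes, 2 chars.
        byte[] wellFormed = {0x68, (byte) 0xC3, (byte) 0xA9};
        System.out.println(decodeUtf8(wellFormed).length()); // 2

        // 0xFF can never appear in UTF-8; it is replaced with U+FFFD.
        byte[] malformed = {0x61, (byte) 0xFF, 0x62};
        System.out.println(decodeUtf8(malformed).equals("a\uFFFDb")); // true
    }
}
```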

    /**
     * Constructs a new {@code String} by decoding the specified subarray of
     * bytes using the platform's default charset.  The length of the new
     * {@code String} is a function of the charset, and hence may not be equal
     * to the length of the subarray.
     * Note: creates a new String by decoding the given byte subarray.
     * Decoding uses the platform's default charset.
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the default charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  offset
     *         The index of the first byte to decode
     *
     * @param  length
     *         The number of bytes to decode
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and the {@code length} arguments index
     *          characters outside the bounds of the {@code bytes} array
     *
     * @since  JDK1.1
     */
    public String(byte bytes[], int offset, int length) {
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(bytes, offset, length);
    }

    /**
     * Constructs a new {@code String} by decoding the specified array of bytes
     * using the platform's default charset.  The length of the new {@code
     * String} is a function of the charset, and hence may not be equal to the
     * length of the byte array.
     * Note: creates a new String by decoding the given byte array.
     * Decoding uses the platform's default charset.
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the default charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @since  JDK1.1
     */
    public String(byte bytes[]) {
        this(bytes, 0, bytes.length);
    }

    /**
     * Allocates a new string that contains the sequence of characters
     * currently contained in the string buffer argument. The contents of the
     * string buffer are copied; subsequent modification of the string buffer
     * does not affect the newly created string.
     * Note: creates a new String from a StringBuffer.
     * The buffer's contents are copied into the new String, so later changes to the StringBuffer do not affect it.
     *
     * @param  buffer
     *         A {@code StringBuffer}
     */
    public String(StringBuffer buffer) {
        // Lock the buffer so that its contents cannot change while they are being copied
        synchronized(buffer) {
            // Arrays.copyOf copies the StringBuffer's contents into a new char array
            this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
        }
    }

    /**
     * Allocates a new string that contains the sequence of characters
     * currently contained in the string builder argument. The contents of the
     * string builder are copied; subsequent modification of the string builder
     * does not affect the newly created string.
     * Note: creates a new String from a StringBuilder. The builder's contents are copied into the new String, so later changes to the StringBuilder do not affect it.
     *
     * <p> This constructor is provided to ease migration to {@code
     * StringBuilder}. Obtaining a string from a string builder via the {@code
     * toString} method is likely to run faster and is generally preferred.
     * Note: to convert a StringBuilder to a String, calling toString() is the preferred way.
     *
     * @param   builder
     *          A {@code StringBuilder}
     *
     * @since  1.5
     */
    public String(StringBuilder builder) {
        // Arrays.copyOf copies the StringBuilder's contents into a new char array
        this.value = Arrays.copyOf(builder.getValue(), builder.length());
    }
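Both the loop advice and the copy semantics can be shown with StringBuilder. A short sketch, where BuilderDemo, join and snapshotThenAppend are hypothetical names:

```java
class BuilderDemo {
    // Concatenates in a loop with a single builder instead of repeated +.
    static String join(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString(); // preferred over new String(sb)
    }

    // The String is a copy: appending afterwards does not change it.
    static String snapshotThenAppend(StringBuilder sb) {
        String snapshot = new String(sb);
        sb.append("!");
        return snapshot;
    }

    public static void main(String[] args) {
        System.out.println(join(new String[] {"a", "b", "c"})); // abc
        StringBuilder sb = new StringBuilder("hi");
        System.out.println(snapshotThenAppend(sb)); // hi
        System.out.println(sb);                     // hi!
    }
}
```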

    /**
     * Package private constructor which shares value array for speed.
     * this constructor is always expected to be called with share==true.
     * a separate constructor is needed because we already have a public
     * String(char[]) constructor that makes a copy of the given char[].
     * Note: a package-private constructor that creates a String sharing the given array instead of copying it.
     * It is rarely used, and the unshared mode (share == false) is currently not supported.
     */
    String(char[] value, boolean share) {
        // assert share : "unshared not supported";
        this.value = value;
    }

    /**
     * Returns the length of this string.
     * Note: returns the length of the string.
     * 
     * The length is equal to the number of <a href="Character.html#unicode">Unicode
     * code units</a> in the string.
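     * Note: the length is the number of UTF-16 code units (char values), not the number of Unicode code points; a supplementary character counts as 2.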
     * 
     * @return  the length of the sequence of characters represented by this
     *          object.
     */
    public int length() {
        return value.length;
    }
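Since length() counts code units, it can differ from the number of characters a reader would count. The sketch below contrasts it with codePointCount; LengthDemo and its method names are hypothetical.

```java
class LengthDemo {
    // Number of UTF-16 code units, which is what length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the whole string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00"; // 'a' followed by U+1F600 as a surrogate pair
        System.out.println(codeUnits(s));  // 3
        System.out.println(codePoints(s)); // 2
    }
}
```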

    /**
     * Returns {@code true} if, and only if, {@link #length()} is {@code 0}.
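     * Note: returns true if and only if the string's length is 0.
     * In application code, StringUtils.isEmpty() from Apache Commons Lang 3 is often used instead, since it also accepts null.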
     *
     * @return {@code true} if {@link #length()} is {@code 0}, otherwise
     * {@code false}
     *
     * @since 1.6
     */
    public boolean isEmpty() {
        return value.length == 0;
    }
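isEmpty() is an instance method, so calling it on a null reference throws a NullPointerException. Where no utility library is available, a null-safe check can be written by hand. The helper below is a hypothetical sketch, not part of the JDK.

```java
class NullSafeEmptyDemo {
    // Hypothetical helper: treats null the same as "".
    static boolean isNullOrEmpty(String s) {
        return s == null || s.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isNullOrEmpty(null)); // true
        System.out.println(isNullOrEmpty(""));   // true
        System.out.println(isNullOrEmpty(" "));  // false (whitespace is not empty)
    }
}
```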

    /**
     * Returns the {@code char} value at the specified index.
     * Note: returns the char stored at the given index.
     * 
     * An index ranges from {@code 0} to {@code length() - 1}.
     * Note: valid indexes range from 0 to length() - 1.
     * 
     * The first {@code char} value of the sequence
     * is at index {@code 0}, the next at index {@code 1},
     * and so on, as for array indexing.
     *
     * If the {@code char} value specified by the index is a
     * <a href="Character.html#unicode">surrogate</a>, the surrogate
     * value is returned.
     * Note: if the char at the index is a surrogate, the surrogate value itself is returned.
     *
     * @param      index   the index of the {@code char} value.
     * @return     the {@code char} value at the specified index of this string.
     *             The first {@code char} value is at index {@code 0}.
     * @exception  IndexOutOfBoundsException  if the {@code index}
     *             argument is negative or not less than the length of this
     *             string.
     */
    public char charAt(int index) {
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return value[index];
    }
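The difference between charAt and codePointAt shows up as soon as a string contains a supplementary character. The following sketch walks a string by code point; CharAtDemo and codePointsOf are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class CharAtDemo {
    // Collects the code points of s, stepping over surrogate pairs.
    static List<Integer> codePointsOf(String s) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            result.add(cp);
            i += Character.charCount(cp); // 1 for BMP, 2 for supplementary
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "\uD83D\uDE00a"; // U+1F600 followed by 'a'
        System.out.println(Character.isHighSurrogate(s.charAt(0)));  // true
        System.out.println(Integer.toHexString(s.codePointAt(0)));   // 1f600
        System.out.println(codePointsOf(s).size());                  // 2
    }
}
```

An index outside 0 to length() - 1 makes charAt throw a StringIndexOutOfBoundsException, matching the check in the method body.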

    /**
     * Returns the character (Unicode code point) at the specified index. 
     * Note: returns the character at the given index as a Unicode code point.
     * The index refers to