Source Code Analysis of the Java Core Class String

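This article analyzes Java's String class in detail, including its constructors, the valueOf methods, the intern mechanism, the CharSequence-related methods, the comparison methods, and core methods such as equals and hashCode. By walking through the constructors for different parameter types, it shows how String objects are created and cached.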

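Contents

Introduction

Fields

Creating a String

Constructors

Parameters: String and char

Parameters: code points

Parameters: byte

Parameters: StringBuilder and StringBuffer

valueOf

intern

Equality of Strings after creation

CharSequence methods

length and isEmpty

charAt

subSequence and substring

Comparison methods

compareTo

Case-insensitive comparison

Basic methods

toString

equals

hashCode

Getting code points, code point counts, and code point offsets

Getting the byte array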


Introduction

String is the class that represents character strings in Java.

/**
 * 
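 * <p>The String class represents character strings. All string literals in Java programs,
 * such as "abc", are implemented as instances of this class.
 * 
 * <p>Strings are constant; their values cannot be changed after they are created.
 * String buffers support mutable strings.
 * Because String objects are immutable, they can be shared. For example:
 * 
 * <blockquote><pre>
 *     String str = "abc";
 * </pre></blockquote><p>
 *  is equivalent to:
 * <blockquote><pre>
 *     char data[] = {'a', 'b', 'c'};
 *     String str = new String(data);
 * </pre></blockquote><p>
 * Here are some more examples of how strings can be used:
 * <blockquote><pre>
 *     System.out.println("abc");
 *     String cde = "cde";
 *     System.out.println("abc" + cde);
 *     String c = "abc".substring(2,3);
 *     String d = cde.substring(1, 2);
 * </pre></blockquote>
 * 
 * <p>The String class includes methods for examining individual characters of the sequence,
 * comparing strings, searching strings, extracting substrings, and creating a copy of a string
 * with all characters translated to uppercase or to lowercase.
 * Case mapping is based on the Unicode Standard version specified by the Character class.
 * 
 * <p>The Java language provides special support for the string concatenation operator (+)
 * and for conversion of other objects to strings.
 * String concatenation is implemented through the StringBuilder (or StringBuffer) class and its append method.
 * String conversions are implemented through the method toString, which is defined by Object
 * and inherited by all classes in Java. See the Java Language Specification for details.
 * 
 * <p>Unless otherwise noted, passing a null argument to a constructor or method in this class
 * causes a NullPointerException to be thrown.
 * 
 * <p>A String represents a string in the UTF-16 format, in which supplementary characters are
 * represented by surrogate pairs (see the section Unicode Character Representations in the Character class).
 * Index values refer to char code units, so a supplementary character uses two positions
 * (two code units) in a String.
 * Ordinary and supplementary characters each correspond to exactly one code point,
 * but a code point may take one or two code units.
 * In other words, a character may correspond to one or two char values, and some characters
 * occupy two positions in the char array.
 * 
 * <p>The String class provides methods for dealing with Unicode code points as well as
 * Unicode code units.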
 *
 * @author  Lee Boynton
 * @author  Arthur van Hoff
 * @author  Martin Buchholz
 * @author  Ulf Zibis
 * @see     java.lang.Object#toString()
 * @see     java.lang.StringBuffer
 * @see     java.lang.StringBuilder
 * @see     java.nio.charset.Charset
 * @since   JDK1.0
 */

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence 
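
To make the "strings are constant" statement in the class comment concrete, here is a minimal sketch that uses only standard String methods. The class name ImmutabilityDemo is a hypothetical name chosen for this illustration. Operations that look like modifications return a new object and leave the original untouched.

```java
// Hypothetical demo class; not part of the JDK.
final class ImmutabilityDemo {

    /** Applies "modifying" operations to a string and returns the original reference. */
    static String originalAfterOperations() {
        String s = "abc";
        s.toUpperCase();  // result discarded on purpose: s itself is not changed
        s.concat("def");  // result discarded on purpose: s itself is not changed
        return s;
    }

    public static void main(String[] args) {
        String s = "abc";
        String upper = s.toUpperCase();             // a separate String object
        System.out.println(s);                      // the original characters
        System.out.println(upper);                  // the upper-case copy
        System.out.println("abc".substring(2, 3));  // example taken from the class comment
    }
}
```

Because no method can change the characters of an existing String, the same instance can safely be shared between callers.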

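Fields

    /** Stores the characters of the string. Note: this is a char array and the field is final, so the reference cannot be reassigned. */
    private final char value[];

    /** Caches the hash code of the string. */
    private int hash; // defaults to 0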

    /** use serialVersionUID from JDK 1.0.2 for interoperability */
    private static final long serialVersionUID = -6849794470754667710L;

    /**
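     * The String class is special-cased within the serialization stream protocol.
     * A String instance is written to an ObjectOutputStream according to the Object Serialization Specification.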
     */
    private static final ObjectStreamField[] serialPersistentFields =
        new ObjectStreamField[0];
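
The interplay of the final value array and the lazily cached hash field can be sketched with a small stand-in class. MiniString below is a hypothetical simplification written for this article, not the JDK class; it assumes the documented String hash formula s[0]*31^(n-1) + ... + s[n-1] and treats 0 as "not computed yet".

```java
import java.util.Arrays;

// Hypothetical, simplified stand-in for String; not the JDK implementation.
final class MiniString {
    private final char[] value; // never modified after construction
    private int hash;           // cached hash code; 0 means "not computed yet"

    MiniString(char[] chars) {
        this.value = Arrays.copyOf(chars, chars.length); // defensive copy
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            for (char c : value) {
                h = 31 * h + c; // accumulates s[0]*31^(n-1) + ... + s[n-1]
            }
            hash = h; // remember the result for later calls
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof MiniString && Arrays.equals(value, ((MiniString) o).value);
    }

    public static void main(String[] args) {
        MiniString m = new MiniString("abc".toCharArray());
        System.out.println(m.hashCode() == "abc".hashCode());
    }
}
```

Caching is safe only because the characters never change; a mutable class could not reuse a previously computed hash.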

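Creating a String

Constructors

Parameters: String and char

    /**
     * Initializes a newly created String object so that it represents an empty character sequence.
     * Note that use of this constructor is unnecessary since Strings are immutable.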
     */
    public String() {
        this.value = "".value;
    }

    /**
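     * Initializes a newly created String object so that it represents the same sequence of characters as the argument.
     * In other words, the newly created string is a copy of the argument string.
     * Unless an explicit copy of original is needed, use of this constructor is unnecessary since Strings are immutable.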
     *
     * @param  original
     *         A {@code String}
     */
    public String(String original) {
        this.value = original.value;
        this.hash = original.hash;
    }

    /**
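     * Allocates a new String so that it represents the sequence of characters currently contained in the character array argument.
     * The contents of the character array are copied.
     * Subsequent modification of the character array does not affect the newly created string.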
     *
     * @param  value
     *         The initial value of the string
     */
    public String(char value[]) {
        // value becomes a freshly copied array
        this.value = Arrays.copyOf(value, value.length);
    }

    /**
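     * Allocates a new String that contains characters from a subarray of the character array argument.
     * The offset argument is the index of the first character of the subarray, and the count argument specifies the length of the subarray.
     * The contents of the subarray are copied; subsequent modification of the character array does not affect the newly created string.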
     *
     * @param  value
     *         Array that is the source of characters
     *
     * @param  offset
     *         The initial offset
     *
     * @param  count
     *         The length
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code count} arguments index
     *          characters outside the bounds of the {@code value} array
     */
    public String(char value[], int offset, int count) {
        if (offset < 0) {
            // reject offset < 0
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                // count < 0 is an error
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= value.length) {
                // count == 0 and offset <= value.length:
                // use the empty string
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            // offset + count > value.length is an error
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        // copy the selected part of the value array into a new char array
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }
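
The copying behavior of these constructors can be observed from the outside. The following is a small sketch using only public constructors; the class and method names (CharArrayConstructorDemo, copyThenMutate, slice) are hypothetical names for this illustration.

```java
// Hypothetical demo class; not part of the JDK.
final class CharArrayConstructorDemo {

    /** Builds a String from an array and then mutates the array. */
    static String copyThenMutate() {
        char[] data = {'a', 'b', 'c'};
        String s = new String(data);
        data[0] = 'X'; // does not affect s, which holds its own copy
        return s;
    }

    /** Builds a String from count chars of data, starting at offset. */
    static String slice(char[] data, int offset, int count) {
        return new String(data, offset, count);
    }

    public static void main(String[] args) {
        System.out.println(copyThenMutate());
        System.out.println(slice(new char[] {'a', 'b', 'c', 'd'}, 1, 2));

        String original = "abc";
        String copy = new String(original);
        System.out.println(copy.equals(original)); // same characters
        System.out.println(copy == original);      // new always yields a distinct object
    }
}
```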

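Parameters: code points

    /**
     * Allocates a new String that contains characters from a subarray of the Unicode code point array argument.
     * The offset argument is the index of the first code point of the subarray, and the count argument specifies the length of the subarray.
     * The contents of the subarray are converted to chars; subsequent modification of the int array does not affect the newly created string.
     *
     * @param  codePoints  Array that is the source of Unicode code points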
     *
     * @param  offset
     *         The initial offset
     *
     * @param  count
     *         The length
     *
     * @throws  IllegalArgumentException
     *          If any invalid Unicode code point is found in {@code
     *          codePoints}  
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code count} arguments index
     *          characters outside the bounds of the {@code codePoints} array
     *
     * @since  1.5
     */
    public String(int[] codePoints, int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= codePoints.length) {
                // count == 0 and offset is within range: use the empty string
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near Integer.MAX_VALUE
        if (offset > codePoints.length - count) {
            // offset + count > codePoints.length is an error
            throw new StringIndexOutOfBoundsException(offset + count);
        }

        final int end = offset + count;

        // Pass 1: compute the precise size of the char[]
        int n = count; // the initial size is count
        for (int i = offset; i < end; i++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                // a BMP code point takes one code unit, so the size is unchanged
                continue;
            else if (Character.isValidCodePoint(c))
                // a valid code point outside the BMP takes two code units, so the size grows by 1
                n++;
            // anything else is an invalid code point
            else throw new IllegalArgumentException(Integer.toString(c));
        }

        // Pass 2: allocate and fill in the char[]
        final char[] v = new char[n];

        for (int i = offset, j = 0; i < end; i++, j++) {
            int c = codePoints[i];
            if (Character.isBmpCodePoint(c))
                // a BMP code point is stored directly
                v[j] = (char)c;
            else
                // otherwise the high surrogate goes to v[j] and the low surrogate to v[j + 1]
                Character.toSurrogates(c, v, j++);
        }

        // finally, value is the filled char array
        this.value = v;
    }
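
A short sketch of the two-pass logic seen from the caller's side: one BMP code point produces one char, one supplementary code point produces a surrogate pair, and an invalid code point is rejected. The class name CodePointConstructorDemo and the helper fromCodePoints are hypothetical; U+1F600 is simply a convenient supplementary code point.

```java
// Hypothetical demo class; not part of the JDK.
final class CodePointConstructorDemo {

    /** Builds a String from all of the given code points. */
    static String fromCodePoints(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args) {
        // 0x61 is 'a' (BMP); 0x1F600 lies outside the BMP and needs two code units.
        String s = fromCodePoints(0x61, 0x1F600);
        System.out.println(s.length());                      // number of char code units
        System.out.println(s.codePointCount(0, s.length())); // number of code points

        try {
            fromCodePoints(0x110000); // above the largest valid code point 0x10FFFF
        } catch (IllegalArgumentException e) {
            System.out.println("invalid code point rejected");
        }
    }
}
```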

Parameters: byte

   /**
     * Allocates a new {@code String} constructed from a subarray of an array
     * of 8-bit integer values.
     *
     * <p> The {@code offset} argument is the index of the first byte of the
     * subarray, and the {@code count} argument specifies the length of the
     * subarray.
     *
     * <p> Each {@code byte} in the subarray is converted to a {@code char} as
     * specified in the method above.
     *
     * @deprecated This method does not properly convert bytes into characters.
     * As of JDK&nbsp;1.1, the preferred way to do this is via the
     * {@code String} constructors that take a {@link
     * java.nio.charset.Charset}, charset name, or that use the platform's
     * default charset.
     *
     * @param  ascii
     *         The bytes to be converted to characters
     *
     * @param  hibyte
     *         The top 8 bits of each 16-bit Unicode code unit
     *
     * @param  offset
     *         The initial offset
     * @param  count
     *         The length
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} or {@code count} argument is invalid
     *
     * @see  #String(byte[], int)
     * @see  #String(byte[], int, int, java.lang.String)
     * @see  #String(byte[], int, int, java.nio.charset.Charset)
     * @see  #String(byte[], int, int)
     * @see  #String(byte[], java.lang.String)
     * @see  #String(byte[], java.nio.charset.Charset)
     * @see  #String(byte[])
     */
    @Deprecated
    public String(byte ascii[], int hibyte, int offset, int count) {
        checkBounds(ascii, offset, count);
        char value[] = new char[count];

        if (hibyte == 0) {
            for (int i = count; i-- > 0;) {
                value[i] = (char)(ascii[i + offset] & 0xff);
            }
        } else {
            hibyte <<= 8;
            for (int i = count; i-- > 0;) {
                value[i] = (char)(hibyte | (ascii[i + offset] & 0xff));
            }
        }
        this.value = value;
    }

    /**
     * Allocates a new {@code String} containing characters constructed from
     * an array of 8-bit integer values. Each character <i>c</i>in the
     * resulting string is constructed from the corresponding component
     * <i>b</i> in the byte array such that:
     *
     * <blockquote><pre>
     *     <b><i>c</i></b> == (char)(((hibyte &amp; 0xff) &lt;&lt; 8)
     *                         | (<b><i>b</i></b> &amp; 0xff))
     * </pre></blockquote>
     *
     * @deprecated  This method does not properly convert bytes into
     * characters.  As of JDK&nbsp;1.1, the preferred way to do this is via the
     * {@code String} constructors that take a {@link
     * java.nio.charset.Charset}, charset name, or that use the platform's
     * default charset.
     *
     * @param  ascii
     *         The bytes to be converted to characters
     *
     * @param  hibyte
     *         The top 8 bits of each 16-bit Unicode code unit
     *
     * @see  #String(byte[], int, int, java.lang.String)
     * @see  #String(byte[], int, int, java.nio.charset.Charset)
     * @see  #String(byte[], int, int)
     * @see  #String(byte[], java.lang.String)
     * @see  #String(byte[], java.nio.charset.Charset)
     * @see  #String(byte[])
     */
    @Deprecated
    public String(byte ascii[], int hibyte) {
        this(ascii, hibyte, 0, ascii.length);
    }


    /**
     * Checks that offset and length describe a valid range within the byte array.
     * @param bytes
     * @param offset
     * @param length
     */
    private static void checkBounds(byte[] bytes, int offset, int length) {
        if (length < 0)
            throw new StringIndexOutOfBoundsException(length);
        if (offset < 0)
            throw new StringIndexOutOfBoundsException(offset);
        if (offset > bytes.length - length)
            // bytes.length must be >= offset + length
            throw new StringIndexOutOfBoundsException(offset + length);
    }
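
The range checks in this class are written as offset > length - count rather than offset + count > length. A plausible reading, consistent with the "might be near -1>>>1" note in the source, is that the subtraction form cannot overflow once offset and count are known to be non-negative. The sketch below compares the two forms; the class and method names are hypothetical.

```java
// Hypothetical demo class; not part of the JDK.
final class BoundsCheckDemo {

    /** Naive form: offset + count can overflow int and wrap to a negative value. */
    static boolean naiveOutOfBounds(int length, int offset, int count) {
        return offset + count > length;
    }

    /** Form used in the source: length - count cannot overflow for non-negative inputs. */
    static boolean safeOutOfBounds(int length, int offset, int count) {
        return offset > length - count;
    }

    public static void main(String[] args) {
        int length = 10;
        int offset = Integer.MAX_VALUE; // far outside the array
        int count = 1;
        System.out.println(naiveOutOfBounds(length, offset, count)); // overflow hides the error
        System.out.println(safeOutOfBounds(length, offset, count));  // error is detected
    }
}
```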

    /**
     * Constructs a new {@code String} by decoding the specified subarray of
     * bytes using the specified charset.  The length of the new {@code String}
     * is a function of the charset, and hence may not be equal to the length
     * of the subarray.
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the given charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  offset
     *         The index of the first byte to decode
     *
     * @param  length
     *         The number of bytes to decode

     * @param  charsetName
     *         The name of a supported {@linkplain java.nio.charset.Charset
     *         charset}
     *
     * @throws  UnsupportedEncodingException
     *          If the named charset is not supported
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code length} arguments index
     *          characters outside the bounds of the {@code bytes} array
     *
     * @since  JDK1.1
     */
    public String(byte bytes[], int offset, int length, String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null)
            throw new NullPointerException("charsetName");
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(charsetName, bytes, offset, length);
    }

    /**
     * Constructs a new {@code String} by decoding the specified subarray of
     * bytes using the specified {@linkplain java.nio.charset.Charset charset}.
     * The length of the new {@code String} is a function of the charset, and
     * hence may not be equal to the length of the subarray.
     *
     * <p> This method always replaces malformed-input and unmappable-character
     * sequences with this charset's default replacement string.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  offset
     *         The index of the first byte to decode
     *
     * @param  length
     *         The number of bytes to decode
     *
     * @param  charset
     *         The {@linkplain java.nio.charset.Charset charset} to be used to
     *         decode the {@code bytes}
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and {@code length} arguments index
     *          characters outside the bounds of the {@code bytes} array
     *
     * @since  1.6
     */
    public String(byte bytes[], int offset, int length, Charset charset) {
        if (charset == null)
            throw new NullPointerException("charset");
        checkBounds(bytes, offset, length);
        this.value =  StringCoding.decode(charset, bytes, offset, length);
    }

    /**
     * Constructs a new {@code String} by decoding the specified array of bytes
     * using the specified {@linkplain java.nio.charset.Charset charset}.  The
     * length of the new {@code String} is a function of the charset, and hence
     * may not be equal to the length of the byte array.
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the given charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  charsetName
     *         The name of a supported {@linkplain java.nio.charset.Charset
     *         charset}
     *
     * @throws  UnsupportedEncodingException
     *          If the named charset is not supported
     *
     * @since  JDK1.1
     */
    public String(byte bytes[], String charsetName)
            throws UnsupportedEncodingException {
        // decode with the specified charsetName
        this(bytes, 0, bytes.length, charsetName);
    }

    /**
     * Constructs a new {@code String} by decoding the specified array of
     * bytes using the specified {@linkplain java.nio.charset.Charset charset}.
     * The length of the new {@code String} is a function of the charset, and
     * hence may not be equal to the length of the byte array.
     *
     * <p> This method always replaces malformed-input and unmappable-character
     * sequences with this charset's default replacement string.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  charset
     *         The {@linkplain java.nio.charset.Charset charset} to be used to
     *         decode the {@code bytes}
     *
     * @since  1.6
     */
    public String(byte bytes[], Charset charset) {
        this(bytes, 0, bytes.length, charset);
    }

    /**
     * Constructs a new {@code String} by decoding the specified subarray of
     * bytes using the platform's default charset.  The length of the new
     * {@code String} is a function of the charset, and hence may not be equal
     * to the length of the subarray.
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the default charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @param  offset
     *         The index of the first byte to decode
     *
     * @param  length
     *         The number of bytes to decode
     *
     * @throws  IndexOutOfBoundsException
     *          If the {@code offset} and the {@code length} arguments index
     *          characters outside the bounds of the {@code bytes} array
     *
     * @since  JDK1.1
     */
    public String(byte bytes[], int offset, int length) {
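        // check the bounds
        checkBounds(bytes, offset, length);
        // decode the bytes into chars with the platform's default charset;
        // ISO-8859-1 is only a fallback if the default charset is unavailable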
        this.value = StringCoding.decode(bytes, offset, length);
    }

    /**
     * Constructs a new {@code String} by decoding the specified array of bytes
     * using the platform's default charset.  The length of the new {@code
     * String} is a function of the charset, and hence may not be equal to the
     * length of the byte array.
     *
     * <p> The behavior of this constructor when the given bytes are not valid
     * in the default charset is unspecified.  The {@link
     * java.nio.charset.CharsetDecoder} class should be used when more control
     * over the decoding process is required.
     *
     * @param  bytes
     *         The bytes to be decoded into characters
     *
     * @since  JDK1.1
     */
    public String(byte bytes[]) {
        this(bytes, 0, bytes.length);
    }
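
As the Javadoc above says, the length of the decoded String depends on the charset, and the Charset-based constructors replace malformed input instead of failing. The sketch below illustrates both points with StandardCharsets; the class name ByteDecodingDemo and the helper decodeUtf8 are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the JDK.
final class ByteDecodingDemo {

    /** Decodes the whole array as UTF-8 using the Charset-based constructor. */
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // E4 B8 AD is the UTF-8 encoding of the single character U+4E2D.
        byte[] utf8 = {(byte) 0xE4, (byte) 0xB8, (byte) 0xAD};
        System.out.println(decodeUtf8(utf8).length()); // three bytes decode to one char

        // ISO-8859-1 maps every byte to exactly one char.
        System.out.println(new String(utf8, StandardCharsets.ISO_8859_1).length());

        // 0xFF can never appear in UTF-8; it is replaced by U+FFFD.
        byte[] malformed = {(byte) 0xFF};
        System.out.println(decodeUtf8(malformed).equals("\uFFFD"));
    }
}
```

Passing an explicit charset keeps the result independent of the platform default, which the no-charset constructors rely on.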

Parameters: StringBuilder and StringBuffer

   /**
     * Allocates a new string that contains the sequence of characters
     * currently contained in the string buffer argument. The contents of the
     * string buffer are copied; subsequent modification of the string buffer
     * does not affect the newly created string.
     *
     * @param  buffer
     *         A {@code StringBuffer}
     */
    public String(StringBuffer buffer) {
        synchronized(buffer) {
            this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
        }
    }

    /**
     * Allocates a new string that contains the sequence of characters
     * currently contained in the string builder argument. The contents of the
     * string builder are copied; subsequent modification of the string builder
     * does not affect the newly created string.
     *
     * <p> This constructor is provided to ease migration to {@code
     * StringBuilder}. Obtaining a string from a string builder via the {@code
     * toString} method is likely to run faster and is generally preferred.
     *
     * @param   builder
     *          A {@code StringBuilder}
     *
     * @since  1.5
     */
    public String(StringBuilder builder) {
        this.value = Arrays.copyOf(builder.getValue(), builder.length());
    }
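
Both constructors copy the characters, so the resulting String is a snapshot of the builder or buffer at that moment. A minimal sketch, with the hypothetical names BuilderConstructorDemo and snapshot:

```java
// Hypothetical demo class; not part of the JDK.
final class BuilderConstructorDemo {

    /** Returns a String holding the builder's current characters. */
    static String snapshot(StringBuilder builder) {
        return new String(builder);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("abc");
        String s = snapshot(sb);
        sb.append("def");                  // later changes do not reach s
        System.out.println(s);
        System.out.println(sb.toString()); // the form the Javadoc recommends
    }
}
```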

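valueOf

These methods mostly delegate to the toString method that matches the argument's type, and in most cases the result is a newly created String.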

    /**
     * Returns the string representation of the {@code Object} argument.
     *
     * @param   obj   an {@code Object}.
     * @return  if the argument is {@code null}, then a string equal to
     *          {@code "null"}; otherwise, the value of
     *          {@code obj.toString()} is returned.
     * @see     java.lang.Object#toString()
     */
    public static String valueOf(Object obj) {
        return (obj == null) ? "null" : obj.toString();
    }

    /**
     * Returns the string representation of the {@code char} array
     * argument. The contents of the character array are copied; subsequent
     * modification of the character array does not affect the returned
     * string.
     *
     * @param   data     the character array.
     * @return  a {@code String} that contains the characters of the
     *          character array.
     */
    public static String valueOf(char data[]) {
        return new String(data);
    }

    /**
     * Returns the string representation of a specific subarray of the
     * {@code char} array argument.
     * <p>
     * The {@code offset} argument is the index of the first
     * character of the subarray. The {@code count} argument
     * specifies the length of the subarray. The contents of the subarray
     * are copied; subsequent modification of the character array does not
     * affect the returned string.
     *
     * @param   data     the character array.
     * @param   offset   initial offset of the subarray.
     * @param   count    length of the subarray.
     * @return  a {@code String} that contains the characters of the
     *          specified subarray of the character array.
     * @exception IndexOutOfBoundsException if {@code offset} is
     *          negative, or {@code count} is negative, or
     *          {@code offset+count} is larger than
     *          {@code data.length}.
     */
    public static String valueOf(char data[], int offset, int count) {
        return new String(data, offset, count);
    }

    /**
     * Equivalent to {@link #valueOf(char[], int, int)}.
     *
     * @param   data     the character array.
     * @param   offset   initial offset of the subarray.
     * @param   count    length of the subarray.
     * @return  a {@code String} that contains the characters of the
     *          specified subarray of the character array.
     * @exception IndexOutOfBoundsException if {@code offset} is
     *          negative, or {@code count} is negative, or
     *          {@code offset+count} is larger than
     *          {@code data.length}.
     */
    public static String copyValueOf(char data[], int offset, int count) {
        return new String(data, offset, count);
    }

    /**
     * Equivalent to {@link #valueOf(char[])}.
     *
     * @param   data   the character array.
     * @return  a {@code String} that contains the characters of the
     *          character array.
     */
    public static String copyValueOf(char data[]) {
        return new String(data);
    }

    /**
     * Returns the string representation of the {@code boolean} argument.
     *
     * @param   b   a {@code boolean}.
     * @return  if the argument is {@code true}, a string equal to
     *          {@code "true"} is returned; otherwise, a string equal to
     *          {@code "false"} is returned.
     */
    public static String valueOf(boolean b) {
        return b ? "true" : "false";
    }

    /**
     * Returns the string representation of the {@code char}
     * argument.
     *
     * @param   c   a {@code char}.
     * @return  a string of length {@code 1} containing
     *          as its single character the argument {@code c}.
     */
    public static String valueOf(char c) {
        char data[] = {c};
        return new String(data, true);
    }

    /**
     * Returns the string representation of the {@code int} argument.
     * <p>
     * The representation is exactly the one returned by the
     * {@code Integer.toString} method of one argument.
     *
     * @param   i   an {@code int}.
     * @return  a string representation of the {@code int} argument.
     * @see     java.lang.Integer#toString(int, int)
     */
    public static String valueOf(int i) {
        return Integer.toString(i);
    }

    /**
     * Returns the string representation of the {@code long} argument.
     * <p>
     * The representation is exactly the one returned by the
     * {@code Long.toString} method of one argument.
     *
     * @param   l   a {@code long}.
     * @return  a string representation of the {@code long} argument.
     * @see     java.lang.Long#toString(long)
     */
    public static String valueOf(long l) {
        return Long.toString(l);
    }

    /**
     * Returns the string representation of the {@code float} argument.
     * <p>
     * The representation is exactly the one returned by the
     * {@code Float.toString} method of one argument.
     *
     * @param   f   a {@code float}.
     * @return  a string representation of the {@code float} argument.
     * @see     java.lang.Float#toString(float)
     */
    public static String valueOf(float f) {
        return Float.toString(f);
    }

    /**
     * Returns the string representation of the {@code double} argument.
     * <p>
     * The representation is exactly the one returned by the
     * {@code Double.toString} method of one argument.
     *
     * @param   d   a {@code double}.
     * @return  a  string representation of the {@code double} argument.
     * @see     java.lang.Double#toString(double)
     */
    public static String valueOf(double d) {
        return Double.toString(d);
    }
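
A few of the overloads in action, as a sketch. The class name ValueOfDemo and the helper describe are hypothetical. The cast to Object matters: a bare String.valueOf(null) resolves to the char[] overload and throws a NullPointerException.

```java
// Hypothetical demo class; not part of the JDK.
final class ValueOfDemo {

    /** Null-safe conversion through the Object overload. */
    static String describe(Object obj) {
        return String.valueOf(obj);
    }

    public static void main(String[] args) {
        System.out.println(describe(null));        // the four characters n, u, l, l
        System.out.println(String.valueOf(123));   // delegates to Integer.toString
        System.out.println(String.valueOf(true));
        System.out.println(String.valueOf('c'));

        char[] data = {'a', 'b'};
        String s = String.valueOf(data);           // copies the array
        data[0] = 'z';                             // does not affect s
        System.out.println(s);
    }
}
```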

intern

    /**
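     * Returns a canonical representation for the string object.
     * <p>
     * A pool of strings, initially empty, is maintained privately by the class String.
     * <p>
     * When the intern method is invoked, if the pool already contains a string equal to this String object
     * (as determined by the equals method), then the string from the pool is returned.
     * Otherwise, this String object is added to the pool and a reference to this String object is returned.
     * <p>
     * It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.
     * <p>
     * All literal strings and string-valued constant expressions are interned.
     * String literals are defined in section 3.10.5 of the Java Language Specification.
     * <p>Note: a string enters the pool only if it is a literal or if intern is called on it. Merely creating a string with new does not put it into the pool.
     * <p>Note: for str = "a" + "bc" the compiler folds the constant expression and treats str as a literal, so "abc" is in the pool.
     * For str = "a" + new String("bc") or str = a + b (where a and b are not compile-time constants) the compiler does not fold the expression; str is a new String object and "abc" is not put into the pool.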
     * 
     *
     * @return  a string that has the same contents as this string, but is
     *          guaranteed to be from a pool of unique strings.
     */
    public native String intern();
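
The two notes above can be checked with reference comparisons. This is a small sketch with hypothetical class and method names; it relies on the language rule that constant expressions are interned while strings concatenated at run time are newly created objects.

```java
// Hypothetical demo class; not part of the JDK.
final class InternDemo {

    /** "a" + "bc" is a constant expression, so it refers to the pooled "abc". */
    static boolean constantFoldedIsPooled() {
        String literal = "abc";
        String folded = "a" + "bc";
        return literal == folded;
    }

    /** part is not final, so "a" + part is computed at run time as a new object. */
    static boolean runtimeConcatIsPooled() {
        String literal = "abc";
        String part = "bc";
        String runtime = "a" + part;
        return literal == runtime;
    }

    /** intern() maps the run-time string back to the pooled instance. */
    static boolean internedRuntimeIsPooled() {
        String part = "bc";
        String runtime = "a" + part;
        return runtime.intern() == "abc";
    }

    public static void main(String[] args) {
        System.out.println(constantFoldedIsPooled());
        System.out.println(runtimeConcatIsPooled());
        System.out.println(internedRuntimeIsPooled());
    }
}
```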

Equality of Strings after creation

package test.t05new;

public class Test1 {

	public static void main(String[] args){
		String aString="123";
		String bString="123";
		System.out.println(aString==bString);
		System.out.println(aString.equals(bString));
		System.out.println("-------------------");
		aString="123";
		bString=new String("123");
		System.out.println(aString==bString);
		System.out.println(aString.equals(bString));
		System.out.println("-------------------");
		aString=new String("123");
		bString=new String("123");
		System.out.println(aString==bString);
		System.out.println(aString.equals(bString));
		System.out.println("-------------------");
		aString=String.valueOf(123);
		bString="123";
		System.out.println(aString==bString);
		System.out.println(aString.equals(bString));
		System.out.println("-------------------");
		aString="123".intern();
		bString="123";
		System.out.println(aString==bString);
		System.out.println(aString.equals(bString));
		System.out.println("-------------------");
		aString=new String("123").intern();
		bString="123";
		System.out.println(aString==bString);
		System.out.println(aString.equals(bString));
	}
}

Output:

true
true
-------------------
false
true
-------------------
false
true
-------------------
false
true
-------------------
true
true
-------------------
true
true

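As the output shows, new String and valueOf (here String.valueOf(123)) both produce new String objects, whereas a literal "xxx" and the result of String.intern both come from the string pool.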

 

CharSequence methods

length and isEmpty

    /**
     * Returns the length of this string. The length is equal to the number of Unicode code units in the string.
     *
     * @return  the length of the sequence of characters represented by this
     *          object.
     */
    public int length() {
        return value.length;
    }

    /**
     * Returns true if, and only if, length() is 0.
     *
     * @return {@code true} if {@link #length()} is {@code 0}, otherwise
     * {@code false}
     *
     * @since 1.6
     */
    public boolean isEmpty() {
        return value.length == 0;
    }

charAt

    /**
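     * Returns the char value at the specified index. An index ranges from 0 to length() - 1.
     * The first char value of the sequence is at index 0, the next at index 1, and so on.
     * If the char value specified by the index is a surrogate (high or low), the surrogate value is returned.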
     *
     * @param      index   the index of the {@code char} value.
     * @return     the {@code char} value at the specified index of this string.
     *             The first {@code char} value is at index {@code 0}.
     * @exception  IndexOutOfBoundsException  if the {@code index}
     *             argument is negative or not less than the length of this
     *             string.
     */
    public char charAt(int index) {
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return value[index];
    }
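
Because charAt works on code units, it can return half of a surrogate pair. The sketch below shows this and one way to walk a string by code points instead, using codePointAt and Character.charCount. The class name CharAtDemo and the helper countCodePoints are hypothetical.

```java
// Hypothetical demo class; not part of the JDK.
final class CharAtDemo {

    /** Counts code points by stepping over one or two chars at a time. */
    static int countCodePoints(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i += Character.charCount(s.codePointAt(i))) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // 'a', then U+1F600 written as a surrogate pair, then 'b'
        String s = "a\uD83D\uDE00b";
        System.out.println(s.length());                             // code units
        System.out.println(Character.isHighSurrogate(s.charAt(1))); // first half of the pair
        System.out.println(Character.isLowSurrogate(s.charAt(2)));  // second half of the pair
        System.out.println(countCodePoints(s));                     // code points

        try {
            s.charAt(s.length()); // one past the last valid index
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("index out of range");
        }
    }
}
```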

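subSequence and substring

    /**
     * Returns a string that is a substring of this string.
     * The substring begins with the character at the specified beginIndex (inclusive) and extends to the end of this string.
     * The length of the substring is therefore length() - beginIndex.<p>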
     * 
     * Examples:
     * <blockquote><pre>
     * "unhappy".substring(2) returns "happy"
     * "Harbison".substring(3) returns "bison"
     * "emptiness".substring(9) returns "" (an empty string)
     * </pre></blockquote>
     *
     * @param      beginIndex   the beginning index, inclusive.
     * @return     the specified substring.
     * @exception  IndexOutOfBoundsException  if
     *             {@code beginIndex} is negative or larger than the
     *             length of this {@code String} object.
     */
    public String substring(int beginIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        int subLen = value.length - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        // if beginIndex is 0, return this;
        // otherwise copy part of value into a new array and create a new String
        return (beginIndex == 0) ? this : new String(value, beginIndex, subLen);
    }

    /**
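     * Returns a string that is a substring of this string.
     * The substring begins at the specified beginIndex (inclusive) and extends to the character at index endIndex - 1 (inclusive).
     * The length of the substring is therefore endIndex - beginIndex.
     * <p>
     * Examples: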
     * <blockquote><pre>
     * "hamburger".substring(4, 8) returns "urge"
     * "smiles".substring(1, 5) returns "mile"
     * </pre></blockquote>
     * 
     * @param      beginIndex   the beginning index, inclusive.
     * @param      endIndex     the ending index, exclusive.
     * @return     the specified substring.
     * @exception  IndexOutOfBoundsException  if the
     *             {@code beginIndex} is negative, or
     *             {@code endIndex} is larger than the length of
     *             this {@code String} object, or
     *             {@code beginIndex} is larger than
     *             {@code endIndex}.
     */
    public String substring(int beginIndex, int endIndex) {
        if (beginIndex < 0) {
            // beginIndex must be at least 0
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        if (endIndex > value.length) {
            // endIndex must be at most value.length
            throw new StringIndexOutOfBoundsException(endIndex);
        }
        int subLen = endIndex - beginIndex;
        if (subLen < 0) {
            // endIndex must be >= beginIndex
            throw new StringIndexOutOfBoundsException(subLen);
        }
        // if beginIndex == 0 and endIndex == value.length, return this;
        // otherwise copy part of value into a new array and create a new String
        return ((beginIndex == 0) && (endIndex == value.length)) ? this
                : new String(value, beginIndex, subLen);
    }

    /**
     * Returns a character sequence that is a subsequence of this sequence.
     *
     * <p> An invocation of this method of the form
     *
     * <blockquote><pre>
     * str.subSequence(begin,&nbsp;end)</pre></blockquote>
     *
     * behaves in exactly the same way as the invocation
     *
     * <blockquote><pre>
     * str.substring(begin,&nbsp;end)</pre></blockquote>
     *
     * @apiNote
     * This method is defined so that the {@code String} class can implement
     * the {@link CharSequence} interface.
     *
     * @param   beginIndex   the begin index, inclusive.
     * @param   endIndex     the end index, exclusive.
     * @return  the specified subsequence.
     *
     * @throws  IndexOutOfBoundsException
     *          if {@code beginIndex} or {@code endIndex} is negative,
     *          if {@code endIndex} is greater than {@code length()},
     *          or if {@code beginIndex} is greater than {@code endIndex}
     *
     * @since 1.4
     * @spec JSR-51
     */
    public CharSequence subSequence(int beginIndex, int endIndex) {
        return this.substring(beginIndex, endIndex);
    }
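
The bounds checks and the delegation from subSequence to substring can be tried out with a short sketch. The class name SubstringDemo and the helper rejects are made up for this illustration; returning the same object for the full range is what the implementation shown above does, not something the Javadoc promises.

```java
class SubstringDemo {
    // Hypothetical helper: true when substring rejects the range
    // with StringIndexOutOfBoundsException.
    static boolean rejects(String s, int beginIndex, int endIndex) {
        try {
            s.substring(beginIndex, endIndex);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String s = "hamburger";
        // beginIndex is inclusive, endIndex is exclusive
        System.out.println(s.substring(4, 8));
        // subSequence simply forwards to substring
        System.out.println(s.subSequence(4, 8));
        // The full range hands back the same object in the code shown above
        System.out.println(s.substring(0, s.length()) == s);
        // The three rejected cases: begin < 0, end > length, begin > end
        System.out.println(rejects(s, -1, 3));
        System.out.println(rejects(s, 0, 10));
        System.out.println(rejects(s, 5, 4));
    }
}
```

An empty range such as substring(3, 3) on a three-character string is legal and yields the empty string, because only beginIndex > endIndex is rejected.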

Comparison methods

compareTo

    /**
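     * Compares two strings lexicographically.
     * The comparison is based on the Unicode value of each character in the strings.
     * If this string lexicographically precedes the argument string, the result is a negative integer.
     * If this string lexicographically follows the argument string, the result is a positive integer.
     * If the strings are equal, the result is 0;
     * compareTo returns 0 exactly when equals(Object) would return true.
     * 
     * <p>
     * This is the definition of lexicographic ordering. If two strings differ, then they have different characters at some index, or their lengths differ, or both.
     * If they differ at one or more index positions, let k be the smallest such index; the string whose character at position k has the smaller value,
     * as determined by the &lt; operator, lexicographically precedes the other string.
     * In this case, compareTo returns the difference of the two char values at position k, that is:
     * 
     * <blockquote><pre>
     * this.charAt(k)-anotherString.charAt(k)
     * </pre></blockquote>
     * 
     * If there is no index position at which they differ, the shorter string lexicographically precedes the longer string.
     * In this case, compareTo returns the difference of the lengths of the strings, that is:
     * <blockquote><pre>
     * this.length()-anotherString.length()
     * </pre></blockquote>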
     *
     * @param   anotherString   the {@code String} to be compared.
     * @return  the value {@code 0} if the argument string is equal to
     *          this string; a value less than {@code 0} if this string
     *          is lexicographically less than the string argument; and a
     *          value greater than {@code 0} if this string is
     *          lexicographically greater than the string argument.
     */
    public int compareTo(String anotherString) {
        int len1 = value.length;
        int len2 = anotherString.value.length;
        int lim = Math.min(len1, len2);
        char v1[] = value;
        char v2[] = anotherString.value;

        int k = 0;
        while (k < lim) {
            char c1 = v1[k];
            char c2 = v2[k];
            if (c1 != c2) {
                // first position that differs: return this.charAt(k)-anotherString.charAt(k)
                return c1 - c2;
            }
            k++;
        }
        // otherwise return this.length()-anotherString.length()
        // a result of 0 means every char matched and the lengths are equal, so equals() is true as well
        return len1 - len2;
    }
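
The two return paths can be checked with a small re-implementation. This is only a sketch that mirrors the loop above through the public charAt and length methods; the names CompareToDemo and compareChars are invented for the example.

```java
class CompareToDemo {
    // Sketch of the same algorithm using only public String methods.
    static int compareChars(String a, String b) {
        int lim = Math.min(a.length(), b.length());
        for (int k = 0; k < lim; k++) {
            char c1 = a.charAt(k);
            char c2 = b.charAt(k);
            if (c1 != c2) {
                // first differing position decides the result
                return c1 - c2;
            }
        }
        // one string is a prefix of the other (or they are equal)
        return a.length() - b.length();
    }

    public static void main(String[] args) {
        // differs at k = 2: 'p' (112) - 'r' (114)
        System.out.println("apple".compareTo("apricot"));
        // no differing position: 3 - 5
        System.out.println("abc".compareTo("abcde"));
        // the sketch agrees with the built-in method
        System.out.println(compareChars("apple", "apricot") == "apple".compareTo("apricot"));
    }
}
```

The result is a difference rather than a fixed -1/0/1, so callers should test only its sign.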

Case-insensitive comparison

    /**
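     * A case-insensitive Comparator for String objects. This comparator is serializable.
     * Note: this comparator does not take locale into account, and will result in an unsatisfactory ordering for certain locales.
     * The java.text package provides Collators to allow locale-sensitive ordering.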
     *
     * @see     java.text.Collator#compare(String, String)
     * @since   1.2
     */
    public static final Comparator<String> CASE_INSENSITIVE_ORDER
                                         = new CaseInsensitiveComparator();
    private static class CaseInsensitiveComparator
            implements Comparator<String>, java.io.Serializable {
        // use serialVersionUID from JDK 1.2.2 for interoperability
        private static final long serialVersionUID = 8575799808933029326L;

        public int compare(String s1, String s2) {
            int n1 = s1.length();
            int n2 = s2.length();
            int min = Math.min(n1, n2);
            for (int i = 0; i < min; i++) {
                char c1 = s1.charAt(i);
                char c2 = s2.charAt(i);
                if (c1 != c2) {
                    c1 = Character.toUpperCase(c1);
                    c2 = Character.toUpperCase(c2);                   
                    if (c1 != c2) {
                        // still different after upper-casing, so compare the lower-cased forms as well
                        c1 = Character.toLowerCase(c1);
                        c2 = Character.toLowerCase(c2);
                        if (c1 != c2) {
                            // No overflow because of numeric promotion
                            // the lower-cased chars differ too: return their difference
                            return c1 - c2;
                        }
                    }
                }
            }
            // all compared chars matched: return the length difference
            return n1 - n2;
        }

        /** Replaces the de-serialized object. */
        private Object readResolve() { return CASE_INSENSITIVE_ORDER; }
    }

    /**
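     * Compares two strings lexicographically, ignoring case differences.
     * Case differences are eliminated by calling Character.toLowerCase(Character.toUpperCase(character)) on each character.
     * 
     * <p>
     * Note: this method does not take locale into account, and will result in an unsatisfactory ordering for certain locales.
     * The java.text package provides Collators to allow locale-sensitive ordering.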
     *
     * @param   str   the {@code String} to be compared.
     * @return  a negative integer, zero, or a positive integer as the
     *          specified String is greater than, equal to, or less
     *          than this String, ignoring case considerations.
     * @see     java.text.Collator#compare(String, String)
     * @since   1.2
     */
    public int compareToIgnoreCase(String str) {
        return CASE_INSENSITIVE_ORDER.compare(this, str);
    }
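
A short sketch of how the comparator changes sort order compared with the natural (case-sensitive) ordering. The class name IgnoreCaseDemo and the helper sortedIgnoringCase are assumptions made for this example.

```java
import java.util.Arrays;

class IgnoreCaseDemo {
    // Hypothetical helper: returns a sorted copy, leaving the input untouched.
    static String[] sortedIgnoringCase(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy, String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {"banana", "apple", "Cherry"};

        // Natural order compares raw char values, so 'C' (67) sorts before 'a' (97)
        String[] natural = words.clone();
        Arrays.sort(natural);
        System.out.println(Arrays.toString(natural));

        // Case-insensitive order folds case before comparing
        System.out.println(Arrays.toString(sortedIgnoringCase(words)));

        // compareToIgnoreCase delegates to the same comparator
        System.out.println("Hello".compareToIgnoreCase("hello"));
    }
}
```

For user-visible sorting in a specific language, java.text.Collator is the better fit, as the Javadoc above notes.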

Basic methods

toString

    /**
     * Returns this object itself (it is already a string).
     *
     * @return  the string itself.
     */
    public String toString() {
        return this;
    }

equals

    /**
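     * Compares this string to the specified object. The result is true if and only if the argument
     * is not null and is a String object that represents the same sequence of characters as this object.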
     *
     * @param  anObject
     *         The object to compare this {@code String} against
     *
     * @return  {@code true} if the given object represents a {@code String}
     *          equivalent to this string, {@code false} otherwise
     *
     * @see  #compareTo(String)
     * @see  #equalsIgnoreCase(String)
     */
    public boolean equals(Object anObject) {
        if (this == anObject) {
            // compare references first
            return true;
        }
        if (anObject instanceof String) {
            // the argument is a String
            String anotherString = (String)anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                // both strings have the same length
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        // a single differing char means the strings are not equal
                        return false;
                    i++;
                }
                // every char matched: return true
                return true;
            }
        }
        return false;
    }
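
The order of the checks (reference, type, length, then chars) can be mirrored in a sketch built on public methods only. EqualsDemo and equalsSketch are names invented for this illustration and are not part of the JDK.

```java
class EqualsDemo {
    // Sketch of the same decision sequence as String.equals.
    static boolean equalsSketch(String self, Object other) {
        if (self == other) {
            return true;                       // same reference
        }
        if (!(other instanceof String)) {
            return false;                      // null or a different type
        }
        String that = (String) other;
        if (self.length() != that.length()) {
            return false;                      // different lengths
        }
        for (int i = 0; i < self.length(); i++) {
            if (self.charAt(i) != that.charAt(i)) {
                return false;                  // first differing char
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String literal = "abc";
        String copy = new String(new char[] {'a', 'b', 'c'});
        System.out.println(literal == copy);        // different objects
        System.out.println(literal.equals(copy));   // same character sequence
        // A StringBuilder with the same chars is not a String, so equals is false
        System.out.println(literal.equals(new StringBuilder("abc")));
        // contentEquals is the method that compares against any CharSequence
        System.out.println(literal.contentEquals(new StringBuilder("abc")));
    }
}
```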

hashCode

    /**
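     * Returns a hash code for this string. The hash code of a string is computed as
     * <blockquote><pre>
     * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
     * </pre></blockquote>
     * using int arithmetic, where s[i] is the i-th character of the string, n is the length of
     * the string, and ^ indicates exponentiation.
     * (The hash value of the empty string is zero.)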
     *
     * @return  a hash code value for this object.
     */
    public int hashCode() {
        int h = hash;
        if (h == 0 && value.length > 0) {
            // the hash has not been cached yet (still 0) and the string is non-empty
            char val[] = value;

            for (int i = 0; i < value.length; i++) {
                // each step: previous hash * 31 + the current char
                h = 31 * h + val[i];
            }
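            // cache the result in hash; a string whose computed hash happens to be 0
            // still looks "not cached" and is recomputed on every call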
            hash = h;
        }
        return h;
    }
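
The loop is Horner's rule applied to the polynomial in the Javadoc. As a worked example, "abc" gives ((0*31 + 97)*31 + 98)*31 + 99 = 97*961 + 98*31 + 99 = 96354. The sketch below recomputes the value through public methods; HashCodeDemo and polynomialHash are hypothetical names.

```java
class HashCodeDemo {
    // Sketch: evaluates s[0]*31^(n-1) + ... + s[n-1] with Horner's rule.
    // int overflow wraps around, exactly as in the method above.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(polynomialHash("abc"));
        System.out.println("abc".hashCode());
        System.out.println(polynomialHash("") == "".hashCode());
    }
}
```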

Code points, code point counts, and code point offsets

    /**
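     * <p> Returns the character (Unicode code point) at the specified index.
     * The index refers to char values (code units) and ranges from 0 to length()-1.
     * 
     * <p> If the char value at the given index is in the high-surrogate range,
     * the following index is less than length(), and the char value at the following index is in the low-surrogate range,
     * then the supplementary code point corresponding to this surrogate pair is returned.
     * Otherwise, the char value at the given index is returned.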
     *
     * @param      index the index to the {@code char} values
     * @return     the code point value of the character at the
     *             {@code index}
     * @exception  IndexOutOfBoundsException  if the {@code index}
     *             argument is negative or not less than the length of this
     *             string.
     * @since      1.5
     */
    public int codePointAt(int index) {
        if ((index < 0) || (index >= value.length)) {
        	//0<=index<value.length
            throw new StringIndexOutOfBoundsException(index);
        }
        /*static int codePointAtImpl(char[] a, int index, int limit) {
            char c1 = a[index];
            if (isHighSurrogate(c1) && ++index < limit) {
                // c1 is a high surrogate and there is a following char
                char c2 = a[index];
                if (isLowSurrogate(c2)) {
                    // the following char is a low surrogate:
                    // return the code point that the surrogate pair encodes
                    return toCodePoint(c1, c2);
                }
            }
            return c1;
        }*/
        return Character.codePointAtImpl(value, index, value.length);
    }

    /**
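     * <p>Returns the character (Unicode code point) before the specified index. The index ranges from 1 to length().
     * 
     * <p>If the char value at index-1 is in the low-surrogate range, index-2 is not negative, and the char value at index-2 is in the high-surrogate range,
     * then the supplementary code point corresponding to this surrogate pair is returned.
     * If the char value at index-1 is an unpaired low surrogate or a high surrogate, the surrogate value itself is returned.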
     *
     * @param     index the index following the code point that should be returned
     * @return    the Unicode code point value before the given index.
     * @exception IndexOutOfBoundsException if the {@code index}
     *            argument is less than 1 or greater than the length
     *            of this string.
     * @since     1.5
     */
    public int codePointBefore(int index) {
        int i = index - 1;
        if ((i < 0) || (i >= value.length)) {
        	//1<=index<=length
            throw new StringIndexOutOfBoundsException(index);
        }
        /*static int codePointBeforeImpl(char[] a, int index, int start) {
            char c2 = a[--index];
            if (isLowSurrogate(c2) && index > start) {
                // the char at index-1 is a low surrogate
                char c1 = a[--index];
                if (isHighSurrogate(c1)) {
                    // the char at index-2 is a high surrogate:
                    // return the code point that the pair encodes
                    return toCodePoint(c1, c2);
                }
            }
            return c2;
        }*/
        return Character.codePointBeforeImpl(value, index, 0);
    }

    /**
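     * Returns the number of Unicode code points in the specified text range of this String.
     * The text range begins at beginIndex (inclusive) and extends to the char at endIndex-1 (inclusive).
     * Thus the length (in chars) of the text range is endIndex-beginIndex.
     * Unpaired surrogates within the text range count as one code point each
     * (a valid surrogate pair, two chars, also counts as one code point).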
     * 
     * @param beginIndex the index to the first {@code char} of
     * the text range.
     * @param endIndex the index after the last {@code char} of
     * the text range.
     * @return the number of Unicode code points in the specified text
     * range
     * @exception IndexOutOfBoundsException if the
     * {@code beginIndex} is negative, or {@code endIndex}
     * is larger than the length of this {@code String}, or
     * {@code beginIndex} is larger than {@code endIndex}.
     * @since  1.5
     */
    public int codePointCount(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > value.length || beginIndex > endIndex) {
            throw new IndexOutOfBoundsException();
        }
        /*static int codePointCountImpl(char[] a, int offset, int count) {
            int endIndex = offset + count;
            int n = count;
            for (int i = offset; i < endIndex; ) {
                if (isHighSurrogate(a[i++]) && i < endIndex &&
                    isLowSurrogate(a[i])) {
                    // a valid surrogate pair: two chars but one code point, so n--
                    n--;
                    i++;
                }
            }
            return n;
        }*/
        return Character.codePointCountImpl(value, beginIndex, endIndex - beginIndex);
    }

    /**
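     * Returns the index within this String that is offset from the given index by codePointOffset code points.
     * Unpaired surrogates within the text range given by index and codePointOffset count as one code point each.
     * 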
     *
     * @param index the index to be offset
     * @param codePointOffset the offset in code points
     * @return the index within this {@code String}
     * @exception IndexOutOfBoundsException if {@code index}
     *   is negative or larger then the length of this
     *   {@code String}, or if {@code codePointOffset} is positive
     *   and the substring starting with {@code index} has fewer
     *   than {@code codePointOffset} code points,
     *   or if {@code codePointOffset} is negative and the substring
     *   before {@code index} has fewer than the absolute value
     *   of {@code codePointOffset} code points.
     * @since 1.5
     */
    public int offsetByCodePoints(int index, int codePointOffset) {
        if (index < 0 || index > value.length) {
            throw new IndexOutOfBoundsException();
        }
              
        return Character.offsetByCodePointsImpl(value, 0, value.length,
                index, codePointOffset);
    }
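
The difference between char indexes and code points is easiest to see on a string that contains one supplementary character. The sketch below uses U+1F600, which UTF-16 encodes as the surrogate pair \uD83D \uDE00; the class name CodePointDemo and the constant SAMPLE are made up for the example.

```java
class CodePointDemo {
    // "a", U+1F600 (two chars: high surrogate D83D, low surrogate DE00), "b"
    static final String SAMPLE = "a\uD83D\uDE00b";

    public static void main(String[] args) {
        String s = SAMPLE;
        // length() counts chars (code units), not code points
        System.out.println(s.length());
        System.out.println(s.codePointCount(0, s.length()));

        // Index 1 is the high surrogate of a valid pair: the whole code point comes back
        System.out.println(Integer.toHexString(s.codePointAt(1)));
        // Index 2 is a low surrogate on its own: only that char value comes back
        System.out.println(Integer.toHexString(s.codePointAt(2)));
        // The code point that ends just before index 3 is the same pair
        System.out.println(Integer.toHexString(s.codePointBefore(3)));

        // Moving two code points forward from index 0 skips 'a' and the pair
        System.out.println(s.offsetByCodePoints(0, 2));
        // Moving one code point backward from the end skips 'b'
        System.out.println(s.offsetByCodePoints(s.length(), -1));
    }
}
```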

Getting the byte array

    /**
     * Encodes this {@code String} into a sequence of bytes using the named
     * charset, storing the result into a new byte array.
     *
     * <p> The behavior of this method when this string cannot be encoded in
     * the given charset is unspecified.  The {@link
     * java.nio.charset.CharsetEncoder} class should be used when more control
     * over the encoding process is required.
     *
     * @param  charsetName
     *         The name of a supported {@linkplain java.nio.charset.Charset
     *         charset}
     *
     * @return  The resultant byte array
     *
     * @throws  UnsupportedEncodingException
     *          If the named charset is not supported
     *
     * @since  JDK1.1
     */
    public byte[] getBytes(String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null) throw new NullPointerException();
        return StringCoding.encode(charsetName, value, 0, value.length);
    }

    /**
     * Encodes this {@code String} into a sequence of bytes using the given
     * {@linkplain java.nio.charset.Charset charset}, storing the result into a
     * new byte array.
     *
     * <p> This method always replaces malformed-input and unmappable-character
     * sequences with this charset's default replacement byte array.  The
     * {@link java.nio.charset.CharsetEncoder} class should be used when more
     * control over the encoding process is required.
     *
     * @param  charset
     *         The {@linkplain java.nio.charset.Charset} to be used to encode
     *         the {@code String}
     *
     * @return  The resultant byte array
     *
     * @since  1.6
     */
    public byte[] getBytes(Charset charset) {
        if (charset == null) throw new NullPointerException();
        return StringCoding.encode(charset, value, 0, value.length);
    }

    /**
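     * Encodes this String into a sequence of bytes using the platform's default charset,
     * storing the result into a new byte array.
     * (The default charset depends on the platform and JVM settings; it is not necessarily UTF-8.)
     * The behavior of this method when this string cannot be encoded in the default charset is unspecified.
     * The CharsetEncoder class should be used when more control over the encoding process is required.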
     *
     * @return  The resultant byte array
     *
     * @since      JDK1.1
     */
    public byte[] getBytes() {
        return StringCoding.encode(value, 0, value.length);
    }
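
A brief sketch of how the three overloads behave. It passes explicit charsets so the output does not depend on the platform default. The class name GetBytesDemo, the helper isUnsupported, and the charset name "no-such-charset" are assumptions chosen for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

class GetBytesDemo {
    // Hypothetical helper: true when getBytes(String) rejects the charset name.
    static boolean isUnsupported(String charsetName) {
        try {
            "x".getBytes(charsetName);
            return false;
        } catch (UnsupportedEncodingException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String s = "h\u00E9"; // 'h' followed by e-acute (U+00E9)

        // UTF-8 needs two bytes for U+00E9, ISO-8859-1 needs one
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);
        System.out.println(s.getBytes(StandardCharsets.ISO_8859_1).length);

        // getBytes(Charset) replaces unmappable characters; for US-ASCII that is '?'
        byte[] ascii = s.getBytes(StandardCharsets.US_ASCII);
        System.out.println((char) ascii[1]);

        // getBytes(String) reports an unknown charset name with a checked exception
        System.out.println(isUnsupported("no-such-charset"));
    }
}
```

Passing a Charset constant avoids both the checked exception and the dependence on the platform default that the no-argument overload has.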
