Java Source Code Walkthrough - String

How to check a string

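Null check: compare with == null to see whether the reference points to an object at all; only an actual object can have methods called on it.
Empty-string check: equals("")
The correct order is to check for null first, then check for the empty string.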

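Java has two ways to test equality: the == operator and the equals() method. == compares references (for primitive types it compares values), so two object variables are equal only if they refer to the same object. equals(), as overridden by String, compares contents, so two strings are equal whenever their characters are equal.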
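
The checks above can be sketched as follows. This is a minimal illustration, not code from the JDK; the class name StringChecks and the helper isNullOrEmpty are hypothetical names introduced here.

```java
class StringChecks {
    // Hypothetical helper: true when s is null or the empty string.
    static boolean isNullOrEmpty(String s) {
        // The null check must come first; calling equals on null would throw.
        return s == null || s.equals("");
    }

    public static void main(String[] args) {
        String literal = "java";
        String copy = new String("java"); // forces a distinct object

        System.out.println(literal == copy);      // false: two different objects
        System.out.println(literal.equals(copy)); // true: same characters
        System.out.println(isNullOrEmpty(null));  // true
        System.out.println(isNullOrEmpty(""));    // true
        System.out.println(isNullOrEmpty("a"));   // false
    }
}
```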

Common String methods

contains(): uses indexOf() to decide whether the given sequence occurs in the string

public boolean contains(CharSequence s) {
    return indexOf(s.toString()) > -1;
}

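indexOf(): returns the index of the first occurrence of target in source, searching from a starting index

  • Returns -1: there is no match, or the starting index is past the end of source (and target is not empty)
  • Returns the starting index that was passed in: target is the empty string
  • Returns the length of source: the starting index is past the end of source and target is the empty string

For simplicity, check beforehand that target is not empty; after that it is enough to test whether the return value is less than 0.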
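
The return-value cases listed above can be reproduced with the public String.indexOf(String, int) method. The sketch below is illustrative only; the class IndexOfCases and the helper found are hypothetical names, and found simply encodes the "reject an empty target first" advice.

```java
class IndexOfCases {
    // Hypothetical helper: an empty target counts as "not found",
    // so callers only need the single >= 0 test.
    static boolean found(String source, String target, int fromIndex) {
        return !target.isEmpty() && source.indexOf(target, fromIndex) >= 0;
    }

    public static void main(String[] args) {
        System.out.println("abc".indexOf("d", 0));  // -1: no match
        System.out.println("abc".indexOf("c", 10)); // -1: start index past the end
        System.out.println("abc".indexOf("", 1));   // 1: empty target, returns start index
        System.out.println("abc".indexOf("", 10));  // 3: empty target, start past the end
        System.out.println(found("abc", "bc", 0));  // true
        System.out.println(found("abc", "", 0));    // false
    }
}
```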

indexOf ultimately calls the following function:

/**
 * Code shared by String and StringBuffer to do searches. The
 * source is the character array being searched, and the target
 * is the string being searched for.
 *
 * @param   source       the characters being searched.
 * @param   sourceOffset offset of the source string.
 * @param   sourceCount  count of the source string.
 * @param   target       the characters being searched for.
 * @param   targetOffset offset of the target string.
 * @param   targetCount  count of the target string.
 * @param   fromIndex    the index to begin searching from.
 */
static int indexOf(char[] source, int sourceOffset, int sourceCount,
        char[] target, int targetOffset, int targetCount,
        int fromIndex) {
    if (fromIndex >= sourceCount) {
        return (targetCount == 0 ? sourceCount : -1);
    }
    if (fromIndex < 0) {
        fromIndex = 0;
    }
    if (targetCount == 0) {
        return fromIndex;
    }

    char first = target[targetOffset];
    int max = sourceOffset + (sourceCount - targetCount);

    for (int i = sourceOffset + fromIndex; i <= max; i++) {
        /* Look for first character. */
        if (source[i] != first) {
            while (++i <= max && source[i] != first);
        }

        /* Found first character, now look at the rest of v2 */
        if (i <= max) {
            int j = i + 1;
            int end = j + targetCount - 1;	// i.e. i + targetCount, one past the last index of the candidate match
            for (int k = targetOffset + 1; j < end && source[j]
                    == target[k]; j++, k++);

            if (j == end) {
                /* Found whole string. */
                return i - sourceOffset;
            }
        }
    }
    return -1;
}

The offset parameters presumably exist so that a string can use a slice of a shared char[] as its value[], which saves space:

The offset is the first index of the storage that is used.

lastIndexOf works the same way but scans backward: the increments become decrements.
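
As a quick illustration of the two scan directions, the sketch below calls the public indexOf and lastIndexOf methods on the same input. The class name ScanDirection is a hypothetical name used only for this example.

```java
class ScanDirection {
    public static void main(String[] args) {
        String source = "banana"; // "an" occurs at index 1 and at index 3

        // Forward scan: reports the first occurrence.
        System.out.println(source.indexOf("an"));     // 1
        // Backward scan: reports the last occurrence.
        System.out.println(source.lastIndexOf("an")); // 3
        // Both agree when there is no occurrence at all.
        System.out.println(source.indexOf("x"));      // -1
        System.out.println(source.lastIndexOf("x"));  // -1
    }
}
```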

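compareTo(): compares two strings lexicographically. Within the shorter of the two lengths, it returns the difference of the first pair of characters that differ; if there is no such pair, it returns the difference of the lengths. A result less than 0 means this string is smaller than the argument, 0 means the two are equal, and greater than 0 means this string is larger.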

public int compareTo(String anotherString) {
    int len1 = value.length;
    int len2 = anotherString.value.length;
    int lim = Math.min(len1, len2);
    char v1[] = value;
    char v2[] = anotherString.value;

    int k = 0;
    while (k < lim) {
        char c1 = v1[k];
        char c2 = v2[k];
        if (c1 != c2) {
            return c1 - c2;
        }
        k++;
    }
    return len1 - len2;
}
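
Both return paths of compareTo can be seen with small inputs. This is an illustrative sketch; the class CompareDemo and the helper relation are hypothetical names introduced for the example.

```java
class CompareDemo {
    // Hypothetical helper: maps the compareTo result to "<", "=" or ">".
    static String relation(String a, String b) {
        int c = a.compareTo(b);
        return c < 0 ? "<" : (c > 0 ? ">" : "=");
    }

    public static void main(String[] args) {
        // First difference is at index 4: 'e' (101) - 'y' (121) = -20.
        System.out.println("apple".compareTo("apply")); // -20
        // No differing character within the shorter length: 3 - 5 = -2.
        System.out.println("abc".compareTo("abcde"));   // -2
        System.out.println("abc".compareTo("abc"));     // 0
        System.out.println(relation("b", "a"));         // >
    }
}
```

Only the sign of the result should be relied on when sorting; the magnitude is a by-product of the subtraction.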

length(): returns the length of the value char array

public int length() {
    return value.length;
}

isEmpty(): returns whether the length of value equals 0

public boolean isEmpty() {
    return value.length == 0;
}

getChars(): copies this string into dst starting at position dstBegin; this method performs no range checking

/**
 * Copy characters from this string into dst starting at dstBegin.
 * This method doesn't perform any range checking.
 */
void getChars(char dst[], int dstBegin) {
    System.arraycopy(value, 0, dst, dstBegin, value.length);
}

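concat(): appends str to the end of this string and returns the result as a new String (or this string itself when str is empty)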

public String concat(String str) {
    int otherLen = str.length();
    if (otherLen == 0) {
        return this;
    }
    int len = value.length;
    char buf[] = Arrays.copyOf(value, len + otherLen);
    str.getChars(buf, len);
    return new String(buf, true);
}

"cares".concat("s") returns "caress"
"to".concat("get").concat("her") returns "together"
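
The examples above, together with the empty-argument shortcut in the source, can be checked with a short program. The class name ConcatDemo is a hypothetical name used only here.

```java
class ConcatDemo {
    public static void main(String[] args) {
        String base = "to";
        String joined = base.concat("get").concat("her");

        System.out.println(joined); // together
        System.out.println(base);   // to: the original string is unchanged

        // An empty argument takes the early-return path and yields the same object.
        System.out.println(base.concat("") == base); // true
    }
}
```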

substring(): returns the substring that starts at beginIndex and ends at endIndex - 1

public String substring(int beginIndex, int endIndex) {
    if (beginIndex < 0) {
        throw new StringIndexOutOfBoundsException(beginIndex);
    }
    if (endIndex > value.length) {
        throw new StringIndexOutOfBoundsException(endIndex);
    }
    int subLen = endIndex - beginIndex;
    if (subLen < 0) {
        throw new StringIndexOutOfBoundsException(subLen);
    }
    return ((beginIndex == 0) && (endIndex == value.length)) ? this
            : new String(value, beginIndex, subLen);
}
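
The half-open range and the three bounds checks can be exercised as below. This is a sketch for illustration; the class SubstringDemo and the helper substringOrNull are hypothetical names, and the helper exists only to make the exception cases easy to print.

```java
class SubstringDemo {
    // Hypothetical helper: returns null instead of throwing on an invalid range.
    static String substringOrNull(String s, int begin, int end) {
        try {
            return s.substring(begin, end);
        } catch (StringIndexOutOfBoundsException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("hamburger".substring(4, 8));          // urge (indices 4..7)
        System.out.println("hamburger".substring(3, 3).isEmpty()); // true: empty range
        System.out.println(substringOrNull("hamburger", -1, 2));  // null: beginIndex < 0
        System.out.println(substringOrNull("hamburger", 0, 10));  // null: endIndex > length
        System.out.println(substringOrNull("hamburger", 5, 2));   // null: negative length
    }
}
```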

Analysis of String's hash function

/**
 * Returns a hash code for this string. The hash code for a
 * {@code String} object is computed as
 * <blockquote><pre>
 * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
 * </pre></blockquote>
 * using {@code int} arithmetic, where {@code s[i]} is the
 * <i>i</i>th character of the string, {@code n} is the length of
 * the string, and {@code ^} indicates exponentiation.
 * (The hash value of the empty string is zero.)
 *
 * @return  a hash code value for this object.
 */
public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;

        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}
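
To make the formula in the Javadoc concrete, the sketch below recomputes the hash by hand and compares it with hashCode(). The class HashDemo and the method polyHash are hypothetical names; polyHash mirrors the loop above without the cached hash field.

```java
class HashDemo {
    // Hypothetical re-implementation of the loop: h = 31 * h + c for each character.
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // "ab": 'a' * 31 + 'b' = 97 * 31 + 98 = 3105
        System.out.println(polyHash("ab"));                    // 3105
        System.out.println(polyHash("ab") == "ab".hashCode()); // true
        System.out.println("".hashCode());                     // 0
        // Different strings can collide: both evaluate to 2112.
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true
    }
}
```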

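Hash function principles and implementation: https://blog.youkuaiyun.com/unix21/article/details/8492703
Principle: when mapping 2000 elements into 1000 buckets, the ideal is a uniform distribution of two elements per bucket. Avoid a power of two as the modulus (only the low-order bits of the hash then decide the bucket, which tends to make the distribution uneven); a prime is the better choice.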
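
The effect of the modulus can be illustrated with a deliberately contrived input. In the sketch below the keys are assumed to be multiples of 8 (an assumption made for the example, not something the linked article states), and the class BucketDemo and method usedBuckets are hypothetical names.

```java
class BucketDemo {
    // Counts how many distinct buckets are hit by the keys 0, stride, 2*stride, ...
    static int usedBuckets(int keyCount, int stride, int buckets) {
        boolean[] used = new boolean[buckets];
        int count = 0;
        for (int i = 0; i < keyCount; i++) {
            int b = (i * stride) % buckets;
            if (!used[b]) {
                used[b] = true;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Power-of-two modulus: multiples of 8 can reach only 1024 / 8 buckets.
        System.out.println(usedBuckets(2000, 8, 1024)); // 128
        // Prime modulus: 8 shares no factor with 1021, so every bucket is reached.
        System.out.println(usedBuckets(2000, 8, 1021)); // 1021
    }
}
```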

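Analysis of the String hash function in Java: https://blog.youkuaiyun.com/hengyunabc/article/details/7198533
Multiplying by 31 acts as a weight: earlier characters carry more weight, so that similar strings land near each other in the hash table.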

What is the mathematical principle behind hash algorithms, and how do they keep collisions to a minimum? - answer by 灵剑 on Zhihu
https://www.zhihu.com/question/20507188/answer/112646244
