Source Code of String's contains Method


Sun_hxx

Article link: https://blog.youkuaiyun.com/Sun_hxx/article/details/115002371

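This article walks through the source of the contains method in Java's String class and analyzes how the underlying indexOf method is implemented, including the role of its key parameters and how it avoids unnecessary comparisons. Reading the source shows how a substring search works and where a hand-written version can be tightened.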


public static void main(String[] args) {
    String str = "123";
    System.out.println(str.contains("12"));
}

Let's step into the contains method:

public boolean contains(CharSequence s) {
    return indexOf(s.toString()) > -1;
}

The parameter is a CharSequence. For background on the CharSequence type, see this post:
https://blog.youkuaiyun.com/yaomingyang/article/details/79253470
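Because the parameter is a CharSequence, contains accepts any implementation of that interface, not only String. The sketch below illustrates this with a StringBuilder. The class name CharSequenceContainsDemo and the helper containsViaIndexOf are hypothetical names made up for this example; the helper simply mirrors the one-line body shown above.

```java
class CharSequenceContainsDemo {
    // Hypothetical helper mirroring the contains body above:
    // convert the argument to a String, then search with indexOf.
    static boolean containsViaIndexOf(String source, CharSequence s) {
        return source.indexOf(s.toString()) > -1;
    }

    public static void main(String[] args) {
        String str = "123";
        // StringBuilder implements CharSequence, so it can be passed directly.
        CharSequence fromBuilder = new StringBuilder("12");

        System.out.println(str.contains(fromBuilder));            // true
        System.out.println(containsViaIndexOf(str, fromBuilder)); // true
        System.out.println(str.contains("13"));                   // false
    }
}
```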
Next, look at the indexOf method:

public int indexOf(String str) {
    return indexOf(str, 0);
}

It takes a string and calls another indexOf overload, passing that string along with 0:

public int indexOf(String str, int fromIndex) {
    return indexOf(value, 0, value.length,
            str.value, 0, str.value.length, fromIndex);
}
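The fromIndex argument is what lets a caller resume a search after an earlier hit. As a rough illustration using only the public String.indexOf(String, int) method, the sketch below counts non-overlapping occurrences. The class FromIndexDemo and the helper countOccurrences are hypothetical names for this example and are not part of the JDK.

```java
class FromIndexDemo {
    // Hypothetical helper: counts non-overlapping occurrences of target
    // by restarting the search just past each hit.
    static int countOccurrences(String source, String target) {
        if (target.isEmpty()) {
            return 0; // an empty target would otherwise never advance
        }
        int count = 0;
        int from = 0;
        while ((from = source.indexOf(target, from)) != -1) {
            count++;
            from += target.length();
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("abcabc".indexOf("bc"));            // 1
        System.out.println("abcabc".indexOf("bc", 2));         // 4
        System.out.println(countOccurrences("abcabc", "bc"));  // 2
    }
}
```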

Now look at the overloaded indexOf that does the actual work:

/**
     * Code shared by String and StringBuffer to do searches. The
     * source is the character array being searched, and the target
     * is the string being searched for.
     *
     * @param   source       the characters being searched.
     * @param   sourceOffset offset of the source string.
     * @param   sourceCount  count of the source string.
     * @param   target       the characters being searched for.
     * @param   targetOffset offset of the target string.
     * @param   targetCount  count of the target string.
     * @param   fromIndex    the index to begin searching from.
     */
static int indexOf(char[] source, int sourceOffset, int sourceCount,
            char[] target, int targetOffset, int targetCount,
            int fromIndex) {
        if (fromIndex >= sourceCount) {
            return (targetCount == 0 ? sourceCount : -1);
        }
        if (fromIndex < 0) {
            fromIndex = 0;
        }
        if (targetCount == 0) {
            return fromIndex;
        }

        char first = target[targetOffset];
        int max = sourceOffset + (sourceCount - targetCount);

        for (int i = sourceOffset + fromIndex; i <= max; i++) {
            /* Look for first character. */
            if (source[i] != first) {
                while (++i <= max && source[i] != first);
            }

            /* Found first character, now look at the rest of v2 */
            if (i <= max) {
                int j = i + 1;
                int end = j + targetCount - 1;
                for (int k = targetOffset + 1; j < end && source[j]
                        == target[k]; j++, k++);

                if (j == end) {
                    /* Found whole string. */
                    return i - sourceOffset;
                }
            }
        }
        return -1;
    }
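The three guard clauses at the top of this method decide the result for out-of-range start positions and for an empty target before any character is compared. A small sketch of how they surface through the public API follows; the class name IndexOfEdgeCases is made up for this example.

```java
class IndexOfEdgeCases {
    public static void main(String[] args) {
        String s = "abc";

        // fromIndex >= sourceCount with an empty target: returns sourceCount.
        System.out.println(s.indexOf("", 5));   // 3
        // fromIndex >= sourceCount with a non-empty target: returns -1.
        System.out.println(s.indexOf("c", 5));  // -1
        // A negative fromIndex is treated as 0.
        System.out.println(s.indexOf("b", -4)); // 1
        // An empty target matches at fromIndex itself.
        System.out.println(s.indexOf("", 1));   // 1
        // Consequently every string contains the empty string.
        System.out.println(s.contains(""));     // true
    }
}
```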

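Before going through this core code in detail, let's think about how we would implement the feature ourselves. Afterwards we can compare our code with the JDK's and see what can be improved.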

public class Contain {
    public static void main(String[] args) {
        System.out.println(contain("strs", "tr"));
    }

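    /**
     * Checks whether source contains target as a contiguous substring.
     *
     * Approach: two nested loops. The outer loop walks the characters of source and
     * the inner loop walks the characters of target. When a source character matches
     * the first target character, remember that source position as a and keep
     * comparing the following characters, counting consecutive matches in k. On a
     * mismatch, reset k and resume from position a + 1 in source. Stop when a reaches
     * source.length (result false) or when k reaches target.length (result true).
     *
     * Written 2021/3/19 22:57 by xx.huang.
     *
     * @param source the string to search in
     * @param target the string to search for
     * @return true if source contains target
     */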
    public static boolean contain(String source, String target) {
        if (source == null || target == null) {
            return false;
        }
        // A source shorter than the target can never contain it
        if (source.length() < target.length()) {
            return false;
        }
        // Same length but different content: no match
        if (source.length() == target.length() && !source.equals(target)) {
            return false;
        }
        if (source.equals(target)) {
            return true;
        }
        char[] sourceChars = source.toCharArray();
        char[] targetChars = target.toCharArray();
        int k = 0;
        // Walk the source; at each start position i, compare against the target until a mismatch
        for (int i = 0; i < sourceChars.length; i++) {
            // Reset the match counter for every new start position
            k = 0;
            // b starts at i and advances together with j; b reaching sourceChars.length means the source is exhausted
            for (int j = 0, b = i; j < targetChars.length && b < sourceChars.length; j++, b++) {
                if (sourceChars[b] == targetChars[j]) {
                    // k counts consecutive matching characters
                    k++;
                    // Every target character matched consecutively, so source contains target
                    if (k == targetChars.length) {
                        return true;
                    }
                } else {
                    // Mismatch: abandon this start position and move on to the next one
                    break;
                }
            }
        }
        return false;
    }
}

The approach is described in the comments. Now let's study the JDK source.

public int indexOf(String str, int fromIndex) {
    return indexOf(value, 0, value.length,
            str.value, 0, str.value.length, fromIndex);
}
/**
     * Code shared by String and StringBuffer to do searches. The
     * source is the character array being searched, and the target
     * is the string being searched for.
     *
     * @param   source       the characters being searched.
     * @param   sourceOffset offset of the source string.
     * @param   sourceCount  count of the source string.
     * @param   target       the characters being searched for.
     * @param   targetOffset offset of the target string.
     * @param   targetCount  count of the target string.
     * @param   fromIndex    the index to begin searching from.
     */
static int indexOf(char[] source, int sourceOffset, int sourceCount,
            char[] target, int targetOffset, int targetCount,
            int fromIndex) {
        if (fromIndex >= sourceCount) {
            return (targetCount == 0 ? sourceCount : -1);
        }
        if (fromIndex < 0) {
            fromIndex = 0;
        }
        if (targetCount == 0) {
            return fromIndex;
        }

        char first = target[targetOffset];
        int max = sourceOffset + (sourceCount - targetCount);

        for (int i = sourceOffset + fromIndex; i <= max; i++) {
            /* Look for first character. */
            if (source[i] != first) {
                while (++i <= max && source[i] != first);
            }

            /* Found first character, now look at the rest of v2 */
            if (i <= max) {
                int j = i + 1;
                int end = j + targetCount - 1;
                for (int k = targetOffset + 1; j < end && source[j]
                        == target[k]; j++, k++);

                if (j == end) {
                    /* Found whole string. */
                    return i - sourceOffset;
                }
            }
        }
        return -1;
    }

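First, the meaning of the parameters:
source: the character array being searched
sourceOffset: the start offset within source, 0 here
sourceCount: the number of characters in source
target: the characters being searched for
targetOffset: the start offset within target, also 0 here
targetCount: the number of characters in target
fromIndex: the index at which the search starts; 0 here, because our scenario searches from index 0
So in this method we mainly need to follow source, sourceCount, target, and targetCount.
Let's skip straight to this line: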

char first = target[targetOffset];

first holds the first character of the string being searched for. Moving on:

int max = sourceOffset + (sourceCount - targetCount);

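In our scenario max is simply sourceCount - targetCount, the difference between the two lengths. The for loop below runs while i <= max. Why? Take a source of length 10 and a target of length 8: max is 2, so if none of the first three characters of source (indices 0 to 2) is the first character of target, no later position can produce a match either, because fewer than 8 characters remain. There is no point in looking further, so the loop stops. This skips comparisons that cannot succeed and also keeps the later source[j] accesses within bounds. Now look at the code inside the for loop: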
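To make the length-10 and length-8 example concrete, the sketch below computes the same bound and checks it against String.indexOf. The class MaxBoundDemo, the helper lastCandidateStart, and the sample strings are all hypothetical, chosen only to match the numbers in the example, and both offsets are assumed to be 0.

```java
class MaxBoundDemo {
    // Hypothetical helper: the last index at which a match could still start,
    // assuming both offsets are 0. The result is negative when the target is
    // longer than the source, in which case the search loop never runs.
    static int lastCandidateStart(int sourceCount, int targetCount) {
        return sourceCount - targetCount;
    }

    public static void main(String[] args) {
        String source = "abcdefghij"; // length 10
        String target = "cdefghij";   // length 8

        int max = lastCandidateStart(source.length(), target.length());
        System.out.println(max);                    // 2
        System.out.println(source.indexOf(target)); // 2, found at the last candidate

        // An 8-character target starting at index 3 would need source[10],
        // which does not exist, so index 3 is never even tried.
        System.out.println(source.indexOf("defghijk")); // -1
    }
}
```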

/* Look for first character. */
if (source[i] != first) {
    while (++i <= max && source[i] != first);
}

As the comment says, this code scans source for the first character of target.

/* Found first character, now look at the rest of v2 */
if (i <= max) {
    int j = i + 1;
    int end = j + targetCount - 1;
    for (int k = targetOffset + 1; j < end && source[j]
            == target[k]; j++, k++);

    if (j == end) {
        /* Found whole string. */
        return i - sourceOffset;
    }
}

If i <= max holds after the scan, the first character was found, so the characters that follow it must be checked as well.

int j = i + 1; // j is the position of the next source character to compare (see source[j] below)
int end = j + targetCount - 1; // end equals i + targetCount, one past the last source index the match would occupy; j == end means the whole target was found

Now look at the inner for loop:

for (int k = targetOffset + 1; j < end && source[j]
            == target[k]; j++, k++);

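From target[k] we can tell that k is an index into target. It starts at 1 because the first character has already been matched. The loop condition is j < end && source[j] == target[k], so this loop simply keeps checking whether the following characters match as well. It ends in one of two ways: either source[j] != target[k], meaning some character did not match, or the loop runs to completion with j == end, meaning the whole target matched. On a mismatch, control returns to the outer for loop, which resumes looking for the first character of target in source. On a full match the method ends with return i - sourceOffset;, which returns the index of the first matched character.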
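The walk-through above can be condensed into a simplified sketch of the same loop with both offsets and fromIndex fixed at 0. This is a study aid under those assumptions, not the JDK implementation itself; the class name SimpleIndexOf is hypothetical.

```java
class SimpleIndexOf {
    // Simplified sketch of the loop discussed above, with sourceOffset,
    // targetOffset and fromIndex all assumed to be 0.
    static int indexOf(char[] source, char[] target) {
        if (target.length == 0) {
            return 0; // empty target matches at the start
        }
        char first = target[0];
        int max = source.length - target.length; // last possible start index

        for (int i = 0; i <= max; i++) {
            // Skip ahead to the next occurrence of the first target character.
            if (source[i] != first) {
                while (++i <= max && source[i] != first);
            }
            if (i <= max) {
                int j = i + 1;
                int end = j + target.length - 1; // i + target.length, exclusive
                // Compare the remaining characters; the loop body is empty.
                for (int k = 1; j < end && source[j] == target[k]; j++, k++);
                if (j == end) {
                    return i; // every target character matched
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[][] cases = {{"123", "12"}, {"aaab", "aab"}, {"strs", "tr"}, {"abc", "cd"}};
        for (String[] c : cases) {
            int mine = indexOf(c[0].toCharArray(), c[1].toCharArray());
            // Print both results side by side for comparison.
            System.out.println(mine + " " + c[0].indexOf(c[1]));
        }
    }
}
```

With the input "aaab" and target "aab", the first attempt at i = 0 fails on the third character, the outer loop moves to i = 1, and the second attempt runs to j == end.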
Based on the JDK source, my own method can be improved as follows:
for (int i = 0; i < sourceChars.length; i++) => for (int i = 0; i <= sourceChars.length - targetChars.length; i++)
That is, tighten the termination condition of the outer for loop.
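One possible way to apply that change is sketched below. It is an assumption about how the revised method could look, not the author's final code: the class name ContainImproved is hypothetical, and the length pre-checks are dropped because the tightened bound already covers them.

```java
class ContainImproved {
    // Sketch of the hand-written method with the tightened outer bound.
    static boolean contain(String source, String target) {
        if (source == null || target == null) {
            return false;
        }
        char[] sourceChars = source.toCharArray();
        char[] targetChars = target.toCharArray();
        // Last index where a match can start; negative when the target is
        // longer than the source, so the loop body is skipped entirely.
        int max = sourceChars.length - targetChars.length;

        for (int i = 0; i <= max; i++) {
            int j = 0;
            // i <= max guarantees i + j stays inside sourceChars.
            while (j < targetChars.length && sourceChars[i + j] == targetChars[j]) {
                j++;
            }
            if (j == targetChars.length) {
                return true; // all target characters matched at position i
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(contain("strs", "tr")); // true
        System.out.println(contain("strs", "rt")); // false
    }
}
```

A side effect of this shape is that an empty target yields true, which agrees with String.contains.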
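In my view the overall approach is much the same, but the JDK code is noticeably more concise. That is well worth learning from, and also hard to master. Let's keep at it together.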