A Simple Implementation of the KMP Algorithm



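I have spent most of my energy on web application development and on learning, applying, and understanding frameworks and architecture, areas where algorithms almost never show up. As a result I have always been in the awkward position of not knowing where to start when it comes to building up this kind of "inner strength". That does not mean I will never touch algorithms again and only learn the API libraries the JDK has already wrapped up. Today, while using String's indexOf(...), I suddenly wanted to see how the method is implemented, so...
The main code of the implementation in JDK 1.6.0_17: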

for (int i = sourceOffset + fromIndex; i <= max; i++) {
    /* Look for first character. */
    if (source[i] != first) {
        while (++i <= max && source[i] != first);
    }

    /* Found first character, now look at the rest of v2 */
    if (i <= max) {
        int j = i + 1;
        int end = j + targetCount - 1;
        for (int k = targetOffset + 1; j < end && source[j] ==
                 target[k]; j++, k++);

        if (j == end) {
            /* Found whole string. */
            return i - sourceOffset;
        }
    }
}
return -1;

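This algorithm still uses simple (brute-force) matching: it first finds a character equal to the first character of the target, then creates a new pointer just after it for the second character and compares the rest. In effect, whenever a comparison fails, the i pointer backs up to just past the position where the attempt started. The worst-case time complexity is therefore the product of the two string lengths, O(n*m) for a source of length n and a target of length m. Maybe the authors assumed that long strings would rarely be matched and so did not optimize it, or maybe I have misunderstood the algorithm, heh.
Since I do not think this algorithm performs well on longer strings, I had to take a look at the KMP pattern-matching algorithm that data structures courses praise so highly. I will not show off by explaining the principle; very capable people have already left more than enough excellent explanations online. Here is the Java implementation I wrote after understanding the principle: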
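
To make the worst case concrete, here is a minimal sketch of plain brute-force matching with a comparison counter. It is not the JDK code: NaiveMatchDemo, naiveIndexOf and comparisons are names made up for this illustration, and the loop is the textbook version rather than the first-character scan shown above. It should still show the same back-up behavior.

```java
class NaiveMatchDemo {

    /** Number of character comparisons made by the most recent call (illustration only). */
    static long comparisons;

    /** Brute-force search: try every start position and compare left to right. */
    static int naiveIndexOf(String source, String target) {
        comparisons = 0;
        int n = source.length();
        int m = target.length();
        for (int start = 0; start + m <= n; start++) {
            int k = 0;
            while (k < m) {
                comparisons++;
                if (source.charAt(start + k) != target.charAt(k)) {
                    break; // mismatch: give up on this start and back up to start + 1
                }
                k++;
            }
            if (k == m) {
                return start; // every character of the target matched
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // A long run of 'a' against a pattern that only fails on its last character.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append('a');
        }
        int index = naiveIndexOf(sb.toString(), "aaaaaaaaab");
        System.out.println("index = " + index + ", comparisons = " + comparisons);
    }
}
```

With 1000 'a' characters and a pattern of nine 'a' followed by 'b', each of the 991 start positions costs 10 comparisons before failing, 9910 in total, which is close to n*m.
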

public class KmpTest {

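	/**
	 * Builds the next table of the KMP algorithm (the optimized variant that
	 * skips a fallback to an identical character).
	 * Each entry holds the fallback pattern index plus one; 0 means there is
	 * no fallback and the search moves on in the text.
	 * @param chars the pattern characters
	 * @return the next table, with the same length as chars
	 */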
	private int[] createNext(char[] chars) {
		int len = chars.length;
		int[] nextArr = new int[len];
		int i = 0, j = -1;
		while (i < len - 1) {
			if ((j == -1) || (chars[j] == chars[i])) {
				i++;
				j++;
				if (chars[j] != chars[i]) {
					nextArr[i] = j + 1;
				} else {
					nextArr[i] = nextArr[j];
				}
			} else {
				j = nextArr[j] - 1;
			}
		}
		return nextArr;
	}

	/**
	 * Finds the index of the first occurrence of a pattern in a string,
	 * searching from the first character of the string.
	 * @param str the string to search in
	 * @param pattern the pattern to search for
	 * @return the zero-based index of the first match, or -1 if there is none
	 */
	public int kmpIndex(String str, String pattern) {
		return kmpIndex(str, pattern, 0);
	}

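	/**
	 * Finds the index of the first occurrence of a pattern in a string,
	 * searching from a specific position.
	 * @param str the string to search in
	 * @param pattern the pattern to search for
	 * @param pos the zero-based position to start from; must not be negative
	 * @return the zero-based index of the first match at or after pos, or -1 if there is none
	 */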
	public int kmpIndex(String str, String pattern, int pos) {
		char[] strChars = str.toCharArray();
		char[] ptenChars = pattern.toCharArray();
		int len = strChars.length;
		int[] next = createNext(ptenChars);
		int i = pos - 1, j = -1;
		while (i < len && j < next.length) {
			if ((j == -1) || (strChars[i] == ptenChars[j])) {
				i++;
				j++;
			} else {
				j = next[j] - 1;
			}
		}
		if (j > next.length - 1) {
			return i - next.length;
		} else {
			return -1;
		}
	}

}
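
The table convention in createNext is easy to misread, so here is a small sketch that prints it for two patterns. NextTableDemo is a made-up class name; its createNext is a copy of the logic above, made static only so it can be called directly. As far as I can tell, nextArr[j] holds one plus the pattern index to resume from after a mismatch at position j, and 0 means nothing can be reused, so the text pointer advances and the pattern restarts from index 0.

```java
import java.util.Arrays;

class NextTableDemo {

    /** Copy of the createNext logic above, static for direct use in this illustration. */
    static int[] createNext(char[] chars) {
        int len = chars.length;
        int[] nextArr = new int[len];
        int i = 0, j = -1;
        while (i < len - 1) {
            if (j == -1 || chars[j] == chars[i]) {
                i++;
                j++;
                // If the characters are equal, falling back to j would fail in the
                // same way, so reuse the fallback already stored for j.
                nextArr[i] = (chars[j] != chars[i]) ? j + 1 : nextArr[j];
            } else {
                j = nextArr[j] - 1;
            }
        }
        return nextArr;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(createNext("ababa".toCharArray()))); // [0, 1, 0, 1, 0]
        System.out.println(Arrays.toString(createNext("abaab".toCharArray()))); // [0, 1, 0, 2, 1]
    }
}
```

For "abaab", a mismatch at index 3 gives 2, so matching resumes at pattern index 1. A mismatch at index 4 gives 1, so matching resumes at index 0, because index 1 holds the same 'b' that just failed.
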

Finally, a simple test case.

import junit.framework.Assert;
import junit.framework.TestCase;

public class KmpTestCase extends TestCase {

	KmpTest kmpTest;

	@Override
	protected void setUp() throws Exception {
		kmpTest = new KmpTest();
	}

	public void testKmpIndex() {
		Assert.assertEquals(0, kmpTest.kmpIndex("ababaababc", "ababa"));
		Assert.assertEquals(2, kmpTest.kmpIndex("ababaababc", "abaab"));
		Assert.assertEquals(6, kmpTest.kmpIndex("ababaababc", "babc"));
		Assert.assertEquals(-1, kmpTest.kmpIndex("ababaababc", "baabc"));

		Assert.assertEquals(2, kmpTest.kmpIndex("ababaababc", "abaa", 2));
		Assert.assertEquals(-1, kmpTest.kmpIndex("ababaababc", "abaa", 3));
	}

}
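
A handful of hand-picked assertions can miss edge cases, so one more option is to cross-check against String.indexOf(String, int) on random input. The sketch below is only an assumption about how that could look: KmpCrossCheck, kmpIndexOf, countMismatches and randomString are made-up names, and kmpIndexOf is a stand-in written with the common 0-based failure table rather than the next table used above. To check the class from this article instead, the call could be swapped for new KmpTest().kmpIndex(text, pattern, from).

```java
import java.util.Random;

class KmpCrossCheck {

    /** Stand-in KMP search using the common 0-based failure table; from must not be negative. */
    static int kmpIndexOf(String text, String pattern, int from) {
        int n = text.length();
        int m = pattern.length();
        if (m == 0) {
            return from <= n ? from : -1;
        }
        // fail[i] = length of the longest proper prefix of pattern[0..i] that is also its suffix
        int[] fail = new int[m];
        for (int i = 1, k = 0; i < m; i++) {
            while (k > 0 && pattern.charAt(i) != pattern.charAt(k)) {
                k = fail[k - 1];
            }
            if (pattern.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            fail[i] = k;
        }
        // Scan the text once; i never moves backwards.
        for (int i = from, k = 0; i < n; i++) {
            while (k > 0 && text.charAt(i) != pattern.charAt(k)) {
                k = fail[k - 1];
            }
            if (text.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            if (k == m) {
                return i - m + 1;
            }
        }
        return -1;
    }

    /** Random string over the two-letter alphabet {a, b}, so that repeats are frequent. */
    static String randomString(Random random, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + random.nextInt(2)));
        }
        return sb.toString();
    }

    /** Counts the rounds in which the KMP result differs from String.indexOf. */
    static int countMismatches(long seed, int rounds) {
        Random random = new Random(seed);
        int mismatches = 0;
        for (int r = 0; r < rounds; r++) {
            String text = randomString(random, random.nextInt(30));
            String pattern = randomString(random, 1 + random.nextInt(4));
            int from = random.nextInt(text.length() + 1);
            if (kmpIndexOf(text, pattern, from) != text.indexOf(pattern, from)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches = " + countMismatches(42L, 10000));
    }
}
```

A small alphabet is used on purpose: with only 'a' and 'b', partial matches and fallbacks happen all the time, which is where a wrong table entry would show up.
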