[hihocoder1015]KMP

License: CC 4.0 BY-SA

Tags: #kmp #hihocoder

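This post introduces the basic principle and implementation of the KMP algorithm, covering the core steps of preprocessing the pattern, computing the next array, and pattern matching, and provides a complete C++ implementation.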

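This post only gives a brief definition of KMP and an implementation; for the details, refer to the official hihocoder site.

Problem overview

Given a pattern string (named temp in the code) and a source string (named str in the code), count how many times temp occurs in str. Occurrences may overlap.
See the official hihocoder site for the full statement; it also has a detailed introduction to KMP.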
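To make "occurrences may overlap" concrete, here is a minimal brute-force sketch that tries every start position. The helper name count_naive is hypothetical and not part of the original solution; it assumes a non-empty pattern and is only meant as a slow reference to compare a KMP implementation against.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical reference helper (not from the original post): counts the
// occurrences of `pattern` in `text` by testing every start position.
// Overlapping occurrences are counted. Worst-case time is O(len(text) * len(pattern)).
int count_naive(const std::string& text, const std::string& pattern) {
    assert(!pattern.empty());  // assumption: the pattern has at least one character
    if (pattern.size() > text.size()) return 0;
    int times = 0;
    for (std::size_t start = 0; start + pattern.size() <= text.size(); ++start) {
        // compare() returns 0 when the window text[start : start + len(pattern)] equals pattern
        if (text.compare(start, pattern.size(), pattern) == 0) ++times;
    }
    return times;
}
```

For example, count_naive("aaaa", "aa") returns 3, because the matches starting at positions 0, 1 and 2 overlap.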

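Approach

Preprocess the pattern string so that fewer comparisons are repeated. The time complexity is O(len(temp)+len(str)); the detailed proof is left for further study.
First process the pattern to obtain next, using f as temporary storage along the way. Then use next to perform the matching.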

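1. Computing next

1.1. Definitions of next and f

f[i] is defined as:
f[i] = max{ j : 0 <= j < i, temp[0:j] == temp[i-j:i] }, with f[0] = -1
next[i] is defined as:
next[i] = max{ j : 0 <= j < i, temp[0:j] == temp[i-j:i] && temp[j] != temp[i] }, or -1 if no such j exists
Note: slices follow the half-open C++ convention, i.e. array[i:j] includes index i and excludes index j. In words, f[i] is the length of the longest proper prefix of temp[0:i] that is also a suffix of it, and next[i] additionally requires that the character following that prefix differs from temp[i]. For i == len(temp), temp[i] is the terminating '\0'.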
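As a worked example of the two definitions, the sketch below evaluates them literally by trying every j from largest to smallest. The helper names f_by_definition and next_by_definition are hypothetical, and the code is deliberately slow; it is only meant for checking small cases by hand, not as the method used in the solution.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: f[i] is the largest j < i with temp[0:j] == temp[i-j:i].
// f[0] stays at -1 by convention.
std::vector<int> f_by_definition(const std::string& temp) {
    const int n = static_cast<int>(temp.size());
    std::vector<int> f(n + 1, -1);
    for (int i = 1; i <= n; ++i) {
        for (int j = i - 1; j >= 0; --j) {
            // compare(pos1, len1, str, pos2, len2) == 0 means the two slices are equal
            if (temp.compare(0, j, temp, i - j, j) == 0) { f[i] = j; break; }
        }
    }
    return f;
}

// Hypothetical helper: same search, with the extra condition temp[j] != temp[i].
// For i == n, temp[i] is the '\0' that std::string exposes at index size().
std::vector<int> next_by_definition(const std::string& temp) {
    const int n = static_cast<int>(temp.size());
    std::vector<int> next(n + 1, -1);
    for (int i = 1; i <= n; ++i) {
        for (int j = i - 1; j >= 0; --j) {
            if (temp.compare(0, j, temp, i - j, j) == 0 && temp[j] != temp[i]) {
                next[i] = j;
                break;
            }
        }
    }
    return next;
}
```

For temp = "abab" this gives f = {-1, 0, 0, 1, 2} and next = {-1, 0, -1, 0, 2}. next[2] is -1 because the only candidate j = 0 has temp[0] == temp[2] == 'a'.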

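1.2. Computing f by dynamic programming

From the definition of f, a dynamic-programming recurrence follows directly.

To make this easier to follow, first define a function that the final code does not actually use, int compute_j(int i, char character, char *temp). It computes f[i] under the assumption that the character at index i-1 of temp (the last character of temp[0:i]) is character.
1. How it is called (only meant to help in understanding compute_j; skip it if that is already clear)
f[0] = -1; for (int i = 1; i < temp_len + 1; ++i) f[i] = compute_j(i, temp[i-1], temp);
2. Implementation of compute_j
int compute_j(int i, char character, char *temp){ int j = f[i-1]; if (j == -1) return 0; if (temp[j] == character) return j + 1; else return compute_j(j + 1, character, temp); }

Expanding the recursion in compute_j into a while loop gives compute_next: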
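#include <cstring>
#include <iostream>
using namespace std;

// Fills f[0..len] and next[0..len] for the pattern temp and stores len in next_len.
void compute_next(int *next,int &next_len,int *f,char *temp){

next_len = strlen(temp);
f[0] = -1;
next[0] = -1;

// f[i]: extend the border of temp[0:i-1] by temp[i-1]; on a mismatch fall back through f.
int tmp_i;
for (int i = 1;i <= next_len;++i){
    tmp_i = f[i-1];
    while (tmp_i > -1 && temp[i-1] != temp[tmp_i])
        tmp_i = f[tmp_i];
    f[i] = tmp_i+1;
}

// next[i]: start from f[i] and skip candidates whose following character equals temp[i].
// For i == next_len, temp[i] is the terminating '\0', so next[next_len] == f[next_len].
for (int i = 1;i <= next_len;++i){
    tmp_i = f[i];
    while (tmp_i > -1 && temp[i] == temp[tmp_i])
        tmp_i = next[tmp_i];
    next[i] = tmp_i;
}

}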
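For experimenting with the two passes outside the judge setting, the same computation can be sketched with std::string and std::vector. The struct Tables and the function build_tables are hypothetical names introduced only for this illustration; the logic mirrors the two loops of compute_next and assumes the pattern contains no embedded '\0'.

```cpp
#include <string>
#include <vector>

// Hypothetical container for the two tables; both have temp.size() + 1 entries.
struct Tables {
    std::vector<int> f;
    std::vector<int> next;
};

// Sketch of the same two-pass computation as compute_next, on std::string input.
Tables build_tables(const std::string& temp) {
    const int n = static_cast<int>(temp.size());
    Tables t{std::vector<int>(n + 1, -1), std::vector<int>(n + 1, -1)};

    // Pass 1: f[i] from f[i-1], falling back through f while temp[i-1] does not extend the border.
    for (int i = 1; i <= n; ++i) {
        int j = t.f[i - 1];
        while (j > -1 && temp[i - 1] != temp[j]) j = t.f[j];
        t.f[i] = j + 1;
    }

    // Pass 2: next[i] from f[i], skipping candidates followed by the same character as temp[i].
    // temp[n] reads the '\0' that std::string exposes at index size().
    for (int i = 1; i <= n; ++i) {
        int j = t.f[i];
        while (j > -1 && temp[i] == temp[j]) j = t.next[j];
        t.next[i] = j;
    }
    return t;
}
```

For the pattern "abab", build_tables returns f = {-1, 0, 0, 1, 2} and next = {-1, 0, -1, 0, 2}.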

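2. Pattern-matching function

With next defined this way, the matching function is straightforward to write. Two variables, cur and index, hold the current position in the source string and in the pattern respectively; matching proceeds by moving cur and index.
int cal_times(int *next,int &next_len,char *temp,char *str){

int len = strlen(str);
int cur = 0,index = 0; // cur indexes str; index indexes temp
int times = 0;

// cur - index is the start of the current alignment; stop once the pattern no longer fits.
// The cur < len check prevents reading past the end of str after a match ending at its last character.
while (cur < len && cur - index < len - next_len + 1){
    // On a mismatch fall back through next; index == -1 means no prefix of temp matches at cur.
    // After a full match temp[index] is '\0', which differs from str[cur], so index falls back to next[next_len].
    while (index > -1 && temp[index] != str[cur])
        index = next[index];
    ++index;
    ++cur;
    if (index == next_len)
        ++times;
}
return times;

}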
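The movement of cur and index can be traced on a small input with the following sketch. The function count_with_next is a hypothetical std::string variant of the matching loop; it takes a precomputed next table with temp.size() + 1 entries as a parameter, assumes a non-empty pattern, and handles the "just matched" state with an explicit index == m test instead of relying on the terminating '\0'.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: counts (possibly overlapping) occurrences of temp in str,
// given a next table that follows the definition of next[i] in section 1.1.
int count_with_next(const std::string& str, const std::string& temp,
                    const std::vector<int>& next) {
    const int len = static_cast<int>(str.size());
    const int m = static_cast<int>(temp.size());
    int cur = 0, index = 0, times = 0;
    // cur - index is the start of the current alignment of temp against str.
    while (cur < len && cur - index <= len - m) {
        // index == m means a full match was just counted; treat it like a mismatch
        // so that index falls back to next[m] and overlapping matches are found.
        while (index > -1 && (index == m || temp[index] != str[cur]))
            index = next[index];
        ++index;
        ++cur;
        if (index == m) ++times;
    }
    return times;
}
```

With the table {-1, 0, -1, 0, 2} for "abab", count_with_next("ababab", "abab", next) returns 2: after the first match ends at position 3, index falls back to next[4] = 2, so the already matched "ab" is reused and the second match completes at position 5.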

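3. Main function

This function only handles input and output: it reads N test cases, each consisting of a pattern followed by a source string.
int main(){

int N;
// static keeps the large buffers off the stack (str alone is about 1 MB)
static int next[10002],f[10002];
int next_len;
static char temp[10002] = "0",str[1000002] = "";

cin >> N;
for (int i = 0;i < N;++i){
    cin >> temp;
    compute_next(next,next_len,f,temp);
    cin >> str;
    cout << cal_times(next,next_len,temp,str) << endl;
}
return 0;

}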