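KMP is a string-matching algorithm that improves on naive matching: it finds the occurrences of a pattern string inside a text string. The template consists of two functions, a KMP function and a next function. Two easy problems are used below to introduce the template.
How are the values of the next function determined? This comes down to proper prefixes and proper suffixes.
- "A": the sets of proper prefixes and proper suffixes are both empty, so the longest common element has length 0;
- "AB": prefixes [A], suffixes [B], so the longest common element has length 0;
- "ABC": prefixes [A, AB], suffixes [BC, C], so the longest common element has length 0;
- "ABCD": prefixes [A, AB, ABC], suffixes [BCD, CD, D], so the longest common element has length 0;
- "ABCDA": prefixes [A, AB, ABC, ABCD], suffixes [BCDA, CDA, DA, A], the common element is "A", length 1;
- "ABCDAB": prefixes [A, AB, ABC, ABCD, ABCDA], suffixes [BCDAB, CDAB, DAB, AB, B], the common element is "AB", length 2;
- "ABCDABD": prefixes [A, AB, ABC, ABCD, ABCDA, ABCDAB], suffixes [BCDABD, CDABD, DABD, ABD, BD, D], so the longest common element has length 0.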
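PMT (partial match table) for "ABCDABD": 0 0 0 0 1 2 0
The next array is the PMT shifted right by one position, with -1 placed in the first slot:
next: -1 0 0 0 0 1 2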
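The table above can be reproduced directly from the definition. The following is a minimal sketch, not the template used for the problems: `bruteForcePmt` and `nextFromPmt` are hypothetical helper names, and the brute-force comparison is deliberately slow so that it mirrors the prefix and suffix lists one to one.

```cpp
#include <cassert>
#include <string>
#include <vector>

// For every prefix s[0..i], find the length of the longest string that is
// both a proper prefix and a proper suffix of it (straight from the definition).
std::vector<int> bruteForcePmt(const std::string& s) {
    std::vector<int> pmt(s.size(), 0);
    for (std::size_t i = 0; i < s.size(); ++i) {
        const std::string cur = s.substr(0, i + 1);
        // Try every proper length, longest first.
        for (std::size_t len = i; len >= 1; --len) {
            if (cur.compare(0, len, cur, cur.size() - len, len) == 0) {
                pmt[i] = static_cast<int>(len);
                break;
            }
        }
    }
    return pmt;
}

// Shift the PMT right by one slot and put -1 in the first slot.
std::vector<int> nextFromPmt(const std::vector<int>& pmt) {
    assert(!pmt.empty());
    std::vector<int> next(pmt.size());
    next[0] = -1;
    for (std::size_t i = 1; i < pmt.size(); ++i) next[i] = pmt[i - 1];
    return next;
}
```

For "ABCDABD", `bruteForcePmt` gives 0 0 0 0 1 2 0 and `nextFromPmt` gives -1 0 0 0 0 1 2, matching the two rows above.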
Description
Given two sequences of numbers : a[1], a[2], ...... , a[N], and b[1], b[2], ...... , b[M] (1 <= M <= 10000, 1 <= N <= 1000000). Your task is to find a number K which make a[K] = b[1], a[K + 1] = b[2], ...... , a[K + M - 1] = b[M]. If there are more than one K exist, output the smallest one.
Input
The first line of input is a number T which indicate the number of cases. Each case contains three lines. The first line is two numbers N and M (1 <= M <= 10000, 1 <= N <= 1000000). The second line contains N integers which indicate a[1], a[2], ...... , a[N]. The third line contains M integers which indicate b[1], b[2], ...... , b[M]. All integers are in the range of [-1000000, 1000000].
Output
For each test case, you should output one line which only contain K described above. If no such K exists, output -1 instead.
Sample Input
2
13 5
1 2 1 2 3 1 2 3 1 3 2 1 2
1 2 3 1 3
13 5
1 2 1 2 3 1 2 3 1 3 2 1 2
1 2 3 2 1
Sample Output
6
-1
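In short: given a text sequence a and a pattern sequence b, find the pattern in the text and output the position of the first match, or -1 if there is no match.
The first input line is the number of cases T. Each case has three lines: N and M (1 <= M <= 10000, 1 <= N <= 1000000), then the N integers a[1], a[2], ..., a[N], then the M integers b[1], b[2], ..., b[M]. All integers are in the range [-1000000, 1000000].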
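#include <cstdio>

int a[1000010], b[10010];
int nextn[10010];

// Build the next array for the pattern b of length m.
// nextn[q] is the length of the longest proper prefix of b[0..q-1]
// that is also a suffix of it; nextn[0] is -1.
void makeNext(const int b[], int nextn[], int m)
{
    int q = 0, k = -1;
    nextn[0] = -1;
    while (q < m)
    {
        if (k == -1 || b[q] == b[k])
        {
            q++, k++;
            nextn[q] = k;
        }
        else
            k = nextn[k];
    }
}

// Return the 1-based position of the first occurrence of b (length m)
// in a (length n), or -1 if b does not occur.
int KMP(const int b[], const int a[], int m, int n, const int nextn[])
{
    int i = 0;  // position in the text a
    int j = 0;  // position in the pattern b
    while (i < n)
    {
        if (j == -1 || a[i] == b[j])
        {
            i++;
            j++;
        }
        else
            j = nextn[j];
        if (j == m)
            return i - m + 1;  // the match ends at i - 1, so it starts at i - m
    }
    return -1;
}

int main()
{
    int t, n, m;
    scanf("%d", &t);
    while (t--)
    {
        // The sequences are integers, so they are read as numbers, not as text lines.
        scanf("%d%d", &n, &m);
        for (int i = 0; i < n; i++)
            scanf("%d", &a[i]);
        for (int i = 0; i < m; i++)
            scanf("%d", &b[i]);
        makeNext(b, nextn, m);
        printf("%d\n", KMP(b, a, m, n, nextn));
    }
    return 0;
}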
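When testing a KMP solution for the Number Sequence problem, it can help to have a slow but obviously correct reference to compare against on small inputs. The following is a minimal sketch under that assumption, using the standard `std::search`; `firstMatchReference` is a hypothetical helper name and is not part of the template.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Reference answer for Number Sequence: the smallest 1-based K such that
// a[K..K+M-1] equals b[1..M], or -1 if there is no such K.
int firstMatchReference(const std::vector<int>& a, const std::vector<int>& b) {
    assert(!b.empty());  // the problem guarantees M >= 1
    auto it = std::search(a.begin(), a.end(), b.begin(), b.end());
    if (it == a.end()) return -1;
    return static_cast<int>(it - a.begin()) + 1;  // convert to a 1-based index
}
```

On the two sample cases this returns 6 and -1, so it can be used to cross-check the KMP output on randomly generated small sequences.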
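The second problem: given a word W and a text T, count the number of times W occurs in T. Occurrences may overlap, as the second sample shows.
The first input line gives the number of test cases. Each test case has one line with the word W (uppercase letters 'A' to 'Z', 1 ≤ |W| ≤ 10,000) followed by one line with the text T (uppercase letters, |W| ≤ |T| ≤ 1,000,000). The original statement follows.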
Input
The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:
- One line with the word W, a string over {'A', 'B', 'C', …, 'Z'}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
- One line with the text T, a string over {'A', 'B', 'C', …, 'Z'}, with |W| ≤ |T| ≤ 1,000,000.
Output
For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.
Sample Input
3
BAPC
BAPC
AZA
AZAZAZA
VERDI
AVERDXIVYERDIAN
Sample Output
1
3
0
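#include <cstdio>
#include <cstring>
#include <iostream>
using namespace std;

// Build the next array for the pattern b of length m.
// nextn[q] is the length of the longest proper prefix of b[0..q-1]
// that is also a suffix of it; nextn[0] is -1.
void makeNext(const char b[], int nextn[], int m)
{
    int q = 0, k = -1;
    nextn[0] = -1;
    while (q < m)
    {
        if (k == -1 || b[q] == b[k])
        {
            q++, k++;
            nextn[q] = k;
        }
        else
            k = nextn[k];
    }
}

// Count the occurrences of b (length m) in a (length n), overlaps included.
int KMP(const char b[], const char a[], int m, int n, const int nextn[])
{
    int d = 0;  // number of matches found so far
    int i = 0;  // position in the text a
    int j = 0;  // position in the pattern b
    while (i < n)
    {
        if (j == -1 || a[i] == b[j])
        {
            i++;
            j++;
        }
        else
            j = nextn[j];
        if (j == m)
        {
            d++;
            j = nextn[j];  // fall back instead of restarting, so overlapping matches count
        }
    }
    return d;
}

char a[1000010], b[10010];
int nextn[10010];

int main()
{
    // Do not call ios::sync_with_stdio(false) here: cin is mixed with C stdio
    // (getchar, fgets) below, which is only reliable while the streams stay synchronized.
    int t, n, m, h;
    // scanf("%d", &t); reading t with cin or with scanf makes no difference
    cin >> t;
    getchar();  // consume the newline left after t, otherwise the first line read is empty
    while (t--)
    {
        // fgets replaces gets, which is unsafe and was removed in C++14.
        fgets(b, sizeof(b), stdin);
        fgets(a, sizeof(a), stdin);
        b[strcspn(b, "\r\n")] = '\0';  // strip the trailing line break
        a[strcspn(a, "\r\n")] = '\0';
        m = (int)strlen(b);
        n = (int)strlen(a);
        makeNext(b, nextn, m);
        h = KMP(b, a, m, n, nextn);
        cout << h << endl;
    }
    return 0;
}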
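The step `j = nextn[j]` after a full match is what lets the count include overlapping occurrences, as in the second sample, where AZA occurs 3 times in AZAZAZA. The following is a minimal sketch of the same loop on `std::string`, so that it can be called on small cases without reading input; `buildNext` and `countOccurrences` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// next[q] = length of the longest proper prefix of w[0..q-1] that is also
// its suffix; next[0] = -1. The array has w.size() + 1 entries.
std::vector<int> buildNext(const std::string& w) {
    std::vector<int> next(w.size() + 1);
    int q = 0, k = -1;
    next[0] = -1;
    while (q < static_cast<int>(w.size())) {
        if (k == -1 || w[q] == w[k]) {
            ++q;
            ++k;
            next[q] = k;
        } else {
            k = next[k];
        }
    }
    return next;
}

// Count the occurrences of w in t, overlaps included.
int countOccurrences(const std::string& t, const std::string& w) {
    assert(!w.empty());
    const std::vector<int> next = buildNext(w);
    const int n = static_cast<int>(t.size());
    const int m = static_cast<int>(w.size());
    int count = 0, i = 0, j = 0;
    while (i < n) {
        if (j == -1 || t[i] == w[j]) {
            ++i;
            ++j;
        } else {
            j = next[j];
        }
        if (j == m) {
            ++count;
            j = next[j];  // keep the longest border, so the next match may overlap this one
        }
    }
    return count;
}
```

For AZA the last entry of the next array is 1, so after each match the search resumes with the final "A" already matched. Resetting `j` to 0 instead would skip every match that overlaps the one just found.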