kuangbin带你飞 (kuangbin Takes You Flying): KMP Topic

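This post works through several problems on the KMP algorithm. Problem A applies the KMP template to array matching and finds the smallest matching index; problem B counts the occurrences of a given string in a text; problem C computes the maximum number of small decorative strips that can be cut from a patterned cloth strip; problem I finds the longest string that is both a prefix of one string and a suffix of the other.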

A - Number Sequence

Given two sequences of numbers : a[1], a[2], … , a[N], and b[1], b[2], … , b[M] (1 <= M <= 10000, 1 <= N <= 1000000). Your task is to find a number K which make a[K] = b[1], a[K + 1] = b[2], … , a[K + M - 1] = b[M]. If there are more than one K exist, output the smallest one.

Input

The first line of input is a number T which indicate the number of cases. Each case contains three lines. The first line is two numbers N and M (1 <= M <= 10000, 1 <= N <= 1000000). The second line contains N integers which indicate a[1], a[2], … , a[N]. The third line contains M integers which indicate b[1], b[2], … , b[M]. All integers are in the range of [-1000000, 1000000].

Output

For each test case, you should output one line which only contain K described above. If no such K exists, output -1 instead.

Sample Input

2
13 5
1 2 1 2 3 1 2 3 1 3 2 1 2
1 2 3 1 3
13 5
1 2 1 2 3 1 2 3 1 3 2 1 2
1 2 3 2 1

Sample Output

6
-1

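This problem replaces the two strings in the KMP template with two integer arrays; the principle is unchanged.
The task is to output the smallest matching index. The kmp loop can end in two ways:
1. A match is found: j equals len2 and the loop breaks.
2. The end of s1 is reached without a successful match: j is then necessarily not len2.
So checking whether j equals len2 after the loop is enough to tell whether the match succeeded.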

#include <stdio.h>
using namespace std;
const int N = 1e6;
int Next[N + 5];
int s1[N + 5], s2[N + 5];
int len1, len2;

void get_next()
{
    int i, j;
    i = 0;
    Next[0] = j = -1; // the first Next value is -1
    while(i < len2) {
        if(j == -1 || s2[i] == s2[j]) Next[++i] = ++j; // the prefix and suffix both extend by one element
        else j = Next[j]; // otherwise fall back
    }
}


void kmp()
{
    int i, j;
    i = j = 0;
    while(i < len1) {
        if(j == -1 || s1[i] == s2[j]) ++i, ++j;
        else j = Next[j];
        if(j == len2){
            printf("%d\n",i - len2 + 1);
            break;
    	}
	}
    if(j!=len2)
    	printf("-1\n");
}


int main()
{
	int t;
	scanf("%d",&t);
	while(t--)
	{
		scanf("%d%d",&len1,&len2);
	    for(int i=0;i<len1;i++)
	    	scanf("%d",&s1[i]);
	    for(int i=0;i<len2;i++)
	    	scanf("%d",&s2[i]);
	    get_next();
	    kmp(); 
	}
    return 0;
}
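
The exit check described above can also be packaged as a function that returns the index instead of printing it. The following is a minimal sketch, not the solution used in this post: it assumes 0-based `std::vector<int>` inputs and a 0-based failure table, and the names `buildFailure` and `firstMatch` are made up for illustration.

```cpp
#include <vector>

// fail[i] = length of the longest proper prefix of p[0..i]
// that is also a suffix of p[0..i].
std::vector<int> buildFailure(const std::vector<int>& p) {
    const int m = static_cast<int>(p.size());
    std::vector<int> fail(m, 0);
    for (int i = 1, k = 0; i < m; ++i) {
        while (k > 0 && p[i] != p[k]) k = fail[k - 1];  // fall back
        if (p[i] == p[k]) ++k;                          // extend the border
        fail[i] = k;
    }
    return fail;
}

// Returns the smallest 1-based K such that a[K..K+M-1] equals b,
// or -1 if b does not occur in a.
int firstMatch(const std::vector<int>& a, const std::vector<int>& b) {
    const int n = static_cast<int>(a.size());
    const int m = static_cast<int>(b.size());
    if (m == 0 || m > n) return -1;
    const std::vector<int> fail = buildFailure(b);
    for (int i = 0, k = 0; i < n; ++i) {
        while (k > 0 && a[i] != b[k]) k = fail[k - 1];
        if (a[i] == b[k]) ++k;
        if (k == m) return i - m + 2;  // 0-based start (i - m + 1), plus 1
    }
    return -1;  // reached the end of a without a full match
}
```

On the two sample cases, `firstMatch` returns 6 and -1 respectively. Returning from inside the loop plays the role of the `break`, and the final `return -1` corresponds to the "j is not len2" case.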

Code from my second attempt

#include <bits/stdc++.h>
using namespace std;
const int N = 1e6 + 5; 

int a[N], b[N];
int Next[N], f[N]; // Next: a matched against itself; f: a matched against b
int n, m;
// Compute the Next array
void get_next()
{
	Next[1] = 0;
	for (int i = 2, j = 0; i <= n; i++) {
		while (j > 0 && a[i] != a[j + 1]) j = Next[j];
		if (a[i] == a[j + 1]) j++;
		Next[i] = j;
	}
}

void kmp()
{
	for (int i = 1, j = 0; i <= m; i++) {
		while (j > 0 && (j == n || b[i] != a[j + 1])) j = Next[j];
		if (b[i] == a[j + 1]) j++;
		f[i] = j;
		// if (f[i] == n), an occurrence of a in b ends at position i
	}
}

int main(void)
{
	int t, flag;
	scanf("%d", &t);
	while (t--) {
		scanf("%d%d", &m, &n);
		for (int i = 1; i <= m; i++)
			scanf("%d", &b[i]);
		for (int i = 1; i <= n; i++)
			scanf("%d", &a[i]);
		get_next();
		kmp();
		flag = 0;
		for (int i = 1; i <= m; i++) {
			if (f[i] == n) {
				flag = 1;
				cout << i - n + 1 << endl;
				break;
			}
		}
		if (flag == 0)
			cout << "-1" << endl;
	}
	
	return 0;
}

B - Oulipo

The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter ‘e’. He was a member of the Oulipo group. A quote from the book:

Tout avait Pair normal, mais tout s’affirmait faux. Tout avait Fair normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : stir son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…

Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T’s is not unusual. And they never use spaces.

So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {‘A’, ‘B’, ‘C’, …, ‘Z’} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.

Input

The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:

One line with the word W, a string over {‘A’, ‘B’, ‘C’, …, ‘Z’}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
One line with the text T, a string over {‘A’, ‘B’, ‘C’, …, ‘Z’}, with |W| ≤ |T| ≤ 1,000,000.

Output

For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.

Sample Input

3
BAPC
BAPC
AZA
AZAZAZA
VERDI
AVERDXIVYERDIAN

Sample Output

1
3
0
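
In this problem the shorter string (the word W) comes first in the input and the longer one (the text T) comes second, whereas the template treats s1 as the text and s2 as the pattern. The two strings and their lengths therefore need to be swapped.

The required output is the number of times the shorter string occurs. Occurrences may overlap, so after each successful match the search continues with j = Next[j], as in the original template, and a variable ans counts the successful matches.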

#include <stdio.h>
#include<string.h>
using namespace std;
const int N = 1e6;
int Next[N + 5];
char s1[N + 5], s2[N + 5];
char temp[N + 5];
int len1, len2;
int ans;

void get_next()
{
    int i, j;
    i = 0;
    Next[0] = j = -1;
    while(i < len2) {
        if(j == -1 || s2[i] == s2[j]) Next[++i] = ++j;
        else j = Next[j];
    }
}


void kmp()
{
	ans=0;
    int i, j;
    i = j = 0;
    while(i < len1) {
        if(j == -1 || s1[i] == s2[j]) ++i, ++j;
        else j = Next[j];
        if(j == len2) {
        	ans++;    
        	j=Next[j];
        }
    }
   	printf("%d\n",ans);
}


int main()
{
	int t,n;
	scanf("%d",&t);
	while(t--) 
    {
	    scanf("%s", s1);
	    scanf("%s", s2);
	    len1 = strlen(s1), len2 = strlen(s2);
	    if(len1<len2)
	    {
	    	strcpy(temp,s2);
			strcpy(s2,s1);
			strcpy(s1,temp);
			n=len2;
			len2=len1;
			len1=n;
	    }
	    get_next();
	    kmp();
	}
    return 0;
}
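
The "continue from the fallback position after a match" step is what makes overlapping occurrences count. A compact sketch of that idea follows, under the assumption of 0-based `std::string` inputs; the function name `countOverlapping` is hypothetical and the failure table is the 0-based variant rather than the Next array used above.

```cpp
#include <string>
#include <vector>

// Counts occurrences of word in text, allowing overlaps.
int countOverlapping(const std::string& text, const std::string& word) {
    const int n = static_cast<int>(text.size());
    const int m = static_cast<int>(word.size());
    if (m == 0 || m > n) return 0;

    // fail[i] = longest proper border length of word[0..i]
    std::vector<int> fail(m, 0);
    for (int i = 1, k = 0; i < m; ++i) {
        while (k > 0 && word[i] != word[k]) k = fail[k - 1];
        if (word[i] == word[k]) ++k;
        fail[i] = k;
    }

    int count = 0;
    for (int i = 0, k = 0; i < n; ++i) {
        while (k > 0 && text[i] != word[k]) k = fail[k - 1];
        if (text[i] == word[k]) ++k;
        if (k == m) {
            ++count;
            k = fail[m - 1];  // keep the longest border so overlaps are found
        }
    }
    return count;
}
```

For the sample pair AZA / AZAZAZA this gives 3, because each match keeps the trailing "A" as the start of the next candidate.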

Code from my second attempt

#include <bits/stdc++.h>
using namespace std;
const int N = 1e6 + 5; 

char a[N], b[N], c[N];
int Next[N], f[N]; // Next: a matched against itself; f: a matched against b
int n, m;
// Compute the Next array
void get_next()
{
	Next[1] = 0;
	for (int i = 2, j = 0; i <= n; i++) {
		while (j > 0 && a[i] != a[j + 1]) j = Next[j];
		if (a[i] == a[j + 1]) j++;
		Next[i] = j;
	}
}

void kmp()
{
	for (int i = 1, j = 0; i <= m; i++) {
		while (j > 0 && (j == n || b[i] != a[j + 1])) j = Next[j];
		if (b[i] == a[j + 1]) j++;
		f[i] = j;
		// if (f[i] == n), an occurrence of a in b ends at position i
	}
}

int main(void)
{
	int t;
	cin >> t;
	while (t--) {
		scanf("%s", a + 1); // a is the shorter string
		scanf("%s", b + 1);
		n = strlen(a + 1);
		m = strlen(b + 1);
		if (n > m) {
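			// The strings are stored from index 1, so copy from offset 1;
			// a[0] and b[0] are '\0', and copying from offset 0 would copy empty strings.
			strcpy(c + 1, a + 1);
			strcpy(a + 1, b + 1);
			strcpy(b + 1, c + 1);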
			swap(n, m);
		}
		get_next();
		kmp();
		int res = 0;
		for (int i = 1; i <= m; i++) {
			if (f[i] == n) {
				res++;
			}
		} 
		cout << res << endl;
	}
	
	return 0;
}

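C - 剪花布条 (Cutting the Cloth Strip)

Given a patterned cloth strip and a ready-to-use small decorative strip that also carries a pattern, compute the maximum number of small decorative strips that can be cut from the cloth strip.

Input

The input contains several test cases, each a pair consisting of a cloth strip and a small decorative strip. Both are written in visible ASCII characters; there are as many possible pattern symbols as there are visible ASCII characters. Neither strip is longer than 1000 characters. Processing stops when the # character is encountered.

Output

Output the maximum number of small decorative strips that can be cut from the cloth strip; if none can be cut, output 0. Each result goes on its own line.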

Sample Input
abcde a3
aaaaaa aa

Sample Output
0
3

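This problem asks for the maximum number of copies of s2 that can be cut out, so the occurrences must not overlap. After each successful match, set j = 0 so that s2 is matched again from its beginning, instead of continuing from the corresponding Next[j].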

#include <stdio.h>
#include<string.h>
using namespace std;
const int N = 1e6;
int Next[N + 5];
char s1[N + 5], s2[N + 5];
char temp[N + 5];
int len1, len2;
int ans;

void get_next()
{
    int i, j;
    i = 0;
    Next[0] = j = -1;
    while(i < len2) {
        if(j == -1 || s2[i] == s2[j]) Next[++i] = ++j;
        else j = Next[j];
    }
}


void kmp()
{
	ans=0;
    int i, j;
    i = j = 0;
    while(i < len1) {
        if(j == -1 || s1[i] == s2[j]) ++i, ++j;
        else j = Next[j];
        if(j == len2) {
        	ans++;    
        	j=0; // after a full match, reset j to 0 and start matching again
        }
    }
   	printf("%d\n",ans);
}


int main()
{
	int t,n;
	while(true) 
    {
    	scanf("%s",s1);
    	if(strcmp(s1,"#")==0) break;
	    scanf("%s", s2);
	    len1 = strlen(s1), len2 = strlen(s2);
	    get_next();
	    kmp();
	}
    return 0;
}
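
As an independent cross-check of the "restart from the beginning after each match" rule, the same greedy count can be written with `std::string::find` instead of KMP. This is only a sketch with a hypothetical function name, `countCuts`; `find` is not guaranteed to run in linear time, which is acceptable here only because the strips are at most 1000 characters long.

```cpp
#include <cstddef>
#include <string>

// Counts non-overlapping occurrences of piece in cloth, scanning left to right
// and always cutting at the earliest possible position.
int countCuts(const std::string& cloth, const std::string& piece) {
    if (piece.empty()) return 0;
    int count = 0;
    std::size_t pos = cloth.find(piece);
    while (pos != std::string::npos) {
        ++count;
        // Resume the search after the piece just cut, so pieces never overlap.
        pos = cloth.find(piece, pos + piece.size());
    }
    return count;
}
```

For the samples, `countCuts("abcde", "a3")` is 0 and `countCuts("aaaaaa", "aa")` is 3. Cutting at the earliest possible position never reduces the total, which is why resetting j to 0 in the KMP version gives the maximum.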

Code from my second attempt

#include <bits/stdc++.h>
using namespace std;
const int N = 1e6 + 5; 

char a[N], b[N];
int Next[N], f[N]; // Next: a matched against itself; f: a matched against b
int n, m;
// Compute the Next array
void get_next()
{
	Next[1] = 0;
	for (int i = 2, j = 0; i <= n; i++) {
		while (j > 0 && a[i] != a[j + 1]) j = Next[j];
		if (a[i] == a[j + 1]) j++;
		Next[i] = j;
	}
}

void kmp()
{
	for (int i = 1, j = 0; i <= m; i++) {
		while (j > 0 && (j == n || b[i] != a[j + 1])) j = Next[j];
		if (b[i] == a[j + 1]) j++;
		f[i] = j;
		// if (f[i] == n), an occurrence of a in b ends at position i
	}
}

int main(void)
{
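	// Stop on end of input as well as on a lone "#"; scanf returns EOF (non-zero) at end of input.
	while (scanf("%s", b + 1) == 1 && strcmp(b + 1, "#") != 0) {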
		scanf("%s", a + 1);
		n = strlen(a + 1);
		m = strlen(b + 1);
		get_next();
		kmp();
		int last = 0, res = 0;
		for (int i = 1; i <= m; i++) {
			if (f[i] == n && i - n + 1 > last) {
				last = i;
				res++;
			}
		} 
		cout << res << endl;
	}
	
	return 0;
}

D - Cyclic Nacklace

Link: https://blog.youkuaiyun.com/weixin_43772166/article/details/89815466

E - Period

Link: https://blog.youkuaiyun.com/weixin_43772166/article/details/97313894

F - Power Strings

Link: https://blog.youkuaiyun.com/weixin_43772166/article/details/97397571

G - Seek the Name, Seek the Fame

Link: https://blog.youkuaiyun.com/weixin_43772166/article/details/108782141

I - Simpsons’ Hidden Talents

Homer: Marge, I just figured out a way to discover some of the talents we weren’t aware we had.
Marge: Yeah, what is it?
Homer: Take me for example. I want to find out if I have a talent in politics, OK?
Marge: OK.
Homer: So I take some politician’s name, say Clinton, and try to find the length of the longest prefix
in Clinton’s name that is a suffix in my name. That’s how close I am to being a politician like Clinton
Marge: Why on earth choose the longest prefix that is a suffix???
Homer: Well, our talents are deeply hidden within ourselves, Marge.
Marge: So how close are you?
Homer: 0!
Marge: I’m not surprised.
Homer: But you know, you must have some real math talent hidden deep in you.
Marge: How come?
Homer: Riemann and Marjorie gives 3!!!
Marge: Who the heck is Riemann?
Homer: Never mind.
Write a program that, when given strings s1 and s2, finds the longest prefix of s1 that is a suffix of s2.

Input

Input consists of two lines. The first line contains s1 and the second line contains s2. You may assume all letters are in lowercase.

Output

Output consists of a single line that contains the longest string that is a prefix of s1 and a suffix of s2, followed by the length of that prefix. If the longest such string is the empty string, then the output should be 0.
The lengths of s1 and s2 will be at most 50000.

Sample Input

clinton
homer
riemann
marjorie

Sample Output

0
rie 3

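Given two strings a and b, find the longest string that is both a prefix of a and a suffix of b.

By the definition of the f array (f[i] is the length of the longest prefix of a that is also a suffix of b[1..i]), f[len(b)] is the length of the required string.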

#include <bits/stdc++.h>
using namespace std;
const int N = 1e6 + 5; 

char a[N], b[N];
int Next[N], f[N]; // Next: a matched against itself; f: a matched against b
int n, m;
// Compute the Next array
void get_next()
{
	Next[1] = 0;
	for (int i = 2, j = 0; i <= n; i++) {
		while (j > 0 && a[i] != a[j + 1]) j = Next[j];
		if (a[i] == a[j + 1]) j++;
		Next[i] = j;
	}
}

void kmp()
{
	for (int i = 1, j = 0; i <= m; i++) {
		while (j > 0 && (j == n || b[i] != a[j + 1])) j = Next[j];
		if (b[i] == a[j + 1]) j++;
		f[i] = j;
		// if (f[i] == n), an occurrence of a in b ends at position i
	}
}

int main(void)
{
	while (scanf("%s", a + 1) != EOF) {
		scanf("%s", b + 1);
		n = strlen(a + 1);
		m = strlen(b + 1);
		get_next();
		kmp();
		int len = f[m];
		if (len == 0) {
			cout << "0" << endl;
		} else {
			for (int i = 1; i <= len; i++) {
				cout << a[i];
			}
			cout << " " << len << endl;
		}
	}
	
	return 0;
}
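
The same length can be obtained with a different, commonly used trick: join the two strings with a separator and take the last value of the prefix function. The sketch below is that alternative technique, not the f-array method used above. It assumes the separator '#' appears in neither string (the problem guarantees lowercase letters), and the name `longestPrefixSuffix` is made up for illustration.

```cpp
#include <string>
#include <vector>

// Length of the longest string that is a prefix of s1 and a suffix of s2.
// Assumes '#' occurs in neither s1 nor s2.
int longestPrefixSuffix(const std::string& s1, const std::string& s2) {
    const std::string joined = s1 + '#' + s2;
    const int len = static_cast<int>(joined.size());
    // pi[i] = longest proper border length of joined[0..i]
    std::vector<int> pi(len, 0);
    for (int i = 1, k = 0; i < len; ++i) {
        while (k > 0 && joined[i] != joined[k]) k = pi[k - 1];
        if (joined[i] == joined[k]) ++k;
        pi[i] = k;
    }
    // A border of the whole string cannot cross the separator, so it is
    // a prefix of s1 that is also a suffix of s2.
    return pi.back();
}
```

For riemann / marjorie the result is 3 (the string "rie", i.e. `s1.substr(0, 3)`), and for clinton / homer it is 0.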

J - Count the string

Link: https://blog.youkuaiyun.com/weixin_43772166/article/details/108784482
