[Hashing: String Matching + Simulated Stack] SCU - 4438: Censor (Hashing Explained in Detail, Hahaha)


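Preface: I did not pay attention when hashing was taught, and I never reviewed it afterwards. This was the first problem I opened in the ranking contest, and my brute-force string solution got TLE right away. I actually looked at hashing yesterday, but I was too restless, so a topic that is really quite simple just would not sink in. (So I gave up and went to watch 爱5 instead, lol.) Making up for it today.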

String matching with hashing

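We map the characters 'a' - 'z' to 1-26, so a string is like a "big number" written in some suitable base, and its hash value is the sum of its digits weighted by position. Initialization (with s indexed from 1): Hash[0] = 0; Hash[i] = Hash[i - 1] * base + s[i] - 'a' + 1; // so Hash[i] is really the hash of the prefix of length i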

To avoid rehashing a substring from scratch every time, the hash of a range can be read off the prefix hashes: $Hash[l, r] = Hash[r] - Hash[l - 1] * base^{r - l + 1}$

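Here base is the radix we picked, usually a prime larger than the alphabet size (the code in this post uses 131). The powers $base^{r - l + 1}$ can be precomputed and stored in an array p[ ], with $p[r - l + 1] = base^{r - l + 1}$, so each range query during the scan is a constant-time lookup.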
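The two arrays described above can be sketched as a small helper. This is only an illustration under the post's conventions (lowercase input, 1-indexed positions, base 131, natural overflow of unsigned long long); the struct name PrefixHash and the method name range are made up for this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

typedef unsigned long long ull;

// Prefix hashes of a lowercase string, 1-indexed as in the formulas above.
// PrefixHash and range() are hypothetical names used only for this sketch.
struct PrefixHash {
    static const ull base = 131;   // same base as the solution in this post
    std::vector<ull> Hash, p;      // Hash[i]: hash of s[1..i]; p[i] = base^i

    explicit PrefixHash(const std::string &s)
        : Hash(s.size() + 1, 0), p(s.size() + 1, 1) {
        for (size_t i = 1; i <= s.size(); ++i) {
            Hash[i] = Hash[i - 1] * base + (ull)(s[i - 1] - 'a' + 1);
            p[i] = p[i - 1] * base;
        }
    }

    // Hash of s[l..r] (1-indexed, inclusive): Hash[r] - Hash[l-1] * base^(r-l+1).
    ull range(size_t l, size_t r) const {
        assert(1 <= l && l <= r && r < Hash.size());
        return Hash[r] - Hash[l - 1] * p[r - l + 1];
    }
};
```

For example, with PrefixHash h("abcabcd"), the two copies of "abc" give h.range(1, 3) == h.range(4, 6), while h.range(2, 4) (the substring "bca") differs.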

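To make the formula easier to digest, here is an example:

Say we have the decimal number 211
Hash[0] = 0
Hash[1] = 2
Hash[2] = 2 * 10 + 1 = 21
Hash[3] = (2 * 10 + 1) * 10 + 1 = 211
So what is Hash[2, 3]?
Hash[2, 3] = Hash[3] - Hash[1] * 10^2 = 211 - 2 * 100 = 11

Conclusion: computing the hash of a range works the same way, just with base in place of 10.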
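To see why the subtraction works for any base and any range, here is a short derivation. The shorthand $c_i$ for the value of the i-th character (s[i] - 'a' + 1, with s indexed from 1) is notation introduced just for this derivation.

$$Hash[r] = \sum_{i=1}^{r} c_i \cdot base^{r - i}$$

$$Hash[l - 1] \cdot base^{r - l + 1} = \sum_{i=1}^{l-1} c_i \cdot base^{(l - 1 - i) + (r - l + 1)} = \sum_{i=1}^{l-1} c_i \cdot base^{r - i}$$

$$Hash[r] - Hash[l - 1] \cdot base^{r - l + 1} = \sum_{i=l}^{r} c_i \cdot base^{r - i}$$

The last sum is the hash you would get by hashing s[l..r] on its own, so the terms contributed by the first l - 1 characters cancel exactly, just like the leading 2 in the decimal example.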

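We store Hash[ ] as unsigned long long, so if a value overflows it is automatically reduced modulo $2^{64}$. Two different strings can therefore end up with the same hash (a hash collision), but on ordinary inputs the probability is extremely low. (As for how to deal with hash collisions, a beginner like me naturally has no idea. My mentor told me to go search Baidu... I will worry about it when a problem forces me to. lol)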
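One common safeguard, which the solution in this post does not use, is to confirm a hash hit by comparing the actual characters. The sketch below assumes the stack lives in a char buffer buf with top valid characters; the function name sameAsTarget and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstring>

typedef unsigned long long ull;

// Returns true only if the hashes agree AND the characters really match.
// buf/top: current stack contents; tar/len: the censored word.
// sameAsTarget is a hypothetical helper, not part of the original solution.
bool sameAsTarget(const char *buf, size_t top, const char *tar, size_t len,
                  ull windowHash, ull tarHash) {
    assert(len > 0);
    if (top < len || windowHash != tarHash) return false;   // cheap rejection
    return std::memcmp(buf + top - len, tar, len) == 0;     // rules out a collision
}
```

The character comparison only runs on a hash hit. Every genuine hit removes len characters from the stack, so the comparisons on genuine hits cost time linear in the text length overall, and false hits should be rare.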

SCU - 4438: Censor

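Problem: we are given a censored word tar and a text s. The text must not contain the censored word; whenever it shows up, delete it, qwq. (Deleting an occurrence in the middle may join the parts before and after it into a new occurrence, which must be deleted as well.) What string is left in the end? It may be empty.
Idea: first compute the hash of the censored word. Then scan the text and push each character onto a stack. Once the stack holds at least len_tar characters, compute the hash of the len_tar characters ending at the top; if it equals the hash of the censored word, delete that substring (which just means moving the stack-top pointer back by len_tar). Because Hash[ ] is indexed by stack height, the prefix hashes below the new top are still valid after a pop, so nothing has to be recomputed.
See the code for details. [One more trick: when printing, write a '\0' right after the last valid character and print the buffer directly with %s.]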

#include <cstdio>
#include <cstring>
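using namespace std;
typedef unsigned long long ull;
const int maxN = 5e6 + 5;
const ull base = 131;                 // hash base
ull Hash[maxN], p[maxN];              // Hash[k]: hash of the first k characters on the stack; p[k] = base^k
char s[maxN], tar[maxN], ans[maxN];   // ans doubles as the stack
// Precompute the powers of base (values wrap modulo 2^64).
void pre()
{
    p[0] = 1;
    for(int i = 1; i < maxN; i ++ )
        p[i] = p[i - 1] * base;
}
int main()
{
    pre();
    while(scanf("%s%s", tar, s) == 2)
    {
        int len_tar = strlen(tar);
        int len_s = strlen(s);
        // A text shorter than the censored word cannot contain it.
        if(len_s < len_tar)
        {
            printf("%s\n", s);
            continue;
        }
        // Hash of the censored word.
        ull tar_hash = 0;
        for(int i = 0; i < len_tar; i ++ )
            tar_hash = tar_hash * base + tar[i] - 'a' + 1;
        Hash[0] = 0;
        int top = 0;                  // number of characters currently on the stack
        for(int i = 0; i < len_s; i ++ )
        {
            ans[top ++ ] = s[i];      // push the character
            Hash[top] = Hash[top - 1] * base + s[i] - 'a' + 1;
            // Compare the hash of the top len_tar characters with tar_hash.
            if(top >= len_tar && Hash[top] - Hash[top - len_tar] * p[len_tar] == tar_hash)
                top -= len_tar;       // pop the whole occurrence
        }
        ans[top] = '\0';              // terminate after the last valid character
        printf("%s\n", ans);
    }
    return 0;
}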
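For cross-checking the hash solution on small inputs, a brute-force reference can help. The sketch below is in the spirit of the string attempt mentioned in the preface; that attempt is not shown in the post, so its exact shape here is an assumption, and the name censorBrute is made up. Repeatedly erasing the leftmost occurrence gives the same result as the stack scan, because all occurrences have the same length, so the leftmost one is also the first one to be completed during a left-to-right scan.

```cpp
#include <cassert>
#include <string>

// Brute-force reference: repeatedly erase the leftmost occurrence of tar.
// Far too slow for the real limits in the worst case; meant for small tests only.
// censorBrute is a hypothetical name used only for this sketch.
std::string censorBrute(const std::string &tar, std::string s) {
    assert(!tar.empty());
    std::string::size_type pos;
    while ((pos = s.find(tar)) != std::string::npos)
        s.erase(pos, tar.size());
    return s;
}
```

For instance, censorBrute("abc", "aaabcbc") goes through "aabc" and ends at "a".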

Here are three explanations of hashing, each with its own strengths:

https://www.cnblogs.com/zyf0163/p/4806951.html

https://www.cnblogs.com/Slager-Z/p/7807011.html

https://oi-wiki.org/string/hash/

KMP + simulated stack

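Simulate a stack whose entries hold two things: the position pos in the answer string, and num, the length of the longest suffix of s[0, pos] that matches a prefix of the pattern.

After a full KMP match, pop the matched characters by moving the stack top, set the pattern pointer j to the num stored at the new stack top (or to 0 if the stack is now empty), and keep matching.

The time limit here seems to be tight for string: using string's erase function got TLE, and using a stack container directly got TLE as well.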

#include <iostream>
#include <string>
using namespace std;
const int maxN = 5000006;
int Next[maxN], match[maxN];   // match[k]: pattern prefix length matched right after ans[k] was pushed
char ans[maxN];                // answer string, used as the stack
int cnt;                       // number of deleted occurrences
// Build the optimized Next table for pattern t of length len.
void GETNext(const string &t, int len) {
    int i = 0, j = -1;
    Next[0] = -1;
    while(i < len) {           // i < len keeps t[i] within the string after ++ i
        if(j == -1 || t[i] == t[j]) {
            ++ i; ++ j;
            if(t[i] != t[j]) Next[i] = j;
            else Next[i] = Next[j];
        } else {
            j = Next[j];
        }
    }
}
void kmp(const string &s, int sL, const string &t, int tL) {
    int i = 0, j = 0;
    GETNext(t, tL);
    int top = 0;
    while(i < sL) {
        if(j == -1 || s[i] == t[j]) {
            match[top] = ++ j;          // remember the matched length at this stack entry
            ans[top ++ ] = s[i ++];     // push the character
        }
        else j = Next[j];
        if(j == tL) {                   // full match: pop the occurrence
            ++ cnt;
            top -= tL;
            j = top ? match[top - 1] : 0;   // resume from the new stack top
        }
    }
    ans[top] = '\0';
}
int main() {
    string t, s;
    while(cin >> t >> s) {
        cnt = 0;
        kmp(s, s.length(), t, t.length());
        cout << ans << endl;
    }
    return 0;
}
/*
abc
baabcbc
c
ccc
abcdefg
ab

abc
aababccdaabcbc
 */
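The same stack idea can also be sketched with the textbook prefix function instead of the optimized Next table. This is a swapped-in variant for illustration, not the code above: the name censorKmp, the use of a std::string as the character stack, and the pi array are all choices made for this sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// KMP + stack using the textbook prefix function (not the optimized Next table).
// pi[k] = length of the longest proper border of t[0..k].
// censorKmp is a hypothetical name used only for this sketch.
std::string censorKmp(const std::string &t, const std::string &s) {
    assert(!t.empty());
    const int m = (int)t.size();
    std::vector<int> pi(m, 0);
    for (int k = 1, j = 0; k < m; ++k) {
        while (j > 0 && t[k] != t[j]) j = pi[j - 1];
        if (t[k] == t[j]) ++j;
        pi[k] = j;
    }
    std::string ans;          // used as the character stack
    std::vector<int> match;   // match[k]: matched prefix length right after ans[k]
    for (size_t i = 0; i < s.size(); ++i) {
        int j = match.empty() ? 0 : match.back();   // resume from the stack top
        while (j > 0 && s[i] != t[j]) j = pi[j - 1];
        if (s[i] == t[j]) ++j;
        ans.push_back(s[i]);
        match.push_back(j);
        if (j == m) {                                // full match: pop m entries
            ans.resize(ans.size() - m);
            match.resize(match.size() - m);
        }
    }
    return ans;
}
```

On the test data listed in the comment above, this sketch returns "b" for abc / baabcbc, the empty string for c / ccc, "ab" for abcdefg / ab, and "ad" for abc / aababccdaabcbc.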