bzoj3940 [Usaco2015 Feb]Censoring


License: CC 4.0 BY-SA

Original link: http://www.cnblogs.com/aziint/p/8469407.html

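This article describes how to delete specific words from a text using an Aho-Corasick automaton combined with a stack. The automaton is built first so that the censored words can be located in the text efficiently. A stack then records the positions of the characters kept so far, and whenever a censored word is matched, its characters are popped off the stack. This ensures that censored words newly formed by earlier deletions are also handled correctly.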


Description

Farmer John has purchased a subscription to Good Hooveskeeping magazine for his cows, so they have plenty of material to read while waiting around in the barn during milking sessions. Unfortunately, the latest issue contains a rather inappropriate article on how to cook the perfect steak, which FJ would rather his cows not see (clearly, the magazine is in need of better editorial oversight).

FJ has taken all of the text from the magazine to create the string \(S\) of length at most \(10^5\) characters. He has a list of censored words \(t_1 \cdots t_N\) that he wishes to delete from \(S\). To do so, Farmer John finds the earliest occurrence of a censored word in \(S\) (having the earliest start index) and removes that instance of the word from \(S\). He then repeats the process, deleting the earliest occurrence of a censored word from \(S\), until there are no more occurrences of censored words in \(S\). Note that the deletion of one censored word might create a new occurrence of a censored word that didn't exist before.

Farmer John notes that the censored words have the property that no censored word appears as a substring of another censored word. In particular, this means the censored word with the earliest start index in \(S\) is uniquely defined. Please help \(\mathrm{FJ}\) determine the final contents of \(S\) after censoring is complete.


Input

The first line will contain \(S\). The second line will contain \(N\), the number of censored words. The next \(N\) lines contain the strings \(t_1 \cdots t_N\). Each string will contain only lowercase letters (in the range \(\mathrm{a} \ldots \mathrm{z}\)), and the combined lengths of all these strings will be at most \(10^5\).


Output

The string \(S\) after all deletions are complete. It is guaranteed that \(S\) will not become empty during the deletion process.

Sample

Sample Input

begintheescapexecutionatthebreakofdawn
2
escape
execution

Sample Output

beginthatthebreakofdawn
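
To make the deletion rule concrete, here is a minimal brute-force sketch that simulates the statement literally: find the occurrence with the smallest start index, erase it, and repeat. The function name `censorNaive` is hypothetical and not part of the original post. The sketch is far too slow for the full constraints and is only meant as a reference for checking small cases such as the sample.

```cpp
#include <string>
#include <vector>

// Hypothetical reference helper (not the author's solution): simulates the
// statement directly by repeatedly erasing the censored word whose occurrence
// starts earliest in s.
std::string censorNaive(std::string s, const std::vector<std::string>& words) {
    while (true) {
        std::size_t bestPos = std::string::npos;  // earliest start index found so far
        std::size_t bestLen = 0;                  // length of the word found there
        for (const std::string& w : words) {
            if (w.empty()) continue;              // ignore empty words defensively
            std::size_t pos = s.find(w);          // first occurrence of w, or npos
            if (pos < bestPos) {
                bestPos = pos;
                bestLen = w.size();
            }
        }
        if (bestPos == std::string::npos) break;  // no censored word left
        s.erase(bestPos, bestLen);
    }
    return s;
}
```

On the sample, deleting `escape` first leaves `beginthexecutionatthebreakofdawn`, in which `execution` newly appears. Deleting it gives the expected output.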

Solution

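Aho-Corasick (AC) automaton + stack. Whenever a censored word is matched, pop its characters off the stack and move the automaton pointer back to the state recorded for the new stack top.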

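#include <bits/stdc++.h>
using namespace std;

#define N 100011
#define rep(i, a, b) for (int i = a; i <= b; i++)

// ch: trie / automaton transitions, fail: failure links, q: BFS queue.
// endTag[u]: index of the word ending at node u (0 if none), len[i]: length of word i.
// stk: indices of the characters of S that are still kept.
// loc[i]: automaton state right after reading str[i].
int q[N], ch[N][26], fail[N], endTag[N], tot, now, n, loc[N], m, len[N], stk[N], top;
char str[N], s[N];

int main() {
    scanf("%s%d", str, &m); n = strlen(str);
    // Insert every censored word into the trie.
    rep(i, 1, m) {
        scanf("%s", s); len[i] = strlen(s); now = 0;
        rep(j, 0, len[i] - 1) {
            if (!ch[now][s[j] - 'a']) ch[now][s[j] - 'a'] = ++tot;
            now = ch[now][s[j] - 'a'];
        }
        endTag[now] = i;
    }
    // BFS: compute failure links and fill in the missing transitions.
    int l = 0, r = -1;
    rep(i, 0, 25) if (ch[0][i]) q[++r] = ch[0][i];
    for (; l <= r; l++) rep(i, 0, 25)
        if (!ch[q[l]][i]) { ch[q[l]][i] = ch[fail[q[l]]][i]; continue; }
        else fail[ch[q[l]][i]] = ch[fail[q[l]]][i], q[++r] = ch[q[l]][i];
    // Scan S. Since no word is a substring of another, a match can only
    // show up as the current node itself, so checking endTag[t] is enough.
    now = 0;
    rep(i, 0, n - 1) {
        stk[++top] = i;
        int t = ch[now][str[i] - 'a'];
        if (t && endTag[t]) {
            top -= len[endTag[t]];            // pop the matched word
            now = top ? loc[stk[top]] : 0;    // empty stack: restart from the root
            continue;
        }
        loc[i] = now = t;
    }
    rep(i, 1, top) putchar(str[stk[i]]);
    return 0;
}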
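
The same idea can also be sketched as a standalone function, which is easier to test on small inputs. This is a sketch under stated assumptions, not the original author's code: the name `censor` and the use of `std::vector`/`std::string` are choices made here for illustration. It stores one automaton state per kept character plus the root state at the bottom, so popping a word automatically exposes the state to resume from, even when the stack becomes empty. It relies on the problem's guarantee that no censored word is a substring of another. Without that guarantee, matches reachable through failure links would also have to be checked.

```cpp
#include <array>
#include <queue>
#include <string>
#include <vector>

// Hypothetical function-style variant of the automaton + stack approach.
// Assumes lowercase input and that no word is a substring of another word.
std::string censor(const std::string& text, const std::vector<std::string>& words) {
    std::vector<std::array<int, 26>> next(1);  // node 0 is the root
    next[0].fill(0);
    std::vector<int> wordLen(1, 0);            // length of the word ending at a node, 0 if none

    // Build the trie.
    for (const std::string& w : words) {
        int u = 0;
        for (char c : w) {
            int k = c - 'a';
            if (!next[u][k]) {
                next[u][k] = static_cast<int>(next.size());
                next.push_back(std::array<int, 26>{});  // new node, all transitions 0
                wordLen.push_back(0);
            }
            u = next[u][k];
        }
        wordLen[u] = static_cast<int>(w.size());
    }

    // BFS: compute failure links and complete the transition table.
    std::vector<int> fail(next.size(), 0);
    std::queue<int> bfs;
    for (int k = 0; k < 26; ++k)
        if (next[0][k]) bfs.push(next[0][k]);
    while (!bfs.empty()) {
        int u = bfs.front();
        bfs.pop();
        for (int k = 0; k < 26; ++k) {
            int v = next[u][k];
            if (v) { fail[v] = next[fail[u]][k]; bfs.push(v); }
            else   { next[u][k] = next[fail[u]][k]; }
        }
    }

    // Scan the text. state[d] is the automaton state after the first d kept characters.
    std::string kept;
    std::vector<int> state(1, 0);
    for (char c : text) {
        int t = next[state.back()][c - 'a'];
        kept.push_back(c);
        state.push_back(t);
        if (wordLen[t]) {  // a censored word ends here: pop it from both stacks
            kept.resize(kept.size() - wordLen[t]);
            state.resize(state.size() - wordLen[t]);
        }
    }
    return kept;
}
```

For example, `censor("abb", {"ab"})` pops `ab`, leaving an empty stack, and then continues from the root state with the remaining `b`. Both building and scanning take time linear in the input sizes, with a factor of 26 for the transition table.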

Reposted from: https://www.cnblogs.com/aziint/p/8469407.html