AC Automaton (Counting How Many Pattern Strings Occur): HDU 3695

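This post describes how to solve a computer virus scanning problem with an AC (Aho-Corasick) automaton. Each virus contributes one pattern string, and the automaton matches all patterns against the program in a single pass to determine which viruses infect it. The post walks through the input and output formats and the samples, and gives a complete AC automaton implementation.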



Computer Virus on Planet Pandora

Time Limit: 6000/2000 MS (Java/Others)    Memory Limit: 256000/128000 K (Java/Others)
Total Submission(s): 2828    Accepted Submission(s): 788


Problem Description
Aliens on planet Pandora also write computer programs like us. Their programs only consist of capital letters (‘A’ to ‘Z’) which they learned from the Earth. On planet Pandora, hackers make computer virus, so they also have anti-virus software. Of course they learned virus scanning algorithm from the Earth. Every virus has a pattern string which consists of only capital letters. If a virus’s pattern string is a substring of a program, or the pattern string is a substring of the reverse of that program, they can say the program is infected by that virus. Give you a program and a list of virus pattern strings, please write a program to figure out how many viruses the program is infected by.
 

Input
There are multiple test cases. The first line in the input is an integer T ( T<= 10) indicating the number of test cases.

For each test case:

The first line is a integer n( 0 < n <= 250) indicating the number of virus pattern strings.

Then n lines follows, each represents a virus pattern string. Every pattern string stands for a virus. It’s guaranteed that those n pattern strings are all different so there are n different viruses. The length of pattern string is no more than 1,000 and a pattern string at least consists of one letter.

The last line of a test case is the program. The program may be described in a compressed format. A compressed program consists of capital letters and “compressors”. A “compressor” is in the following format:

[qx]

q is a number( 0 < q <= 5,000,000)and x is a capital letter. It means q consecutive letter xs in the original uncompressed program. For example, [6K] means ‘KKKKKK’ in the original program. So, if a compressed program is like:

AB[2D]E[7K]G

It actually is ABDDEKKKKKKKG after decompressed to original format.

The length of the program is at least 1 and at most 5,100,000, no matter in the compressed format or after it is decompressed to original format.
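
The compressed format described above can be expanded in one left-to-right pass. The following is only a minimal sketch; `decompress` is a hypothetical helper name and it assumes well-formed input of the form `[qx]`. For the full input size, a preallocated `char` buffer may be a better fit than `std::string`.

```cpp
#include <cctype>
#include <string>

// Expand a compressed program such as "AB[2D]E[7K]G" into "ABDDEKKKKKKKG".
// `decompress` is a hypothetical helper, assuming well-formed "[qx]" groups.
std::string decompress(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '[') {
            // Parse the repeat count q that follows '['.
            std::size_t q = 0;
            ++i;
            while (i < in.size() &&
                   std::isdigit(static_cast<unsigned char>(in[i]))) {
                q = q * 10 + static_cast<std::size_t>(in[i] - '0');
                ++i;
            }
            // in[i] is now the letter x; append it q times.
            if (i < in.size()) out.append(q, in[i]);
            ++i;  // step onto ']'; the loop increment then moves past it
        } else {
            out.push_back(in[i]);  // plain letter, copied as is
        }
    }
    return out;
}
```

Calling `decompress("AB[2D]E[7K]G")` yields `ABDDEKKKKKKKG`, matching the example in the statement.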
 

Output
For each test case, print an integer K in a line meaning that the program is infected by K viruses.
 

Sample Input
      
3
2
AB
DCB
DACB
3
ABC
CDE
GHI
ABCCDEFIHG
4
ABB
ACDEE
BBB
FEEE
A[2B]CD[4E]F
 

Sample Output
      
0
3
2

Hint
In the second case in the sample input, the reverse of the program is ‘GHIFEDCCBA’, and ‘GHI’ is a substring of the reverse, so the program is infected by virus ‘GHI’.
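
The infection rule (a pattern occurs in the program or in its reverse) can be checked directly with a brute-force reference, which is handy for validating a faster solution on small inputs. This is a sketch only; `countInfectionsNaive` is a hypothetical name, and the approach is far too slow for the real limits.

```cpp
#include <string>
#include <vector>

// Count how many patterns occur in `program` or in its reverse.
// Brute force via std::string::find; intended for small test inputs only.
int countInfectionsNaive(const std::vector<std::string>& patterns,
                         const std::string& program) {
    // Build the reversed program once.
    std::string reversed(program.rbegin(), program.rend());
    int count = 0;
    for (const std::string& p : patterns) {
        if (program.find(p) != std::string::npos ||
            reversed.find(p) != std::string::npos) {
            ++count;  // each pattern is counted at most once
        }
    }
    return count;
}
```

On the three (decompressed) sample cases this returns 0, 3 and 2.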


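Build an AC automaton over all the pattern strings, then match the program (and its reverse) against it. I got TLE many times and only passed after various optimizations. The main thing I still do not understand is why Liu Rujia's `last` pointers turned out slower than walking the `fail` pointers. I have tried them before, and they should be faster.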

#include<cstdio>
#include<cstring>
#include<queue>
#include<algorithm>
using namespace std;
const int maxn=250010;
const int SIGMA_SIZE=26;
char s[5100010],str[5100010];
int N;
struct AC
{
    int ch[maxn][SIGMA_SIZE],val[maxn];
    int fail[maxn];
    int Q[maxn];
    int sz;
    void clear(){memset(ch[0],0,sizeof(ch[0]));sz=1;}
    void insert(char *s)
    {
        int u=0;
        for(int i=0;s[i];i++)
        {
            int c=s[i]-'A';
            if(!ch[u][c])
            {
                memset(ch[sz],0,sizeof(ch[sz]));
                val[sz]=0;
                ch[u][c]=sz++;
            }
            u=ch[u][c];
        }
        val[u]=1;
    }
    void getfail()
    {
        queue<int> q;
        int u=0;
        for(int i=0;i<SIGMA_SIZE;i++)
        {
            u=ch[0][i];
            if(u){fail[u]=0;q.push(u);}
        }
        while(!q.empty())
        {
            int r=q.front();q.pop();
            for(int c=0;c<SIGMA_SIZE;c++)
            {
                u=ch[r][c];
                if(!u){ch[r][c]=ch[fail[r]][c];continue;}
                q.push(u);
                int v=fail[r];
                // Logical AND: stop at the root (v==0) or at a node that has a c-transition.
                while(v&&!ch[v][c])v=fail[v];
                fail[u]=ch[v][c];
            }
        }
    }
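    // Count the patterns that occur in s[0..n). Each pattern is counted at most
    // once over all calls, because visited nodes are marked with val=-1.
    int find(char *s,int &n)
    {
        int u=0,ans=0;
        for(int i=0;i<n;i++)
        {
            int c=s[i]-'A';
            u=ch[u][c];
            int tmp=u;
            // Walk the fail chain until the root or an already visited node.
            // Stopping at val==0 instead would miss a pattern that is a proper
            // suffix of a non-terminal node (e.g. "BC" inside the prefix "ABC").
            while(tmp&&val[tmp]!=-1)
            {
                ans+=val[tmp];
                val[tmp]=-1;
                tmp=fail[tmp];
            }
        }
        return ans;
    }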
}ac;
int main()
{
    int T;
    scanf("%d",&T);
    while(T--)
    {
        scanf("%d",&N);
        ac.clear();
        for(int i=1;i<=N;i++)
        {
            scanf("%s",s);
            ac.insert(s);
        }
        ac.getfail();
        scanf("%s",s);
        int sum=0,cnt=0,len=strlen(s);
        for(int i=0;i<len;i++)
        {
            if(s[i]>='A'&&s[i]<='Z')
            {
                if(i>0&&(s[i-1]>='0'&&s[i-1]<='9'))
                {
                    for(int j=0;j<sum;j++)str[cnt++]=s[i];
                    sum=0;
                    continue;
                }
                str[cnt++]=s[i];
            }
            if(s[i]>='0'&&s[i]<='9')sum=sum*10+s[i]-'0';
        }
        str[cnt]='\0';
        int ans=ac.find(str,cnt);
        for(int i=0,j=cnt-1;i<j;i++,j--)swap(str[i],str[j]);
        ans+=ac.find(str,cnt);
        printf("%d\n",ans);
    }
    return 0;
}
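
For comparison with the `fail`-walking `find` above, here is a rough sketch of the `last` variant mentioned earlier, where `last[u]` points to the nearest node on the fail chain that ends a pattern. The names `AhoCorasickLast`, `countDistinct`, `endsHere` and `seen` are illustrative assumptions, not the original author's code, and the vector-based layout favors clarity over the speed this problem needs.

```cpp
#include <queue>
#include <string>
#include <vector>

const int kSigma = 26;  // alphabet 'A'..'Z'

// Sketch of an Aho-Corasick automaton with `last` (output) links.
struct AhoCorasickLast {
    std::vector<std::vector<int>> next;  // goto table, completed in build()
    std::vector<int> fail, last, endsHere;
    std::vector<bool> seen;              // pattern node already counted

    AhoCorasickLast() { newNode(); }     // node 0 is the root

    int newNode() {
        next.push_back(std::vector<int>(kSigma, 0));
        fail.push_back(0);
        last.push_back(0);
        endsHere.push_back(0);
        seen.push_back(false);
        return static_cast<int>(next.size()) - 1;
    }

    void insert(const std::string& s) {
        int u = 0;
        for (char ch : s) {
            int c = ch - 'A';
            if (!next[u][c]) {
                int v = newNode();  // may reallocate, so index next[] afterwards
                next[u][c] = v;
            }
            u = next[u][c];
        }
        endsHere[u] = 1;
    }

    void build() {
        std::queue<int> q;
        for (int c = 0; c < kSigma; ++c)
            if (next[0][c]) q.push(next[0][c]);  // depth 1: fail = last = 0
        while (!q.empty()) {
            int r = q.front();
            q.pop();
            for (int c = 0; c < kSigma; ++c) {
                int u = next[r][c];
                if (!u) {  // missing edge: reuse the transition of fail[r]
                    next[r][c] = next[fail[r]][c];
                    continue;
                }
                fail[u] = next[fail[r]][c];
                // Nearest node on the fail chain of u that ends a pattern.
                last[u] = endsHere[fail[u]] ? fail[u] : last[fail[u]];
                q.push(u);
            }
        }
    }

    // Number of not yet counted patterns occurring in `text`.
    int countDistinct(const std::string& text) {
        int u = 0, found = 0;
        for (char ch : text) {
            u = next[u][ch - 'A'];
            int t = endsHere[u] ? u : last[u];
            while (t && !seen[t]) {  // a seen node implies its chain is seen
                seen[t] = true;
                ++found;
                t = last[t];
            }
        }
        return found;
    }
};
```

To apply it to this problem, one would insert every pattern, call `build()`, and add the results of `countDistinct` on the decompressed program and on its reverse. Whether it beats the `fail` walk in practice likely depends on the data, since both variants visit each node at most once once visited nodes are marked.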




