HDU 6068 Classic Quotation (2017 Multi-University Training Contest 4) - CSDN Blog

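This post describes how to solve the Classic Quotation matching problem with the KMP algorithm, including how to preprocess the strings so that matching is fast and how to compute the match counts for different substrings by recurrence.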

Classic Quotation

Time Limit: 8000/4000 MS (Java/Others) Memory Limit: 524288/524288 K (Java/Others)
Total Submission(s): 416 Accepted Submission(s): 139

Problem Description

When online chatting, we can save what somebody said to form his "Classic Quotation". Little Q does this, too. What's more? He even changes the original words. Formally, we can assume what somebody said as a string $S$ whose length is $n$. He will choose a continuous substring of $S$ (or choose nothing), and remove it, then merge the remain parts into a complete one without changing order, marked as $S'$. For example, he might remove "not" from the string "I am not SB.", so that the new string $S'$ will be "I am SB.", which makes it funnier.

After doing lots of such things, Little Q finds out that string $T$ occurs as a continuous substring of $S'$ very often.

Now given strings $S$ and $T$, Little Q has $k$ questions. Each question is, given $L$ and $R$, Little Q will remove a substring so that the remain parts are $S[1..i]$ and $S[j..n]$, what is the expected times that $T$ occurs as a continuous substring of $S'$ if he choose every possible pair of $(i,j)$ $(1\le i\le L, R\le j\le n)$ equiprobably? Your task is to find the answer $E$, and report $E\times L\times(n-R+1)$ to him.

Note : When counting occurrences, $T$ can overlap with each other.

Input

The first line of the input contains an integer $C$ $(1\le C\le 15)$, denoting the number of test cases.

In each test case, there are $3$ integers $n,m,k$ $(1\le n\le 50000, 1\le m\le 100, 1\le k\le 50000)$ in the first line, denoting the length of $S$, the length of $T$ and the number of questions.

In the next line, there is a string $S$ consists of $n$ lower-case English letters.

Then in the next line, there is a string $T$ consists of $m$ lower-case English letters.

In the following $k$ lines, there are $2$ integers $L,R$ $(1\le L<R\le n)$ in each line, denoting a question.

Output

For each question, print a single line containing an integer, denoting the answer.

Sample Input

1
8 5 4
iamnotsb
iamsb
4 7
3 7
3 8
2 7

Sample Output

Source

2017 Multi-University Training Contest - Team 4

————————————————————————————————————

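The problem: given two strings s and t, for every pair (i, j) with 1 <= i <= l and r <= j <= n, keep s[1..i] and s[j..n], concatenate them, and count how many times t occurs in the result. The answer to a query is the sum of these counts over all such pairs.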
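To make the task concrete, here is a minimal brute-force sketch of a single query. It is not the intended solution (it is far too slow for the real limits) and only serves as a reference for checking a faster program on small inputs. The function names `countOccurrences` and `bruteForceQuery` are hypothetical and do not appear in the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count occurrences of t in text; occurrences are allowed to overlap.
long long countOccurrences(const std::string& text, const std::string& t) {
    assert(!t.empty());
    long long count = 0;
    for (std::size_t pos = text.find(t); pos != std::string::npos;
         pos = text.find(t, pos + 1)) {
        ++count;
    }
    return count;
}

// Brute-force answer for one query (l, r), 1-indexed as in the statement:
// the sum, over all 1 <= i <= l and r <= j <= n, of the number of
// occurrences of t in s[1..i] + s[j..n].
long long bruteForceQuery(const std::string& s, const std::string& t,
                          int l, int r) {
    const int n = static_cast<int>(s.size());
    long long total = 0;
    for (int i = 1; i <= l; ++i) {
        for (int j = r; j <= n; ++j) {
            // s[1..i] is s.substr(0, i); s[j..n] is s.substr(j - 1).
            total += countOccurrences(s.substr(0, i) + s.substr(j - 1), t);
        }
    }
    return total;
}
```

On the sample input (s = "iamnotsb", t = "iamsb"), the four queries (4,7), (3,7), (3,8), (2,7) evaluate to 1, 1, 0 and 0 under this sketch: the only pair that produces "iamsb" is i = 3, j = 7.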

Approach: first, here is the official editorial:

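First, preprocess the $next$ array of $T$ with the KMP algorithm, and define:

$pref_i$: the position the KMP pointer has reached after running KMP on prefix $i$ of $S$ against $T$.

$preg_i$: the number of occurrences of $T$ in prefix $i$ of $S$.

$suf_{i,j}$: the number of occurrences of $T$ that are matched when KMP runs over suffix $i$ of $S$ starting with the pointer at $j$.

Then the number of occurrences of $T$ in prefix $i$ concatenated with suffix $j$ is $preg_i+suf_{j,pref_i}$.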
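The definitions above can be sketched directly in code. The following is a small illustration, assuming 1-indexed positions as in the statement and a non-empty pattern. The names `buildFailure`, `step`, `scanPrefixes` and `countAfterCut` are hypothetical helpers, not part of the original solution. `countAfterCut` evaluates $suf_{j,pref_i}$ by simply resuming the matcher, so it illustrates the identity rather than the fast algorithm.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Failure function of t: fail[k] is the length of the longest proper
// border of the first k characters of t.
std::vector<int> buildFailure(const std::string& t) {
    const int m = static_cast<int>(t.size());
    std::vector<int> fail(m + 1, 0);
    for (int i = 2, k = 0; i <= m; ++i) {
        while (k > 0 && t[i - 1] != t[k]) k = fail[k];
        if (t[i - 1] == t[k]) ++k;
        fail[i] = k;
    }
    return fail;
}

// Feed one character to the matcher. `state` is the current pointer
// (0..m-1). Returns 1 if an occurrence of t ends at this character.
int step(const std::string& t, const std::vector<int>& fail,
         int& state, char c) {
    const int m = static_cast<int>(t.size());
    while (state > 0 && t[state] != c) state = fail[state];
    if (t[state] == c) ++state;
    if (state == m) {
        state = fail[state];  // fall back so overlapping matches count
        return 1;
    }
    return 0;
}

struct PrefixInfo {
    std::vector<int> pref;  // pref[i]: pointer after reading s[1..i]
    std::vector<int> preg;  // preg[i]: occurrences of t inside s[1..i]
};

PrefixInfo scanPrefixes(const std::string& s, const std::string& t) {
    assert(!t.empty());
    const int n = static_cast<int>(s.size());
    const std::vector<int> fail = buildFailure(t);
    PrefixInfo info{std::vector<int>(n + 1, 0), std::vector<int>(n + 1, 0)};
    int state = 0;
    for (int i = 1; i <= n; ++i) {
        info.preg[i] = info.preg[i - 1] + step(t, fail, state, s[i - 1]);
        info.pref[i] = state;
    }
    return info;
}

// Occurrences of t in s[1..i] + s[j..n]: preg_i plus the matches found
// by resuming the matcher at pointer pref_i on the suffix s[j..n].
int countAfterCut(const std::string& s, const std::string& t, int i, int j) {
    const PrefixInfo info = scanPrefixes(s, t);
    const std::vector<int> fail = buildFailure(t);
    int state = info.pref[i];
    int count = info.preg[i];
    for (int p = j; p <= static_cast<int>(s.size()); ++p) {
        count += step(t, fail, state, s[p - 1]);
    }
    return count;
}
```

For the sample, the prefix "iam" leaves the pointer at 3, and resuming on the suffix "sb" completes one match, so `countAfterCut("iamnotsb", "iamsb", 3, 7)` gives 1.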

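Now let $preg$ also denote the prefix sum of $preg$ and $suf$ the suffix sum of $suf$ (over its first index), and let $s_{i,j}$ be the number of positions up to $i$ whose $pref$ equals $j$. Then for a query $L, R$, with the original arrays inside the double sum and the accumulated ones on the right-hand side:

$ans=\sum_{i=1}^L\sum_{j=R}^n \left(preg_i+suf_{j,pref_i}\right)=(n-R+1)\,preg_L+\sum_{i=0}^{m-1}s_{L,i}\times suf_{R,i}$

All of these arrays can be computed with KMP and simple recurrences, for a total time complexity of $O((n+k)m)$.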
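The step from the double sum to the closed form can be spelled out as follows. This is a worked expansion using the original (non-accumulated) $preg$ and $suf$ throughout, with $x$ as an auxiliary index that is not in the editorial. The first term does not depend on $j$, so it is simply repeated $n-R+1$ times:

$$\sum_{i=1}^{L}\sum_{j=R}^{n}\left(preg_i+suf_{j,pref_i}\right)=(n-R+1)\sum_{i=1}^{L}preg_i+\sum_{i=1}^{L}\sum_{j=R}^{n}suf_{j,pref_i}$$

For the second term, group the indices $i$ by the value $x=pref_i$, which lies in $0..m-1$. There are exactly $s_{L,x}$ such indices, and each contributes the same inner sum:

$$\sum_{i=1}^{L}\sum_{j=R}^{n}suf_{j,pref_i}=\sum_{x=0}^{m-1}s_{L,x}\cdot\sum_{j=R}^{n}suf_{j,x}$$

Replacing $\sum_{i=1}^{L}preg_i$ by the prefix sum and $\sum_{j=R}^{n}suf_{j,x}$ by the suffix sum gives the formula above, which costs $O(m)$ per query.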

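The s array can be filled in during the KMP scan. suf is computed by a recurrence: for every pointer position in t and every letter a-z, precompute the pointer that results from reading that letter (a new next table), then fill in suf with a DP over that table.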
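As an illustration of that recurrence, here is a compact sketch that builds the transition table and then fills suf from right to left. It assumes lowercase input and a non-empty pattern, and it stops before the suffix-sum step. The names `Automaton`, `go`, `hit`, `buildAutomaton` and `buildSuffixCounts` are hypothetical; in the code below they correspond roughly to `nt2`, `mat` and the first pass over `suf`.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

struct Automaton {
    // go[state][c] : pointer after reading letter 'a' + c at pointer `state`
    // hit[state][c]: 1 if that step completes an occurrence of t
    std::vector<std::array<int, 26>> go, hit;
};

Automaton buildAutomaton(const std::string& t) {
    assert(!t.empty());
    const int m = static_cast<int>(t.size());
    // Standard KMP failure function over the first k characters of t.
    std::vector<int> fail(m + 1, 0);
    for (int i = 2, k = 0; i <= m; ++i) {
        while (k > 0 && t[i - 1] != t[k]) k = fail[k];
        if (t[i - 1] == t[k]) ++k;
        fail[i] = k;
    }
    const std::array<int, 26> zeros{};
    Automaton a;
    a.go.assign(m, zeros);
    a.hit.assign(m, zeros);
    for (int state = 0; state < m; ++state) {
        for (int c = 0; c < 26; ++c) {
            const char ch = static_cast<char>('a' + c);
            int k = state;
            while (k > 0 && t[k] != ch) k = fail[k];
            if (t[k] == ch) ++k;
            if (k == m) {  // full match: record it and fall back
                k = fail[k];
                a.hit[state][c] = 1;
            }
            a.go[state][c] = k;
        }
    }
    return a;
}

// suf[i][state] for 1 <= i <= n + 1: occurrences found when the matcher
// resumes at pointer `state` and reads s[i..n]. Row n + 1 is all zeros.
std::vector<std::vector<int>> buildSuffixCounts(const std::string& s,
                                                const std::string& t) {
    const Automaton a = buildAutomaton(t);
    const int n = static_cast<int>(s.size());
    const int m = static_cast<int>(t.size());
    std::vector<std::vector<int>> suf(n + 2, std::vector<int>(m, 0));
    for (int i = n; i >= 1; --i) {
        const int c = s[i - 1] - 'a';  // assumes lowercase letters
        for (int state = 0; state < m; ++state) {
            suf[i][state] = a.hit[state][c] + suf[i + 1][a.go[state][c]];
        }
    }
    return suf;
}
```

Each entry is filled in O(1) from the row to its right, so the whole table takes O(nm) time once the O(26·m) transitions are known.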

See the code for details:

#include <cstdio>
#include <cstring>

using namespace std;

#define LL long long

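// s and t are read 1-indexed; the bounds cover n <= 50000 and m <= 100.
char s[50005],t[505];
// nt[i]     : KMP failure function of t
// pren[i]   : occurrences of t in s[1..i] (turned into a prefix sum later)
// suf[i][j] : matches on s[i..n] when KMP resumes at pointer j (suffix-summed later)
// sx[i][j]  : how many of the prefixes 1..i leave the KMP pointer at j
// mat[j][c] : 1 if reading character c at pointer j completes an occurrence of t
// nt2[j][c] : KMP pointer after reading character c at pointer j
int nt[505],pren[50005],suf[50005][505],sx[50005][505],mat[505][505],nt2[505][505];

// Compute the failure function of t[1..x].
void get_next(int x)
{
    nt[0]=0;
    nt[1]=0;
    for(int i=2; i<=x; i++)
    {
        int k=nt[i-1];
        while(k>0&&t[i]!=t[k+1])
            k=nt[k];
        nt[i]=t[i]==t[k+1]?k+1:0;
    }
}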

int main()
{
    int m,n,q,T;
    for(scanf("%d",&T); T--;)
    {
        scanf("%d%d%d",&n,&m,&q);
        scanf("%s%s",s+1,t+1);
        get_next(m);
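        memset(sx,0,sizeof sx);
        memset(pren,0,sizeof pren);
        // Forward KMP scan of s: j is the pointer after reading s[1..i].
        for(int i=1,j=0; i<=n; i++)
        {
            while(j&&s[i]!=t[j+1])
                j=nt[j];
            if(s[i]==t[j+1])
                j++;
            pren[i]=pren[i-1];
            // Full match: count it and fall back so overlapping matches are found.
            if(j==m)
                pren[i]++,j=nt[j];
            // sx[i] is sx[i-1] plus one more prefix that ends at pointer j.
            for(int kk=0; kk<m; kk++)
                sx[i][kk]=sx[i-1][kk];
            sx[i][j]++;
        }
        // Turn pren into a prefix sum: pren[l] = sum of the counts for i = 1..l.
        for(int i=1; i<=n; i++)
            pren[i]+=pren[i-1];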

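        // Transition table: from pointer i, reading character j leads to
        // pointer nt2[i][j]; mat[i][j] is 1 if that step completes a match.
        memset(mat,0,sizeof mat);
        for(int i=0; i<m; i++)
            for(int j='a'; j<='z'; j++)
            {
                int k=i;
                while(k&&t[k+1]!=j)
                    k=nt[k];
                if(t[k+1]==j)
                    k++;
                if(k==m)
                    k=nt[k],mat[i][j]=1;
                nt2[i][j]=k;
            }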
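        // suf[i][j]: matches found on s[i..n] when KMP resumes at pointer j.
        // Row n+1 stays zero and serves as the base case.
        memset(suf,0,sizeof suf);
        for(int i=n; i>0; i--)
        {
            for(int j=0; j<m; j++)
                suf[i][j]=mat[j][s[i]]+suf[i+1][nt2[j][s[i]]];
        }
        // Turn suf into a suffix sum over i: suf[r][j] = sum for i = r..n.
        for(int i=n; i>0; i--)
        {
            for(int j=0; j<m; j++)
                suf[i][j]+=suf[i+1][j];
        }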
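        while(q--)
        {
            int l,r;
            scanf("%d%d",&l,&r);

            // Prefix part: each prefix count is repeated for all n-r+1 suffixes.
            LL ans=1LL*(n-r+1)*pren[l];
            // Suffix part: group the prefixes 1..l by their KMP pointer i.
            for(int i=0; i<m; i++)
            {
                ans+=1LL*sx[l][i]*suf[r][i];
            }
            printf("%lld\n",ans);
        }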
    }
    return 0;
}