HDU 2473: Deletion in a Union-Find (Disjoint Set)

License: CC 4.0 BY-SA

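This post solves HDU 2473 (Junk-Mail Filter) with a union-find structure. The problem needs a delete operation, which plain union-find does not support, so every email is given a virtual parent node that makes removing a single email safe.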

Junk-Mail Filter

Time Limit: 15000/8000 MS (Java/Others) Memory Limit: 32768/32768 K (Java/Others)
Total Submission(s): 11996 Accepted Submission(s): 3752

Problem Description

Recognizing junk mails is a tough task. The method used here consists of two steps:
1) Extract the common characteristics from the incoming email.
2) Use a filter matching the set of common characteristics extracted to determine whether the email is a spam.

We want to extract the set of common characteristics from the N sample junk emails available at the moment, and thus having a handy data-analyzing tool would be helpful. The tool should support the following kinds of operations:

a) “M X Y”, meaning that we think that the characteristics of spam X and Y are the same. Note that the relationship defined here is transitive, so relationships (other than the one between X and Y) need to be created if they are not present at the moment.

b) “S X”, meaning that we think spam X had been misidentified. Your tool should remove all relationships that spam X has when this command is received; after that, spam X will become an isolated node in the relationship graph.

Initially no relationships exist between any pair of the junk emails, so the number of distinct characteristics at that time is N.
Please help us keep track of any necessary information to solve our problem.

Input

There are multiple test cases in the input file.
Each test case starts with two integers, N and M (1 ≤ N ≤ 10^5, 1 ≤ M ≤ 10^6), the number of email samples and the number of operations. M lines follow, each line is one of the two formats described above.
Two successive test cases are separated by a blank line. A case with N = 0 and M = 0 indicates the end of the input file, and should not be processed by your program.

Output

For each test case, please print a single integer, the number of distinct common characteristics, to the console. Follow the format as indicated in the sample below.

Sample Input

5 6
M 0 1
M 1 2
M 1 3
S 1
M 1 2
S 3

3 1
M 1 2

0 0

Sample Output

Case #1: 3
Case #2: 2

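Problem summary: there are n junk emails. We are told that some of them share the same characteristics, but some of those judgments turn out to be mistaken. The question is how many distinct groups of emails remain at the end. n and m are the number of emails and the number of operations. "M a b" means emails a and b are similar. "S a" means everything recorded so far about a is wrong, so a is taken out of its group and treated as a new, separate email.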

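Sample explanation: in the first case, emails 0, 1, 2 and 3 first end up in one group. Then email 1 is found to be misjudged and is removed from the group, but 0, 2 and 3 are still similar. Next, 1 and 2 are declared similar, which puts 1 back into the original group, and finally 3 is removed. The answer is 3: (0, 1, 2), (3), (4).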

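Why 0, 2 and 3 stay together: 0 is similar to 1 and 1 is similar to 2, so 0 and 2 are similar by transitivity. That derived relationship survives even after 1 is judged to be a mistake; only the relationships that 1 itself has are removed.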
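To pin down these semantics before optimizing anything, here is a small brute-force reference. It is only a sketch for checking small cases, not the approach used in the accepted solution: each email carries a group label, "M" relabels a whole group in O(n), and "S" hands the email a label nobody else has. The names `Op` and `countGroupsBrute` are made up for this illustration.

```cpp
#include <cassert>
#include <set>
#include <vector>

// One operation: type 'M' uses x and y, type 'S' uses only x.
struct Op { char type; int x; int y; };

// Brute-force reference: label[i] is the group of email i.
// 'M' relabels a whole group (O(n)); 'S' gives x a label nobody else has.
int countGroupsBrute(int n, const std::vector<Op>& ops) {
    std::vector<int> label(n);
    for (int i = 0; i < n; ++i) label[i] = i;
    int nextLabel = n;                          // fresh labels for removed emails
    for (const Op& op : ops) {
        assert(op.x >= 0 && op.x < n);
        if (op.type == 'M') {
            assert(op.y >= 0 && op.y < n);
            int from = label[op.x], to = label[op.y];
            if (from == to) continue;
            for (int& l : label) if (l == from) l = to;
        } else {
            label[op.x] = nextLabel++;          // x leaves; the others keep their label
        }
    }
    // The number of distinct labels is the number of groups.
    return (int)std::set<int>(label.begin(), label.end()).size();
}
```

Because removing an email only changes that email's own label, the other members of its group stay together, which is exactly the behavior described above. At O(n) per merge this is far too slow for the real limits, but it is handy for cross-checking a faster solution on small inputs.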

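Approach: use a union-find. Because the problem has a delete operation, we introduce the idea of a virtual parent node (which I picked up from other people's blogs). Every email gets a box. The box is the email's parent, and the box's parent is the box itself (it is its own father). Similar emails are put into the same box by linking one box under the other. Removing an email means taking it out of its box and creating a brand-new box for it. It cannot go back into its original box: when two emails are merged, find follows the parent of the box as well, so the original box may already hang under another box and still belong to the old group. The email in its new box can then be merged and removed again like any other.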
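The box idea can be sketched as a small self-contained structure. This is a minimal illustration under a few assumptions, not the accepted submission: the names `BoxedDSU`, `isolate` and `countGroups` are invented here, and a `std::vector` that grows on each removal stands in for a fixed-size array.

```cpp
#include <cassert>
#include <vector>

// Union-find where every email lives inside a "box" (virtual parent).
// Nodes 0..n-1 are emails; every index >= n is a box.
struct BoxedDSU {
    int n;
    std::vector<int> parent;

    explicit BoxedDSU(int n_) : n(n_), parent(2 * n_) {
        for (int i = 0; i < n; ++i) {
            parent[i] = n + i;       // email i starts in its own box
            parent[n + i] = n + i;   // a box is its own parent
        }
    }

    int find(int x) {
        int root = x;
        while (parent[root] != root) root = parent[root];
        while (x != root) {          // path compression
            int next = parent[x];
            parent[x] = root;
            x = next;
        }
        return root;
    }

    void unite(int x, int y) {
        assert(0 <= x && x < n && 0 <= y && y < n);
        int a = find(x), b = find(y);
        if (a != b) parent[a] = b;   // only boxes are ever linked under boxes
    }

    void isolate(int x) {
        assert(0 <= x && x < n);
        int box = (int)parent.size();
        parent.push_back(box);       // brand-new box, its own parent
        parent[x] = box;             // no node ever has an email as its parent
    }

    int countGroups() {
        std::vector<char> seen(parent.size(), 0);
        int groups = 0;
        for (int i = 0; i < n; ++i) {
            int r = find(i);
            if (!seen[r]) { seen[r] = 1; ++groups; }
        }
        return groups;
    }
};
```

The key invariant is that emails are always leaves: roots and interior nodes are boxes, so re-pointing one email never drags any other email along with it.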

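If we removed a node simply by changing that node's own parent in a plain union-find, every node whose parent chain passes through it would be affected too. Put bluntly, that gets Wrong Answer.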
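The failure is easy to reproduce on the first sample. The sketch below uses a plain union-find without boxes and without path compression (so the tree shape stays predictable); `NaiveDSU`, `naiveRemove` and `naiveKeepsZeroAndTwoTogether` are hypothetical names used only for this demonstration.

```cpp
#include <cassert>
#include <vector>

// Plain union-find WITHOUT virtual parents, used only to show the failure.
struct NaiveDSU {
    std::vector<int> parent;
    explicit NaiveDSU(int n) : parent(n) {
        for (int i = 0; i < n; ++i) parent[i] = i;
    }
    // No path compression, so the tree shape stays easy to follow.
    int find(int x) const {
        while (parent[x] != x) x = parent[x];
        return x;
    }
    void unite(int x, int y) {
        int a = find(x), b = find(y);
        if (a != b) parent[a] = b;
    }
    // Naive deletion: just make x its own parent again.
    void naiveRemove(int x) {
        assert(x >= 0 && x < (int)parent.size());
        parent[x] = x;
    }
};

// Builds the chain 0 -> 1 -> 2 -> 3 from the first sample, then deletes email 1.
// Returns true if emails 0 and 2 are still reported as one group (the correct result).
bool naiveKeepsZeroAndTwoTogether() {
    NaiveDSU d(5);
    d.unite(0, 1);   // parent[0] = 1
    d.unite(1, 2);   // parent[1] = 2
    d.unite(1, 3);   // parent[2] = 3
    d.naiveRemove(1);
    return d.find(0) == d.find(2);
}
```

After the naive removal, email 0 still points at email 1, so it is cut off from 2 and 3 and stays glued to the email that was supposed to become isolated. The function therefore returns false, which is the wrong grouping.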

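Pitfall: make the arrays big enough. Besides the N emails there are N initial boxes, and every S operation creates one more box, so in the worst case there are 2N + M nodes. An array sized only for the emails (about 10^5) is far too small.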
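As a quick sanity check on the array size, the worst case can be computed directly. This helper is only illustrative (the name `requiredNodes` is made up) and assumes every one of the M operations could be an "S".

```cpp
#include <cassert>

// Worst-case node count: n emails + n initial boxes + one new box per "S",
// assuming all m operations are "S".
long long requiredNodes(long long n, long long m) {
    assert(n >= 0 && m >= 0);
    return 2 * n + m;
}
```

For the stated limits, N = 10^5 and M = 10^6, this gives 1,200,000 nodes, so a little over 1.2 million entries is a safe size.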

Accepted code:

#include <cstdio>
#include <cstring>

// Nodes: n emails + n initial boxes + one new box per "S" operation (at most m).
// 1e5 is far too small (that mistake cost six time-limit-exceeded submissions).
const int maxn = 2 * 100000 + 1000000 + 10;
int pre[maxn];          // pre[v] is the parent of node v
int nextBox, n, m;      // nextBox is the index of the next unused box
bool vis[maxn];

// Return the root box of x, compressing the path on the way.
int find(int x)
{
    int i, j = x;
    while (j != pre[j]) j = pre[j];
    while (x != j) { i = pre[x]; pre[x] = j; x = i; }
    return j;
}

// Emails are nodes 0..n-1; email i starts inside its own box i+n.
void Init()
{
    nextBox = n;
    for (int i = 0; i < n; i++)
    {
        pre[i] = nextBox++;     // the parent of an email is its box
        pre[i + n] = i + n;     // a box is its own parent
    }
}

// Merge the boxes that contain x and y.
void unite(int x, int y)
{
    int a = find(x);
    int b = find(y);
    if (a == b)
        return;
    pre[a] = b;
}

// Removing an email means creating a new box and putting the email into it.
void del(int x)
{
    pre[x] = nextBox;
    pre[nextBox] = nextBox;     // the new box is its own parent
    nextBox++;
}

int main()
{
    char t[2];
    int x, y;
    int k = 1;
    while (scanf("%d%d", &n, &m) == 2 && (n || m))
    {
        Init();
        for (int i = 1; i <= m; i++)
        {
            scanf("%1s", t);
            if (t[0] == 'M')
            {
                scanf("%d%d", &x, &y);
                unite(x, y);
            }
            else
            {
                scanf("%d", &x);
                del(x);
            }
        }
        // Count how many distinct boxes still hold at least one email.
        memset(vis, false, sizeof(bool) * nextBox);
        int ans = 0;
        for (int i = 0; i < n; i++)
        {
            int root = find(i);
            if (!vis[root])
            {
                vis[root] = true;
                ans++;
            }
        }
        printf("Case #%d: %d\n", k++, ans);
    }
    return 0;
}