Problem Description
Recognizing junk mails is a tough task. The method used here consists of two steps:
- Extract the common characteristics from the incoming email.
- Use a filter matching the set of common characteristics extracted to determine whether the email is a spam.
We want to extract the set of common characteristics from the N sample junk emails available at the moment, and thus having a handy data-analyzing tool would be helpful. The tool should support the following kinds of operations:
a) “M X Y”, meaning that we think that the characteristics of spam X and Y are the same. Note that the relationship defined here is transitive, so relationships (other than the one between X and Y) need to be created if they are not present at the moment.
b) “S X”, meaning that we think spam X had been misidentified. Your tool should remove all relationships that spam X has when this command is received; after that, spam X will become an isolated node in the relationship graph.
Initially no relationships exist between any pair of the junk emails, so the number of distinct characteristics at that time is N.
Please help us keep track of any necessary information to solve our problem.
Input
There are multiple test cases in the input file.
Each test case starts with two integers, N and M (1 ≤ N ≤ 10^5, 1 ≤ M ≤ 10^6), the number of email samples and the number of operations. M lines follow, each line is one of the two formats described above.
Two successive test cases are separated by a blank line. A case with N = 0 and M = 0 indicates the end of the input file, and should not be processed by your program.
Output
For each test case, please print a single integer, the number of distinct common characteristics, to the console. Follow the format as indicated in the sample below.
Sample Input
5 6
M 0 1
M 1 2
M 1 3
S 1
M 1 2
S 3
3 1
M 1 2
0 0
Sample Output
Case #1: 3
Case #2: 2
Problem summary
There are two kinds of operations:
1. Merge the sets containing a and b
2. Remove a from the set it currently belongs to
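To pin down what the two operations mean, here is a minimal brute-force sketch. It is not the intended solution: the struct `NaiveGroups` and its member names are made up for illustration, and each merge costs O(N), which is far too slow for the stated limits. It is only meant as a reference to check a faster solution against on small inputs.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Brute-force reference (hypothetical names, for cross-checking only).
// label[x] identifies the group of email x; equal labels mean "same group".
struct NaiveGroups {
    std::vector<int> label;
    int nextLabel;  // next label value that no email has used yet

    explicit NaiveGroups(int n) : label(n), nextLabel(n) {
        for (int i = 0; i < n; i++) label[i] = i;  // every email starts alone
    }

    // "M x y": relabel every member of x's group with y's label, O(N).
    void merge(int x, int y) {
        assert(x >= 0 && x < (int)label.size());
        assert(y >= 0 && y < (int)label.size());
        int from = label[x], to = label[y];
        if (from == to) return;
        for (int &l : label)
            if (l == from) l = to;
    }

    // "S x": give x a label nobody else has, O(1).
    void isolate(int x) {
        assert(x >= 0 && x < (int)label.size());
        label[x] = nextLabel++;
    }

    // The answer is the number of distinct labels still in use.
    int countGroups() const {
        return (int)std::set<int>(label.begin(), label.end()).size();
    }
};
```

Replaying the six operations of the first sample on `NaiveGroups(5)` leaves the groups {0, 1, 2}, {3} and {4}, which matches the expected answer of 3.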
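Analysis
Curse the author of the "500 Graph Theory Problems" list: why is it that every time I look at a problem I also get to see the solution?!
Still, I have to admit the idea is pretty neat.
How do we remove a from its set?
If a is not a parent node (no other node points to it), a single statement, f[a] = a, is clearly enough.
If a is a parent node, it is hard to keep the relationships among the other nodes in the set intact.
So the idea is to make sure a is never a parent node. Whenever a merge would otherwise make a real node the root, we just create a virtual node and use that as the root instead! This keeps the operations simple, convenient and fast.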
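The difference between the two cases can be seen in a tiny experiment. The sketch below uses a plain union-find without virtual nodes; `PlainDsu`, `naiveRemove` and the two helper functions are names invented for this illustration.

```cpp
#include <numeric>
#include <vector>

// Plain union-find with no virtual nodes (illustration only).
struct PlainDsu {
    std::vector<int> f;  // parent pointers
    explicit PlainDsu(int n) : f(n) { std::iota(f.begin(), f.end(), 0); }
    int find(int x) { return f[x] == x ? x : f[x] = find(f[x]); }
    void merge(int a, int b) {
        int ra = find(a), rb = find(b);
        f[ra] = rb;  // the root of a's set now hangs under the root of b's set
    }
    void naiveRemove(int a) { f[a] = a; }  // the one-line removal from the text
    bool same(int a, int b) { return find(a) == find(b); }
};

// Removing a leaf: node 0 hangs under root 1, so f[0] = 0 really detaches it.
bool removalOfLeafWorks() {
    PlainDsu d(3);
    d.merge(0, 1);     // f[0] = 1
    d.naiveRemove(0);  // f[0] = 0
    return !d.same(0, 1);
}

// Removing a root: f[1] is already 1, so nothing changes and
// node 0 still reports node 1 as its representative.
bool removalOfRootFails() {
    PlainDsu d(3);
    d.merge(0, 1);     // node 1 is the root, node 0 its child
    d.naiveRemove(1);  // no effect
    return d.same(0, 1);
}
```

Both functions return true: the one-liner is correct for a leaf and silently does nothing for a root. Keeping every real node a leaf is exactly what makes the one-liner safe in all cases.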
The code is as follows:
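#include <bits/stdc++.h>
#define N 2000005
using namespace std;

int f[N];      // parent pointers: 1..n are real emails, n+1 and above are virtual roots
int v[N];      // v[r] is nonzero once root r has been counted
int t, cnt, s; // case number, answer, index of the last virtual node created
char o[3];

// Find with path compression.
int find(int x){
    return x == f[x] ? x : f[x] = find(f[x]);
}

int main(){
    int i, n, m, a, b;
    // Stop on the terminating "0 0" line or on a failed read.
    while(scanf("%d%d", &n, &m) == 2 && (n || m)){
        t++;
        cnt = 0;
        s = n;
        // Each operation creates at most one virtual node, so n + m slots are enough.
        for(i = 1; i <= m + n; i++){
            f[i] = i;
            v[i] = 0;
        }
        while(m--){
            scanf("%s", o);
            if(o[0] == 'M'){
                scanf("%d%d", &a, &b);
                a++, b++;                 // input is 0-based, the arrays are 1-based
                a = find(a); b = find(b);
                if(a <= n && b <= n){
                    // Both roots are real nodes, i.e. isolated emails:
                    // hang them under a fresh virtual root.
                    f[a] = f[b] = ++s;
                }
                else if(a > n) f[b] = a;  // a is a virtual root: attach b's root to it
                else if(b > n) f[a] = b;  // only b is a virtual root
            }
            else{
                scanf("%d", &a);
                a++;
                f[a] = a;                 // real nodes are always leaves, so only a is detached
            }
        }
        // Count the distinct roots reachable from the real nodes.
        for(i = 1; i <= n; i++){
            if(!v[find(i)]){
                v[find(i)] = 1;
                cnt++;
            }
        }
        printf("Case #%d: %d\n", t, cnt);
    }
    return 0;
}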
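For comparison, the same "never delete a node that others point to" idea is often written in a slightly different form: instead of adding virtual roots, each email gets an alias into the union-find, and removing an email just points its alias at a fresh, unused slot. The sketch below is that alternative formulation, not the code above; `AliasDsu`, `id`, `isolate` and the `maxRemovals` parameter are names assumed for illustration.

```cpp
#include <cassert>
#include <numeric>
#include <set>
#include <vector>

// Alias-based deletion (an alternative formulation, names are hypothetical).
// id[x] is the union-find slot that currently stands for email x.
struct AliasDsu {
    std::vector<int> f;   // parent pointers over slots
    std::vector<int> id;  // email -> current slot
    int next;             // next unused slot

    // maxRemovals bounds how many fresh slots can ever be needed.
    AliasDsu(int n, int maxRemovals) : f(n + maxRemovals), id(n), next(n) {
        std::iota(f.begin(), f.end(), 0);
        std::iota(id.begin(), id.end(), 0);
    }

    // Find with path halving.
    int find(int x) {
        while (f[x] != x) {
            f[x] = f[f[x]];
            x = f[x];
        }
        return x;
    }

    // "M a b": union the slots that currently stand for a and b.
    void merge(int a, int b) {
        int ra = find(id[a]), rb = find(id[b]);
        f[ra] = rb;
    }

    // "S a": the old slot stays in the tree as a placeholder,
    // and email a moves to a brand-new singleton slot.
    void isolate(int a) {
        assert(next < (int)f.size());
        id[a] = next++;
    }

    // Count distinct roots over the slots that emails currently occupy.
    int countSets() {
        std::set<int> roots;
        for (int slot : id) roots.insert(find(slot));
        return (int)roots.size();
    }
};
```

On the first sample, `AliasDsu d(5, 6)` followed by the six operations gives `d.countSets() == 3`. The trade-off against virtual roots is mostly a matter of taste: this version needs one extra array, but merging no longer has to distinguish real nodes from virtual ones.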