POJ2774 Suffix Array

Long Long Message

Description

The little cat is majoring in physics in the capital of Byterland. A piece of sad news comes to him these days: his mother is getting ill. Being worried about spending so much on railway tickets (Byterland is such a big country, and he has to spend 16 hours on train to his hometown), he decided only to send SMS with his mother.

The little cat lives in an unrich family, so he frequently comes to the mobile service center, to check how much money he has spent on SMS. Yesterday, the computer of service center was broken, and printed two very long messages. The brilliant little cat soon found out:

1. All characters in messages are lowercase Latin letters, without punctuations and spaces.
2. All SMS has been appended to each other – (i+1)-th SMS comes directly after the i-th one – that is why those two messages are quite long.
3. His own SMS has been appended together, but possibly a great many redundancy characters appear leftwards and rightwards due to the broken computer.
E.g: if his SMS is “motheriloveyou”, either long message printed by that machine, would possibly be one of “hahamotheriloveyou”, “motheriloveyoureally”, “motheriloveyouornot”, “bbbmotheriloveyouaaa”, etc.
4. For these broken issues, the little cat has printed his original text twice (so there appears two very long messages). Even though the original text remains the same in two printed messages, the redundancy characters on both sides would be possibly different.

You are given those two very long messages, and you have to output the length of the longest possible original text written by the little cat.

Background:
The SMS in Byterland mobile service are charging in dollars-per-byte. That is why the little cat is worrying about how long could the longest original text be.

Why ask you to write a program? There are four reasons:
1. The little cat is so busy these days with physics lessons;
2. The little cat wants to keep what he said to his mother secret;
3. POJ is such a great Online Judge;
4. The little cat wants to earn some money from POJ, and try to persuade his mother to see the doctor :(

Input

Two strings with lowercase letters on two of the input lines individually. Number of characters in each one will never exceed 100000.

Output

A single line with a single integer number – what is the maximum length of the original text written by the little cat.

Sample Input

yeshowmuchiloveyoumydearmotherreallyicannotbelieveit
yeaphowmuchiloveyoumydearmother

Sample Output

27
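Problem: find the length of the longest common substring of the two strings.
Approach: suffix array. (Excerpted from Luo Suiqian's national training team paper.) Every substring of a string is a prefix of some suffix of that string. Finding the longest common substring of A and B is therefore equivalent to finding the maximum, over all suffixes of A and all suffixes of B, of their longest common prefix. Enumerating every pair of suffixes of A and B is clearly too slow. Since we need the longest common prefix between suffixes of A and suffixes of B, first append the second string to the first, separated by a character that appears in neither, and then build the suffix array of this new string. Next, look for a pattern in that suffix array. Take A = "aaaba" and B = "abaa" as an example, as shown in Figure 8.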
 
[Figure 8 (image not reproduced): poj 2774: Long Long Message (suffix array)]
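The equivalence stated above (longest common substring = maximum longest common prefix over one suffix of A and one suffix of B) can be checked directly on small inputs. The following is a minimal brute-force sketch, not the method used in this post; the helper names `lcpAt` and `longestCommonSubstringBrute` are made up for illustration, and the running time is far too slow for the real limits of 100000 characters.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Length of the longest common prefix of the suffix of a starting at i
// and the suffix of b starting at j.
static std::size_t lcpAt(const std::string& a, std::size_t i,
                         const std::string& b, std::size_t j) {
    assert(i <= a.size() && j <= b.size());
    std::size_t k = 0;
    while (i + k < a.size() && j + k < b.size() && a[i + k] == b[j + k]) ++k;
    return k;
}

// Brute-force reference (hypothetical helper): try every pair consisting of
// a suffix of a and a suffix of b and keep the longest common prefix found.
std::size_t longestCommonSubstringBrute(const std::string& a,
                                        const std::string& b) {
    std::size_t best = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < b.size(); ++j)
            best = std::max(best, lcpAt(a, i, b, j));
    return best;
}
```

For A = "aaaba" and B = "abaa" this returns 3 (the common substring "aba"). A reference like this is mainly useful for testing a faster solution on small random strings.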
    
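    So is the maximum of all height values the answer? Not necessarily! The two suffixes may come from the same string, so height[i] qualifies only when suffix(sa[i-1]) and suffix(sa[i]) are not suffixes of the same original string. The maximum among the qualifying values is the answer. Let |A| and |B| be the lengths of strings A and B. Building the suffix array and the height array of the new string takes O(|A|+|B|) time with a linear-time construction such as DC3 (prefix doubling, which the code in this post uses, takes O(n log n) for a string of length n). Finding the maximum height among rank-adjacent suffixes that come from different original strings also takes O(|A|+|B|), so the whole method runs in O(|A|+|B|) time. This matches the lower bound, which makes it a very good algorithm.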
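To make the "adjacent suffixes from different strings" rule concrete, here is a small sketch that builds the suffix array of the concatenated string by plain sorting and then scans neighbouring ranks. It is an illustration under stated assumptions, not the efficient construction: the function name `longestCommonSubstringNaiveSA` is hypothetical, `'#'` is assumed not to occur in either input, and sorting suffixes by direct comparison is only acceptable for short strings.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Naive illustration (hypothetical helper): sort the suffixes of a + '#' + b
// directly, then take the largest LCP among rank-adjacent suffixes that
// start in different original strings.
std::size_t longestCommonSubstringNaiveSA(const std::string& a,
                                          const std::string& b) {
    // Assumption: the separator '#' appears in neither input.
    assert(a.find('#') == std::string::npos &&
           b.find('#') == std::string::npos);
    const std::string t = a + '#' + b;
    const std::size_t n = t.size();

    // sa[r] = starting index of the suffix with rank r.
    std::vector<std::size_t> sa(n);
    std::iota(sa.begin(), sa.end(), std::size_t{0});
    std::sort(sa.begin(), sa.end(), [&](std::size_t x, std::size_t y) {
        return t.compare(x, std::string::npos, t, y, std::string::npos) < 0;
    });

    // Index a.size() holds the separator and belongs to neither string.
    auto inA = [&](std::size_t pos) { return pos < a.size(); };
    auto inB = [&](std::size_t pos) { return pos > a.size(); };

    std::size_t best = 0;
    for (std::size_t r = 1; r < n; ++r) {
        const std::size_t p = sa[r - 1], q = sa[r];
        const bool cross = (inA(p) && inB(q)) || (inB(p) && inA(q));
        if (!cross) continue;  // both suffixes come from the same string
        std::size_t h = 0;     // height of this adjacent pair
        while (p + h < n && q + h < n && t[p + h] == t[q + h]) ++h;
        best = std::max(best, h);
    }
    return best;
}
```

The separator guarantees that a match between a suffix of A and a suffix of B can never run past the end of A, since only the suffix starting in A contains `'#'`.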
 
/*
The suffix array part is a standard template that produces the arrays we need.
This problem uses two of them: sa[] and height[].
sa[i] is the starting index of the suffix that ranks i-th when all suffixes are
sorted lexicographically. height[i] is the length of the longest common prefix
of the suffixes ranked i and i-1.
*/

#include <cstdio>
#include <cstring>
#include <algorithm>
using namespace std;

const int MAX = 2e5 + 10;  // |A| + |B| + separator + sentinel <= 200002
char str[MAX];
int s[MAX], sa[MAX], t[MAX], t2[MAX], c[MAX], n;
int rank1[MAX], height[MAX];

// Prefix doubling with counting sort, O(n log n).
// m is the alphabet size: every s[i] must be in [0, m).
void build_sa(int m) {
    int *x = t, *y = t2;
    // Sort suffixes by their first character.
    for (int i = 0; i < m; i++) c[i] = 0;
    for (int i = 0; i < n; i++) c[x[i] = s[i]]++;
    for (int i = 1; i < m; i++) c[i] += c[i - 1];
    for (int i = n - 1; i >= 0; i--) sa[--c[x[i]]] = i;
    for (int k = 1; k <= n; k <<= 1) {
        int p = 0;
        // y lists the suffixes ordered by their second key (rank of i + k).
        for (int i = n - k; i < n; i++) y[p++] = i;
        for (int i = 0; i < n; i++) if (sa[i] >= k) y[p++] = sa[i] - k;
        // Stable counting sort by the first key.
        for (int i = 0; i < m; i++) c[i] = 0;
        for (int i = 0; i < n; i++) c[x[y[i]]]++;
        for (int i = 1; i < m; i++) c[i] += c[i - 1];
        for (int i = n - 1; i >= 0; i--) sa[--c[x[y[i]]]] = y[i];
        // Recompute ranks for prefixes of length 2k. The unique sentinel at the
        // end keeps the index sa[i] + k in range whenever the first keys match.
        swap(x, y);
        p = 1;
        x[sa[0]] = 0;
        for (int i = 1; i < n; i++)
            x[sa[i]] = y[sa[i - 1]] == y[sa[i]] && y[sa[i - 1] + k] == y[sa[i] + k] ? p - 1 : p++;
        if (p >= n) break;  // all ranks are distinct, sorting is done
        m = p;
    }
}

// Computes height[] in O(n) from sa[].
void getHeight() {
    int k = 0;
    for (int i = 0; i < n; i++) rank1[sa[i]] = i;
    height[0] = 0;
    for (int i = 0; i < n; i++) {
        // The smallest suffix has no predecessor; skipping it avoids reading sa[-1].
        if (rank1[i] == 0) { k = 0; continue; }
        if (k) k--;
        int j = sa[rank1[i] - 1];
        while (s[i + k] == s[j + k]) k++;  // stops at the unique sentinel at the latest
        height[rank1[i]] = k;
    }
}

int main() {
    n = 0;
    int l1, l2;
    // Letters map to 1..26, the separator is 28, the end sentinel is 0.
    scanf("%s", str);
    l1 = strlen(str);
    for (int i = 0; i < l1; i++) s[n++] = str[i] - 'a' + 1;
    s[n++] = 28;
    scanf("%s", str);
    l2 = strlen(str);
    for (int i = 0; i < l2; i++) s[n++] = str[i] - 'a' + 1;
    s[n++] = 0;
    build_sa(30);
    getHeight();
    int maxn = 0;
    // Index l1 is the separator: positions < l1 belong to A, positions > l1 to B.
    for (int i = 1; i < n; i++) {
        bool cross = (sa[i] < l1 && sa[i - 1] > l1) ||
                     (sa[i - 1] < l1 && sa[i] > l1);
        if (cross) maxn = max(maxn, height[i]);
    }
    printf("%d\n", maxn);
    return 0;
}

 

 

Reposted from: https://www.cnblogs.com/xingkongyihao/p/7257470.html
