Given a set of n DNA samples, each a string over the characters {A, C, G, T}, we want to find a subset of the samples that maximizes the length of the subset's longest common prefix multiplied by the number of samples in the subset.
To be specific, let the samples be:
ACGT
ACGTGCGT
ACCGTGC
ACGCCGT
If we take the subset {ACGT}, the result is 4 (4 * 1). If we take {ACGT, ACGTGCGT, ACGCCGT}, the result is 3 * 3 = 9 (since ACG is the common prefix). If we take {ACGT, ACGTGCGT, ACCGTGC, ACGCCGT}, the result is 2 * 4 = 8 (since AC is the common prefix).
Now your task is to report the maximum result we can get from the samples.
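To make the scoring rule concrete, here is a minimal sketch that computes the result for one given subset. The names `commonPrefixLength` and `subsetScore` are made up for illustration; this only checks the arithmetic of the example above and is not a way to solve the full problem, since it does not search over subsets.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Length of the longest common prefix of all strings in `subset`.
// Returns 0 for an empty subset.
std::size_t commonPrefixLength(const std::vector<std::string>& subset) {
    if (subset.empty()) return 0;
    std::size_t len = subset[0].size();
    for (const std::string& s : subset) {
        len = std::min(len, s.size());
        std::size_t i = 0;
        // Compare against the first string, up to the current prefix length.
        while (i < len && s[i] == subset[0][i]) ++i;
        len = i;
    }
    return len;
}

// Score as defined in the statement:
// (common prefix length) * (number of samples in the subset).
std::size_t subsetScore(const std::vector<std::string>& subset) {
    return commonPrefixLength(subset) * subset.size();
}
```

Calling `subsetScore` on the three subsets from the example gives 4, 9 and 8 respectively.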
Input starts with an integer T (≤ 10), denoting the number of test cases.
Each case starts with a line containing an integer n (1 ≤ n ≤ 50000) denoting the number of DNA samples. Each of the next n lines contains a non-empty string whose length is not greater than 50. The strings contain only characters from {A, C, G, T}.
For each case, print the case number and the maximum result that can be obtained.
Sample Input:
3
4
ACGT
ACGTGCGT
ACCGTGC
ACGCCGT
3
CGCGCGCGCGCGCCCCGCCCGCGC
CGCGCGCGCGCGCCCCGCCCGCAC
CGCGCGCGCGCGCCCCGCCCGCTC
2
CGCGCCGCGCGCGCGCGCGC
GGCGCCGCGCGCGCGCGCTC
Sample Output:
Case 1: 9
Case 2: 66
Case 3: 20
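Problem summary: given several DNA sequences (containing only the characters "ACGT"), find the maximum, over all prefixes, of the prefix length multiplied by the number of strings that share that prefix.
This problem is a good introduction to the trie (also called a dictionary tree or prefix tree). (The following is taken from Baidu Baike.)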
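1. Definition: A trie, also known as a word lookup tree, is a tree-shaped data structure, described in the cited source as a variant of the hash tree. It is typically used to count, sort, and store large numbers of strings (though not only strings), so search engine systems often use it for word-frequency statistics over text. Its advantage is that it exploits the common prefixes of strings to reduce lookup time and to avoid unnecessary string comparisons as far as possible; according to the cited source, lookups are more efficient than with a hash tree.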
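2. Properties:
(1) The root node contains no character; every node other than the root contains exactly one character.
(2) Concatenating the characters along the path from the root to a node gives the string that corresponds to that node.
(3) All children of a given node contain different characters.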
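The three properties can be seen in a small sketch. It assumes the four-letter alphabet of this problem and C++14 (for `std::make_unique`); `Node`, `slotOf` and `insertString` are hypothetical names, not taken from Baidu Baike or from the solution in this post. The function returns how many nodes were newly created, which shows directly that strings with a common prefix share the nodes of that prefix.

```cpp
#include <array>
#include <memory>
#include <string>

// One trie node. The character it "contains" is implied by which slot of
// its parent points to it, so the root holds no character (property 1).
struct Node {
    // One slot per distinct character, so siblings never repeat (property 3).
    std::array<std::unique_ptr<Node>, 4> child;
};

// Maps a base to its child slot, or -1 for any other character.
int slotOf(char c) {
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return -1;
    }
}

// Inserts `s` below `root` and returns the number of nodes newly created.
// The path from the root spells out the inserted string (property 2).
int insertString(Node& root, const std::string& s) {
    Node* p = &root;
    int created = 0;
    for (char c : s) {
        int id = slotOf(c);
        if (id < 0) break;  // assumption: stop at an unexpected character
        if (!p->child[id]) {
            p->child[id] = std::make_unique<Node>();
            ++created;
        }
        p = p->child[id].get();
    }
    return created;
}
```

Inserting "ACGT" into an empty trie creates 4 nodes; inserting "ACGC" afterwards creates only 1, because the path A, C, G already exists.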
3. Applications:
(1) Fast string retrieval.
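As a rough illustration of fast retrieval, the sketch below stores strings over {A, C, G, T} and answers two queries: whether a whole string was stored, and whether any stored string starts with a given prefix. Each query follows at most one child link per character, so its cost depends on the length of the query rather than on how many strings are stored. `WordTrie` and its members are hypothetical names, and keeping the nodes in a `std::vector` addressed by index is just one possible layout.

```cpp
#include <string>
#include <vector>

class WordTrie {
    struct Node { int child[4]; bool isEnd; };
    std::vector<Node> nodes_;  // nodes_[0] is the root; -1 means "no child"

    static int slotOf(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default:  return -1;
        }
    }

    // Follows `s` from the root; returns the node index reached, or -1.
    int walk(const std::string& s) const {
        int p = 0;
        for (char c : s) {
            int id = slotOf(c);
            if (id < 0) return -1;
            p = nodes_[p].child[id];
            if (p < 0) return -1;
        }
        return p;
    }

public:
    WordTrie() : nodes_(1, Node{{-1, -1, -1, -1}, false}) {}

    // Stores `s`. Returns false and stores nothing if `s` contains a
    // character outside {A, C, G, T}.
    bool add(const std::string& s) {
        for (char c : s) if (slotOf(c) < 0) return false;
        int p = 0;
        for (char c : s) {
            int id = slotOf(c);
            int next = nodes_[p].child[id];
            if (next < 0) {
                next = static_cast<int>(nodes_.size());
                nodes_.push_back(Node{{-1, -1, -1, -1}, false});
                nodes_[p].child[id] = next;
            }
            p = next;
        }
        nodes_[p].isEnd = true;
        return true;
    }

    // True if exactly `s` was stored.
    bool contains(const std::string& s) const {
        int p = walk(s);
        return p >= 0 && nodes_[p].isEnd;
    }

    // True if at least one stored string starts with `s`.
    bool hasPrefix(const std::string& s) const { return walk(s) >= 0; }
};
```

After adding "ACGT" and "ACCG", `contains("ACGT")` is true, `contains("ACG")` is false because ACG was never stored as a whole string, and `hasPrefix("ACG")` is true.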
#include <algorithm>
#include <cstdio>
#include <iostream>
#include <string>
#define MAX 4   // alphabet size: A, C, G, T
using namespace std;
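struct Trie{
    Trie *next[MAX];   // children, indexed A=0, C=1, G=2, T=3
    int cnt;           // number of inserted strings that pass through this node
}*root;
int ans;               // best (prefix length * count) seen so far in this case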
/**** Allocate and initialize a new trie node ****/
Trie *newTrie(){
    Trie *temp = new Trie;
    for(int i=0; i<MAX; i++)
        temp->next[i]=NULL;
    temp->cnt=0;
    return temp;
}
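/**** Free the whole trie so memory does not pile up across test cases ****/
void freedom(Trie *p){
    // Start at 0: starting at 1 would never free the 'A' subtree.
    for(int i=0; i<MAX; i++){
        if(p->next[i]!=NULL)
            freedom(p->next[i]);
    }
    delete p;
}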
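/**** Insert a string into the trie and update the answer along the way ****/
void SetTrie(const string &s){
    Trie *p=root;
    int len=s.size();
    int id;
    for(int i=0; i<len; i++){
        switch(s[i]){
            case 'A': id=0; break;
            case 'C': id=1; break;
            case 'G': id=2; break;
            case 'T': id=3; break;
            default: return;   // input is guaranteed to be ACGT only; stop otherwise
        }
        if(p->next[id]==NULL){
            p->next[id]=newTrie();
        }
        p=p->next[id];
        p->cnt++;
        // p is at depth i+1, and p->cnt strings share this prefix so far.
        ans=max(ans, (i+1)*p->cnt);
    }
}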
int main(){
    int T;
    cin >> T;
    int Case=0;
    while(T--){
        Case++;
        root=newTrie();
        int n;
        cin >> n;
        string s;
        ans=0;
        while(n--){
            cin >> s;
            SetTrie(s);
        }
        printf("Case %d: %d\n", Case, ans);
        freedom(root);
    }
    return 0;
}
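For comparison, here is a sketch of the same counting idea with all nodes kept in one `std::vector` and children referenced by index, so nothing has to be freed by hand between test cases. This is an alternative layout and not the code above; `maxPrefixScore` is a hypothetical name, and the function takes the samples of one test case as a vector instead of reading them from standard input.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Returns the maximum of (prefix length) * (number of samples sharing it).
long long maxPrefixScore(const std::vector<std::string>& samples) {
    struct Node { int child[4]; int cnt; };
    // nodes[0] is the root; a child index of -1 means "no child yet".
    std::vector<Node> nodes(1, Node{{-1, -1, -1, -1}, 0});
    long long best = 0;

    for (const std::string& s : samples) {
        int p = 0;
        for (std::size_t i = 0; i < s.size(); ++i) {
            int id;
            switch (s[i]) {
                case 'A': id = 0; break;
                case 'C': id = 1; break;
                case 'G': id = 2; break;
                case 'T': id = 3; break;
                default:  id = -1; break;
            }
            if (id < 0) break;  // assumption: ignore the rest of a malformed sample

            int next = nodes[p].child[id];
            if (next < 0) {
                next = static_cast<int>(nodes.size());
                nodes.push_back(Node{{-1, -1, -1, -1}, 0});
                nodes[p].child[id] = next;  // re-index after push_back, no stale reference
            }
            p = next;
            ++nodes[p].cnt;
            // Node p sits at depth i + 1 and is shared by nodes[p].cnt samples so far.
            best = std::max(best, static_cast<long long>(i + 1) * nodes[p].cnt);
        }
    }
    return best;
}
```

On the three sample cases this returns 9, 66 and 20. The vector is destroyed when the function returns, which takes the place of the recursive `freedom` call.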