DNA Laboratory
Time Limit: 5000MS | Memory Limit: 30000K
Total Submissions: 2427 | Accepted: 433
Description
Background
Having started to build his own DNA lab just recently, the evil doctor Frankenstein is not quite up to date yet. He wants to extract his DNA, enhance it somewhat and clone himself. He has already figured out how to extract DNA from some of his blood cells, but unfortunately reading off the DNA sequence means breaking the DNA into a number of short pieces and analyzing those first. Frankenstein has not quite understood how to put the pieces together to recover the original sequence.
His pragmatic approach to the problem is to sneak into university and to kidnap a number of smart-looking students. Not surprisingly, you are one of them, so you had better come up with a solution pretty fast.
Problem
You are given a list of strings over the alphabet A (for adenine), C (cytosine), G (guanine), and T (thymine), and your task is to find the shortest string (which is typically not listed) that contains all given strings as substrings.
If there are several such strings of shortest length, find the smallest in alphabetical/lexicographical order.
Input
The first line contains the number of scenarios.
For each scenario, the first line contains the number n of strings with 1 <= n <= 15. Then these strings with 1 <= length <= 100 follow, one on each line, and they consist of the letters "A", "C", "G", and "T" only.
Output
The output for every scenario begins with a line containing "Scenario #i:", where i is the number of the scenario starting at 1. Then print a single line containing the shortest (and smallest) string as described above. Terminate the output for the scenario with a blank line.
Sample Input
1
2
TGCACA
CAT
Sample Output
Scenario #1:
TGCACAT
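Given n strings, find the shortest string that contains all n of them as substrings; among several shortest strings, take the lexicographically smallest.
n is small (at most 15), so consider bitmask DP (state-compression DP) over subsets of the strings.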
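The quantity a subset DP of this kind depends on is the pairwise overlap between two strings. The following is a minimal sketch of that computation, not the code used in the solution; the names `overlap` and `prependCost` are illustrative, and it assumes neither string contains the other (contained strings are removed beforehand).

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Length of the longest proper suffix of `a` that is also a prefix of `b`.
// Assumption: neither string contains the other, so the overlap is strictly
// shorter than both strings.
int overlap(const std::string& a, const std::string& b) {
    int limit = static_cast<int>(std::min(a.size(), b.size()));
    for (int k = limit - 1; k > 0; --k) {
        // Compare the last k characters of a with the first k characters of b.
        if (a.compare(a.size() - k, k, b, 0, k) == 0) return k;
    }
    return 0;
}

// Number of characters `a` contributes when placed directly in front of `b`.
int prependCost(const std::string& a, const std::string& b) {
    int k = overlap(a, b);
    assert(k < static_cast<int>(a.size()));  // holds for non-empty strings
    return static_cast<int>(a.size()) - k;
}
```

For the sample, `overlap("TGCACA", "CAT")` is 2 (the shared "CA"), so placing TGCACA in front of CAT costs 4 characters, and 4 + 3 = 7 matches the length of TGCACAT.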
#include<cstdio>
#include<iostream>
#include<cstring>
#include<algorithm>
#include<string>
using namespace std;
const int maxn = 18;
const int inf = 0x3f3f3f3f;
int T, n;
string str[maxn];
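int cost[maxn][maxn];    // cost[i][j]: characters added by placing string i directly in front of string j
int dp[maxn][1<<maxn];   // dp[i][s]: minimum length of a superstring that starts with string i and covers the set s
string res;              // the answer string
// Preprocessing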
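void init()
{
    // Drop every string contained in another one: overwrite it with its
    // container, then let sort + unique remove the resulting duplicates.
    for(int i = 0; i < n; i++)
        for(int j = 0; j < n; j++)
            if(i != j && str[i].find(str[j]) != string::npos)
                str[j] = str[i];
    /*
        string::find searches the string for the given substring.
        If found, it returns the position of the first character of the match;
        otherwise it returns string::npos.
    */
    sort(str, str+n);
    n = unique(str, str+n) - str;
    /*
        unique collapses each run of equal adjacent elements to a single element
        and returns a pointer just past the last element it keeps.
        It only detects adjacent duplicates, so sort first.
        On a container, combine it with erase() to actually delete the leftovers.
    */
    // Compute the cost of placing i in front of j
    for(int i = 0; i < n; i++)
        for(int j = 0; j < n; j++)
            if(i != j)
            {
                int len = min(str[i].length(), str[j].length());
                // k increases, so the last match (the longest overlap) wins
                for(int k = 0; k < len; k++)
                    if(str[i].substr(str[i].length()-k) == str[j].substr(0, k))
                        cost[i][j] = str[i].length() - k;
                /*
                    string::substr copies a substring.
                    Argument 1: start position, default 0.
                    Argument 2: length; if omitted or too large, copy up to the end of the source string.
                    Returns the substring.
                */
            }
}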
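void Rebuild(int id)
{
    res = str[id];
    int s = ((1<<n) - 1) & ~(1<<id);   // strings still to be appended
    while(s)
    {
        string a;
        int next = -1;
        // Among the successors that keep the optimal length, take the one
        // whose appended part is lexicographically smallest.
        for(int i = 0; i < n; i++)
            if((s>>i & 1) && dp[id][s|1<<id] == dp[i][s] + cost[id][i])
            {
                // part of str[i] that lies beyond its overlap with str[id]
                string b = str[i].substr(str[id].length() - cost[id][i], str[i].length());
                if(next == -1 || a > b)
                {
                    a = b;
                    next = i;
                }
            }
        res += a;
        s = s & ~(1<<next), id = next;
    }
}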
int main()
{
    cin >> T;
    for(int cas = 1; cas <= T; cas++)
    {
        // Read the scenario
        cin >> n;
        for(int i = 0; i < n; i++)
            cin >> str[i];
        if(n > 1)
        {
            // Preprocessing
            init();
            // Initialize the dp array
            memset(dp, inf, sizeof(dp));
            for(int i = 0; i < n; i++)
                dp[i][1<<i] = str[i].length();
            // Bitmask DP: put string i in front of a superstring that starts with j
            for(int s = 0; s < 1<<n; s++)
                for(int j = 0; j < n; j++) if(dp[j][s] != inf && s & (1<<j))
                    for(int i = 0; i < n; i++) if(!(s >> i & 1))
                        dp[i][s|1<<i] = min(dp[i][s|1<<i], dp[j][s] + cost[i][j]);
            // Find the starting string with the minimum total length
            int id = 0;
            for(int i = 1; i < n; i++)
                if(dp[id][(1<<n)-1] > dp[i][(1<<n)-1]) id = i;
            // Rebuild the answer string
            Rebuild(id);
        }
        else res = str[0];
        printf("Scenario #%d:\n", cas);
        cout << res << endl << endl;
    }
    return 0;
}
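For very small inputs, the DP's output can be cross-checked against a brute-force reference. The sketch below is an assumption-laden helper, not part of the solution above: the name `bruteForceSuperstring` is hypothetical, it tries every ordering of the pieces, and its running time grows factorially, so it is only practical for a handful of strings.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Brute-force reference: try every ordering of the pieces and merge
// neighbours with their maximal overlap. Only practical for small n.
std::string bruteForceSuperstring(std::vector<std::string> pieces) {
    assert(!pieces.empty());
    // Drop pieces contained in another piece; of identical pieces keep the first.
    std::vector<std::string> kept;
    for (size_t i = 0; i < pieces.size(); ++i) {
        bool redundant = false;
        for (size_t j = 0; j < pieces.size() && !redundant; ++j) {
            if (i == j) continue;
            bool contained = pieces[j].find(pieces[i]) != std::string::npos;
            bool sameString = pieces[i] == pieces[j];
            if (contained && (!sameString || j < i)) redundant = true;
        }
        if (!redundant) kept.push_back(pieces[i]);
    }
    std::sort(kept.begin(), kept.end());  // next_permutation needs a sorted start

    std::string best;
    do {
        std::string cur = kept[0];
        for (size_t i = 1; i < kept.size(); ++i) {
            const std::string& nxt = kept[i];
            size_t k = std::min(cur.size(), nxt.size());
            // Shrink k until the last k chars of cur equal the first k of nxt.
            while (k > 0 && cur.compare(cur.size() - k, k, nxt, 0, k) != 0) --k;
            cur += nxt.substr(k);
        }
        // Prefer shorter results; break ties lexicographically.
        if (best.empty() || cur.size() < best.size() ||
            (cur.size() == best.size() && cur < best)) {
            best = cur;
        }
    } while (std::next_permutation(kept.begin(), kept.end()));
    return best;
}
```

Calling `bruteForceSuperstring({"TGCACA", "CAT"})` on the sample pieces yields TGCACAT, which is the string the expected output lists for scenario 1.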