POJ 1795: DNA Laboratory (Bitmask DP)

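This post solves a DNA fragment assembly problem with bitmask (state-compression) dynamic programming: it finds the shortest string that contains every given DNA fragment as a substring. It covers the preprocessing step, the DP transition, and the reconstruction of the answer.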


DNA Laboratory
Time Limit: 5000MS Memory Limit: 30000K
Total Submissions: 2427 Accepted: 433

Description

Background 
Having started to build his own DNA lab just recently, the evil doctor Frankenstein is not quite up to date yet. He wants to extract his DNA, enhance it somewhat and clone himself. He has already figured out how to extract DNA from some of his blood cells, but unfortunately reading off the DNA sequence means breaking the DNA into a number of short pieces and analyzing those first. Frankenstein has not quite understood how to put the pieces together to recover the original sequence. 
His pragmatic approach to the problem is to sneak into university and to kidnap a number of smart looking students. Not surprisingly, you are one of them, so you would better come up with a solution pretty fast. 
Problem 
You are given a list of strings over the alphabet A (for adenine), C (cytosine), G (guanine), and T (thymine), and your task is to find the shortest string (which is typically not listed) that contains all given strings as substrings.
If there are several such strings of shortest length, find the smallest in alphabetical/lexicographical order.

Input

The first line contains the number of scenarios. 
For each scenario, the first line contains the number n of strings with 1 <= n <= 15. Then these strings with 1 <= length <= 100 follow, one on each line, and they consist of the letters "A", "C", "G", and "T" only.

Output

The output for every scenario begins with a line containing "Scenario #i:", where i is the number of the scenario starting at 1. Then print a single line containing the shortest (and smallest) string as described above. Terminate the output for the scenario with a blank line.

Sample Input

1
2
TGCACA
CAT

Sample Output

Scenario #1:
TGCACAT

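Given n strings, find the shortest string that contains all n of them as substrings; among the shortest candidates, take the lexicographically smallest.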
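For very small inputs the task can be cross-checked by brute force: try every ordering of the pieces and glue neighbours together with the longest possible overlap. The sketch below is only a reference checker under the assumption that factorial running time is acceptable (roughly n <= 8); `mergeInto` and `bruteForceSuperstring` are hypothetical names and are not part of the solution in this post.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Append t to s, reusing the longest suffix of s that is also a prefix of t.
// If s already contains t, s is returned unchanged.
std::string mergeInto(const std::string& s, const std::string& t) {
    if (s.find(t) != std::string::npos) return s;
    std::size_t maxK = std::min(s.size(), t.size());
    for (std::size_t k = maxK; k > 0; --k)
        if (s.compare(s.size() - k, k, t, 0, k) == 0)
            return s + t.substr(k);
    return s + t;  // no overlap at all
}

// Hypothetical reference checker: tries all orderings of the pieces and keeps
// the shortest result, breaking ties by lexicographical order.
std::string bruteForceSuperstring(std::vector<std::string> pieces) {
    std::sort(pieces.begin(), pieces.end());  // next_permutation needs a sorted start
    std::string best;
    bool first = true;
    do {
        std::string cur;
        for (const std::string& p : pieces) cur = mergeInto(cur, p);
        if (first || cur.size() < best.size() ||
            (cur.size() == best.size() && cur < best)) {
            best = cur;
            first = false;
        }
    } while (std::next_permutation(pieces.begin(), pieces.end()));
    return best;
}
```

Calling `bruteForceSuperstring({"TGCACA", "CAT"})` on the sample input gives `TGCACAT`, which matches the sample output.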


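Since n is small (at most 15), bitmask DP is a natural fit: bit i of the mask records whether string i is already covered.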
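The set bookkeeping that a bitmask DP over at most 15 strings relies on can be sketched with a few small helpers. The helper names are hypothetical and chosen for illustration only; each one mirrors a bit operation that a mask-based solution typically writes inline (for example `s >> i & 1` or `s | 1 << i`).

```cpp
#include <vector>

// Hypothetical helper names, for illustration only.

// Is item i in the set encoded by mask?
inline bool hasBit(int mask, int i) { return (mask >> i) & 1; }

// The set with item i added.
inline int withBit(int mask, int i) { return mask | (1 << i); }

// The set with item i removed.
inline int withoutBit(int mask, int i) { return mask & ~(1 << i); }

// The set containing all n items: bits 0 .. n-1 are set.
inline int fullMask(int n) { return (1 << n) - 1; }

// List the indices contained in a mask over n items, in increasing order.
std::vector<int> members(int mask, int n) {
    std::vector<int> out;
    for (int i = 0; i < n; ++i)
        if (hasBit(mask, i)) out.push_back(i);
    return out;
}
```

With 15 strings there are 2^15 = 32768 masks, so a table indexed by (first string, mask) has about half a million entries, which is small enough to fill completely.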


#include<cstdio>
#include<iostream>
#include<cstring>
#include<algorithm>
#include<string>

using namespace std;

const int maxn = 18;
const int inf  = 0x3f3f3f3f;

int T, n;
string str[maxn];
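int cost[maxn][maxn];	// cost[i][j]: characters contributed by str[i] when it is placed directly before str[j]
int dp[maxn][1<<maxn];  // dp[i][s]: minimum length of a string that starts with str[i] and covers the set s
string res;				// the answer string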

// Preprocessing
void init()
{
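	// Overwrite every string that occurs inside another one with its container; sort + unique below then drops the copies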
	for(int i= 0; i< n; i++)
		for(int j= 0; j< n; j++)
		if(i != j && str[i].find(str[j]) != string::npos)
			str[j] = str[i];
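	/*
	string tip:
	string::find searches the string for the given substring.
	If found, it returns the position of the first character of the match;
	otherwise it returns string::npos.
	*/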

	sort(str, str+n);

	n = unique(str, str+n) - str;
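	/*
	unique collapses each run of adjacent equal elements into one by shifting
	the kept elements forward, and returns a pointer just past the last kept element.
	It only detects adjacent duplicates, so sort first.
	Combine it with erase() on a container to actually delete the leftovers.
	*/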

	// cost[i][j]: length of str[i] minus its longest suffix that is also a prefix of str[j]
	for(int i= 0; i< n; i++)
		for(int j= 0; j<n; j++)
		if(i != j)
	{
		int len = min(str[i].length(), str[j].length());

		for(int k= 0; k< len; k++)
			if(str[i].substr(str[i].length()-k) == str[j].substr(0, k))
				cost[i][j] = str[i].length() - k;
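		/*
		string tip:
		string::substr copies a substring.
		Argument 1: index of the first character, default 0.
		Argument 2: length; if omitted or too large, the copy runs to the end of the source string.
		Returns the substring.
		*/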
	}
}

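// Rebuild the answer from the dp table, starting at str[id]; at each step, among the
// next strings consistent with dp, take the one whose appended part is lexicographically smallest
void Rebuild(int id)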
{
	res = str[id];
	int s = ((1<<n) - 1) & ~(1<<id);

	while(s)
	{
		string a;
		int next = -1;

		for(int i= 0; i< n; i++)
			if((s>>i & 1) && dp[id][s|1<<id] == dp[i][s] + cost[id][i])
		{
			string b = str[i].substr(str[id].length() - cost[id][i], str[i].length());

			if(next == -1 || a > b)
			{
				a = b;
				next = i;
			}
		}

		res += a;
		s = s & ~(1<<next), id = next;
	}
}

int main()
{
	cin >> T;

	for(int cas= 1; cas<= T; cas++)
	{
		// Read the input
		cin >> n;
		for(int i= 0; i< n; i++)
			cin >> str[i];

		if(n > 1)
		{
			// Preprocessing
			init();

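			// Initialize dp: memset copies the low byte of inf (0x3f) into every byte, so each int becomes 0x3f3f3f3f == inf
			memset(dp, inf, sizeof(dp));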
			for(int i= 0; i< n; i++)
				dp[i][1<<i] = str[i].length();

			// Bitmask DP: put str[i] in front of a string that starts with str[j] and covers set s
			for(int s= 0; s< 1<<n; s++)
				for(int j= 0; j< n; j++) if(dp[j][s] != inf && s & (1<<j))
				for(int i= 0; i< n; i++) if(!(s >> i & 1))
				dp[i][s|1<<i] = min(dp[i][s|1<<i], dp[j][s] + cost[i][j]);

			// Find the head with the minimum total length; str is sorted, so on ties the smallest head is kept
			int id = 0;
			for(int i= 1; i< n; i++)
				if(dp[id][(1<<n)-1] > dp[i][(1<<n)-1]) id = i;

			// Rebuild the answer string
			Rebuild(id);
		}
		else res = str[0];

		printf("Scenario #%d:\n", cas);
		cout << res << endl << endl;
	}

	return 0;
}
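The preprocessing comment above mentions pairing `unique` with `erase()` to really delete duplicates. A minimal sketch of that idiom on a `std::vector`, combined with the removal of contained strings, might look as follows; `dropContained` is a hypothetical name, and this variant filters into a new vector instead of overwriting entries in place as the solution does.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the pieces in sorted order, without exact
// duplicates and without any piece that occurs inside another piece.
std::vector<std::string> dropContained(std::vector<std::string> pieces) {
    // Erase-unique idiom: sort so that equal strings are adjacent, let unique
    // shift the distinct ones forward, then erase the leftover tail.
    std::sort(pieces.begin(), pieces.end());
    pieces.erase(std::unique(pieces.begin(), pieces.end()), pieces.end());

    // All strings are now distinct, so any containment is a proper one.
    std::vector<std::string> kept;
    for (std::size_t j = 0; j < pieces.size(); ++j) {
        bool contained = false;
        for (std::size_t i = 0; i < pieces.size() && !contained; ++i)
            if (i != j && pieces[i].find(pieces[j]) != std::string::npos)
                contained = true;
        if (!contained) kept.push_back(pieces[j]);
    }
    return kept;
}
```

For example, `dropContained({"CAT", "TGCACA", "CA", "CAT"})` keeps only `CAT` and `TGCACA`: the duplicate `CAT` is erased and `CA` occurs inside `CAT`.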

