USACO Longest Prefix

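This post presents an implementation of a longest-prefix matching algorithm, motivated by sequence analysis in biology. It decides which prefixes of a long sequence (the main sequence) can be built by concatenating short sequences (primitives) and reports the length of the longest such prefix.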

Longest Prefix
IOI'96
The structure of some biological objects is represented by the sequence of their constituents denoted by uppercase letters. Biologists are interested in decomposing a long sequence into shorter ones called primitives.

We say that a sequence S can be composed from a given set of primitives P if there is some sequence of (possibly repeated) primitives from the set whose concatenation equals S. Not necessarily all primitives need be present. For instance, the sequence ABABACABAAB can be composed from the set of primitives

{A, AB, BA, CA, BBC}

The first K characters of S are the prefix of S with length K. Write a program which accepts as input a set of primitives and a sequence of constituents and then computes the length of the longest prefix that can be composed from primitives.

PROGRAM NAME: prefix
INPUT FORMAT
First, the input file contains the list (length 1..200) of primitives (length 1..10) expressed as a series of space-separated strings of upper-case characters on one or more lines. The list of primitives is terminated by a line that contains nothing more than a period (`.'). No primitive appears twice in the list. Then, the input file contains a sequence S (length 1..200,000) expressed as one or more lines, none of which exceed 76 letters in length. The "newlines" are not part of the string S.
SAMPLE INPUT (file prefix.in)
A AB BA CA BBC
.
ABABACABAABC

OUTPUT FORMAT
A single line containing an integer that is the length of the longest prefix that can be composed from the set P.
SAMPLE OUTPUT (file prefix.out)
11
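
To see why the sample answer is 11: one possible decomposition of the first 11 characters is AB + AB + A + CA + BA + AB. The full 12-character string cannot be composed, because it ends in C, the only primitive ending in C is BBC, and the last three letters are ABC. The helper below is a minimal sketch that rebuilds the prefix from those pieces so the decomposition can be checked; `concat_pieces` is a name made up for this illustration and is not part of the original solution.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: concatenate the n pieces into out, whose capacity
 * cap includes the terminating '\0'.
 * Returns the resulting length, or -1 if the result would not fit. */
int concat_pieces(const char *const pieces[], int n, char *out, size_t cap)
{
	size_t len = 0;
	int i;

	if (cap == 0)
		return -1;
	out[0] = '\0';
	for (i = 0; i < n; i++) {
		size_t piece_len = strlen(pieces[i]);

		if (len + piece_len + 1 > cap)
			return -1;
		/* copy the piece together with its '\0' */
		memcpy(out + len, pieces[i], piece_len + 1);
		len += piece_len;
	}
	return (int)len;
}
```

Calling it with the six pieces AB, AB, A, CA, BA, AB should give a string of length 11 that equals the first 11 characters of the sample sequence.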

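Description
In biology, the structure of some organisms is represented by a sequence of uppercase letters denoting their constituents. Biologists are interested in decomposing long sequences into shorter ones, called primitives.

A sequence S can be decomposed into primitives of a set P if S equals a concatenation of primitives from P (repetition is allowed; concatenation here is what the "+" operator does on strings in Pascal). Not every primitive has to be used. For example, the sequence ABABACABAAB can be decomposed into primitives from the following set:

{A, AB, BA, CA, BBC}

The first K characters of S are called the prefix of S of length K. Write a program that reads a set of primitives and a sequence of uppercase letters, and computes the length of the longest prefix of the sequence that can be composed from the primitives.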

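Format
PROGRAM NAME: prefix

INPUT FORMAT

The input begins with a set of 1..200 primitives (each of length 1..10), given as space-separated strings of uppercase letters that may span more than one line. The set is terminated by a line containing only a period ("."). No primitive appears twice. The uppercase sequence S (length 1..200,000) follows, given on one or more lines of at most 76 characters each. Newlines are not part of S.

OUTPUT FORMAT

A single line containing one integer: the length of the longest prefix of S that can be decomposed into primitives from P.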
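
The input format above can be parsed with bounded reads. The following is only a sketch under stated assumptions: `read_input` and its parameters are hypothetical names chosen for this example, the token buffer of 80 characters relies on the 76-letter line limit, and malformed input is simply rejected with -1.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read the primitives (up to the "." line) and then S.
 * prims holds up to max_prims strings of at most 10 letters each;
 * s has capacity cap, including the terminating '\0'.
 * Returns 0 on success, -1 on malformed or oversized input. */
int read_input(FILE *in, char prims[][11], int max_prims, int *m,
               char *s, size_t cap, size_t *n)
{
	char tok[81];	/* one whitespace-delimited token, at most 80 chars */

	*m = 0;
	*n = 0;
	if (cap == 0)
		return -1;
	for (;;) {
		if (fscanf(in, "%80s", tok) != 1)
			return -1;	/* EOF before the terminating "." */
		if (strcmp(tok, ".") == 0)
			break;
		if (strlen(tok) > 10 || *m >= max_prims)
			return -1;
		strcpy(prims[(*m)++], tok);
	}
	s[0] = '\0';
	/* S may span several lines; newlines are skipped by %s */
	while (fscanf(in, "%80s", tok) == 1) {
		size_t len = strlen(tok);

		if (*n + len + 1 > cap)
			return -1;
		memcpy(s + *n, tok, len + 1);
		*n += len;
	}
	return 0;
}
```

Because `%80s` skips whitespace, including newlines, the pieces of S are appended back to back, which matches the rule that newlines are not part of S.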


============================ Divider ============================
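I wrote this a couple of days ago but forgot to post it.
This approach did not feel like DP to me (I still don't have a very clear picture of DP), although marking which prefix lengths are reachable is in effect a simple forward DP. pre stores all the primitives (the short strings) and str is the main string (the long one). The key variable is lenth (not a great name): writing the main string as a1 a2 a3 ... an, lenth[i] == 1 means that the first i characters a1 a2 ... ai can be composed entirely from the primitives.
That explanation is probably a bit foggy, so without further ado, here is the code: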

/*
LANG: C
ID: zqy11001
PROG: prefix
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
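#define getstr(s) scanf("%s", s)
char pre[401][11];	/* primitives: at most 200, each up to 10 letters plus '\0' */
int m;			/* number of primitives read */

char lenth[200001];	/* lenth[i] == 1: the first i characters of str can be composed */
char str[200001];	/* S: up to 200,000 letters plus the terminating '\0' */
int n;			/* length of S */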

int main(void)
{
	int i, j, k;
	int best = 0;
	freopen("prefix.in", "r", stdin);
	freopen("prefix.out", "w", stdout);
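	/* read primitives until the line holding only "." (also stop at EOF) */
	while(getstr(pre[m]) == 1){
		if(pre[m][0] == '.'){
			break;
		}
		m++;
	}
	/* S may span several lines; append each chunk after the previous one */
	while(getstr(str + n) == 1){
		n += strlen(str + n);
	}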
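	lenth[0] = 1;	/* the empty prefix is always composable */
	for(i = 0; i < n; i++){
		if(lenth[i]){
			best = i;	/* length i is composable; it is the longest so far */
			for(j = 0; j < m; j++){
				/* compare primitive j with str starting at position i */
				for(k = 0; ((i + k) < n) && (pre[j][k] != '\0') &&
					(pre[j][k] == str[i + k]); k++){
					;
				}
				/* whole primitive matched: length i + k is composable too */
				if(pre[j][k] == '\0'){
					lenth[i + k] = 1;
				}
			}
		}
	}
	/* the loop stops at n - 1, so the full string is checked separately */
	if (lenth[n])
		best = n;
	printf("%d\n", best);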
	return 0;
}
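
For experimenting without the input files, the same marking idea can be wrapped in a function. This is a sketch rather than the submitted solution: `longest_prefix` and `reach` are names invented here (`reach` plays the role of lenth), and the loop runs up to and including n so that no separate check of the full string is needed.

```c
#include <string.h>

/* Hypothetical helper: return the length of the longest prefix of s that
 * can be composed from the m primitives in prims.
 * reach must have room for strlen(s) + 1 entries; on return,
 * reach[i] == 1 means the first i characters of s can be composed. */
int longest_prefix(const char *const prims[], int m, const char *s, char *reach)
{
	int n = (int)strlen(s);
	int best = 0;
	int i, j;

	memset(reach, 0, (size_t)n + 1);
	reach[0] = 1;	/* the empty prefix is composable */
	for (i = 0; i <= n; i++) {
		if (!reach[i])
			continue;
		best = i;	/* i only grows, so the last hit is the longest */
		for (j = 0; j < m; j++) {
			int len = (int)strlen(prims[j]);

			/* primitive j fits at position i and matches there */
			if (i + len <= n &&
			    strncmp(s + i, prims[j], (size_t)len) == 0)
				reach[i + len] = 1;
		}
	}
	return best;
}
```

On the sample (primitives A, AB, BA, CA, BBC and S = ABABACABAABC) the composable lengths are 0, 1, 2, 3, 4, 5, 7, 9, 10 and 11, so the function should return 11. In the worst case the work is on the order of n × m × 10 character comparisons, the same as in the program above.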