POJ 2778 DNA Sequence

Article link: https://blog.youkuaiyun.com/JokerPoker/article/details/42233745

Description

It's well known that a DNA sequence is a sequence that contains only A, C, T and G, and it is very useful to analyze segments of a DNA sequence. For example, if an animal's DNA sequence contains the segment ATC, it may mean that the animal has a genetic disease. So far scientists have found several such segments; the problem is how many kinds of DNA sequences of a species don't contain those segments.

Suppose that the DNA sequences of a species are sequences consisting of A, C, T and G, and that the length of the sequences is a given integer n.

Input

The first line contains two integers m (0 <= m <= 10) and n (1 <= n <= 2000000000). Here, m is the number of genetic disease segments, and n is the length of the sequences.

Each of the next m lines contains a DNA genetic disease segment; the length of each segment is not larger than 10.

Output

An integer, the number of DNA sequences, mod 100000.

Sample Input

4 3

Sample Output

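Problem summary: given the DNA segments of m viruses, count how many DNA sequences of length n contain none of those segments. (A DNA sequence consists only of the four characters A, T, C and G.)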
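
For very small n the count can be checked directly by enumerating all 4^n strings. The sketch below does exactly that and is only meant as a reference to compare against; the segments AT, AC, AG, AA, the value n = 3, and the names pat, clean and walk are illustrative choices, not part of the solution further down.

```pascal
program brute_check;
{ Brute-force reference: enumerate every string of length n over A, T, C, G
  and count those that contain none of the forbidden segments.
  Only usable for tiny n; the segments and n below are illustrative. }
const
  letters: string = 'ATCG';
  m = 4;
  pat: array[1..m] of string = ('AT', 'AC', 'AG', 'AA');
  n = 3;
var
  cur: string;
  total: longint;

{ true when s contains none of the forbidden segments }
function clean(const s: string): boolean;
var k: longint;
begin
  clean := true;
  for k := 1 to m do
    if pos(pat[k], s) > 0 then
    begin
      clean := false;
      exit;
    end;
end;

{ extend cur one letter at a time until it has length n }
procedure walk(len: longint);
var c: longint;
begin
  if len = n then
  begin
    if clean(cur) then inc(total);
    exit;
  end;
  for c := 1 to 4 do
  begin
    cur := cur + letters[c];
    walk(len + 1);
    delete(cur, length(cur), 1);
  end;
end;

begin
  cur := ''; total := 0;
  walk(0);
  writeln(total mod 100000);   { prints 36 for the values above }
end.
```

With these segments an A can never be followed by another letter, so A may only appear in the last position: 3 * 3 * 4 = 36.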

//==============================================================

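With n as large as 2000000000, a naive DP over the length is out of the question. Fast matrix exponentiation is clearly the tool to use.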

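In this problem m and the length of each segment are small, so we can build a trie and use the AC automaton on top of it to construct the transition matrix. After that, a fast matrix power gives the answer.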
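
The fast-power step can be sketched on its own. The version below is an iterative binary exponentiation, which is a different formulation from the recursive quick procedure in the solution; the names SZ, MD, TMat, mul and power, and the 2x2 test matrix, are assumptions made for this illustration.

```pascal
program mat_pow_demo;
{ Iterative binary exponentiation of a square matrix modulo 100000.
  SZ, MD, TMat, mul and power are illustrative names. }
const
  SZ = 2;
  MD = 100000;
type
  TMat = array[0..SZ-1, 0..SZ-1] of int64;

{ matrix product a*b, every entry reduced modulo MD }
function mul(const a, b: TMat): TMat;
var i, j, k: longint;
    c: TMat;
begin
  for i := 0 to SZ-1 do
    for j := 0 to SZ-1 do
    begin
      c[i,j] := 0;
      for k := 0 to SZ-1 do
        c[i,j] := (c[i,j] + a[i,k] * b[k,j]) mod MD;
    end;
  mul := c;
end;

{ base^e using O(log e) multiplications }
function power(base: TMat; e: longint): TMat;
var res: TMat;
    i, j: longint;
begin
  { start from the identity matrix }
  for i := 0 to SZ-1 do
    for j := 0 to SZ-1 do
      if i = j then res[i,j] := 1 else res[i,j] := 0;
  while e > 0 do
  begin
    if odd(e) then res := mul(res, base);
    base := mul(base, base);
    e := e shr 1;
  end;
  power := res;
end;

var
  f, r: TMat;
begin
  { Fibonacci matrix as an easy-to-check input }
  f[0,0] := 1; f[0,1] := 1;
  f[1,0] := 1; f[1,1] := 0;
  r := power(f, 10);
  writeln(r[0,0], ' ', r[0,1], ' ', r[1,0], ' ', r[1,1]);   { 89 55 55 34 }
end.
```

For n up to 2000000000 the loop runs about 31 times, so with at most 101 states the cost is a few tens of millions of multiply-add steps.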

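The initial state is a 1*tot matrix (tot is the number of nodes in the trie, here and below; in the code the nodes are numbered 0..tot, with 0 being the root) whose entry (1,0) is 1. That entry corresponds to the root of the trie: when the sequence has length zero, its tail matches no position in the trie.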

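For sequence length i, row 0 of the matrix T^i (the transition matrix T raised to the i-th power) gives, for every trie node, the number of strings whose tail is matched to that node. The value at node 0 counts the strings whose tail matches no string in the trie.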
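
Written out as formulas, this reads as follows. The symbols v_i (the row vector of counts after i characters) and S (the set of unmarked nodes) are notation introduced here for illustration; T_{a,b} is the number of characters that move node a to node b.

```latex
v_0 = (1, 0, \dots, 0), \qquad
v_i = v_{i-1}\, T = v_0\, T^{i}, \qquad
\text{answer} = \sum_{j \in S} \bigl(v_0\, T^{n}\bigr)_j \bmod 100000
              = \sum_{j \in S} \bigl(T^{n}\bigr)_{0,j} \bmod 100000
```

This is the sum the main program computes at the end: it adds up tt[0,i] over all nodes i that are not marked.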

How do we construct the transition matrix?

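Consider node i and suppose the next character appended is A. If i has a marked child j on A (that is, the path from the root to j spells a virus segment), the transition to j is not allowed. Likewise, if walking up from i along the failure pointers reaches any node with a marked A child, the transition is not allowed either, because the new string would end with a virus segment. Otherwise, start from i itself, follow the failure pointers to the nearest node that has an A child, and point the transition at that child. If no such node exists, point it at node 0 (the root, meaning that after appending A the tail matches no string in the trie).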
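
The rule above can be tried out on a tiny automaton. The sketch below is a standalone illustration, not the author's code: the segments ATC and T, and the names add, build_fail, step and marked, are hypothetical. Letters are mapped as A=1, T=2, C=3, G=4, and node -1 is used as a sentinel above the root.

```pascal
program step_demo;
{ Transition rule on a tiny automaton built from the illustrative
  segments ATC and T. Letters map to A=1, T=2, C=3, G=4. }
var
  trie: array[-1..20, 1..4] of longint;
  fail: array[-1..20] of longint;
  marked: array[-1..20] of boolean;
  q: array[1..21] of longint;
  tot: longint;

{ insert one forbidden segment into the trie and mark its last node }
procedure add(const s: string);
var j, c, now: longint;
begin
  now := 0;
  for j := 1 to length(s) do
  begin
    c := pos(s[j], 'ATCG');
    if trie[now, c] = 0 then
    begin
      inc(tot); trie[now, c] := tot;
    end;
    now := trie[now, c];
  end;
  marked[now] := true;
end;

{ breadth-first computation of the failure pointers }
procedure build_fail;
var l, r, c, j, child: longint;
begin
  l := 1; r := 1; q[1] := 0;
  fail[0] := -1; fail[-1] := -1;
  while l <= r do
  begin
    for c := 1 to 4 do
    begin
      child := trie[q[l], c];
      if child = 0 then continue;
      j := fail[q[l]];
      while (j <> -1) and (trie[j, c] = 0) do j := fail[j];
      fail[child] := trie[j, c];   { trie[-1, c] = 0, i.e. the root }
      inc(r); q[r] := child;
    end;
    inc(l);
  end;
end;

{ Returns false when appending letter c at node i would complete a
  forbidden segment; otherwise stores the next node in target. }
function step(i, c: longint; var target: longint): boolean;
var j: longint;
begin
  step := false;
  j := i;
  while j <> -1 do
  begin
    if marked[trie[j, c]] then exit;   { some suffix ends a forbidden segment }
    j := fail[j];
  end;
  step := true;
  target := 0;                         { default: fall back to the root }
  j := i;
  while j <> -1 do
  begin
    if trie[j, c] <> 0 then
    begin
      target := trie[j, c];
      exit;
    end;
    j := fail[j];
  end;
end;

var nxt: longint;
begin
  add('ATC'); add('T');
  build_fail;
  if step(0, 1, nxt) then writeln('root + A -> node ', nxt);
  if step(0, 3, nxt) then writeln('root + C -> node ', nxt);
  if not step(1, 2, nxt) then writeln('node 1 + T is blocked');
end.
```

Here node 1 is the prefix A. Appending T to it would reach the unmarked node AT, but the failure chain passes through the root, whose T child is the marked segment T, so the move is rejected.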

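The program introduces a node -1 (the failure pointer of the root points to it, and all of its children are 0, so walking off the root falls back to the root), which makes building the transitions a little simpler.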

AC CODE

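program pku_2778;
var trie:array[-1..100,1..4] of longint;    // trie children (A=1,T=2,C=3,G=4); row -1 is the sentinel
    t,tt,tmp:array[0..100,0..100] of int64; // t: transition matrix, tt: its power, tmp: scratch
    fail:array[-1..100] of longint;         // failure pointers
    q:array[1..101] of longint;             // BFS queue: up to 100 nodes plus the root
    p:array[-1..100] of boolean;            // p[x]: node x ends a forbidden segment
    tot,n,i,j:longint;                      // tot: index of the last trie node (the root is 0)
    ans:int64;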

//============================================================================

procedure init;
var i,j,now,m,c:longint;
    s:string;
begin
  readln(m,n);
  fillchar(p,sizeof(p),0);
  for i:=1 to m do
  begin
    readln(s); now:=0;
    for j:=1 to length(s) do
    begin
      if p[now] then break;   // a shorter forbidden segment is already a prefix
      case s[j] of
        'A': c:=1;
        'T': c:=2;
        'C': c:=3;
        'G': c:=4;
      else break;             // stop at anything that is not A, T, C or G
      end;
      if trie[now,c]=0 then
      begin
        inc(tot); trie[now,c]:=tot;
      end;
      now:=trie[now,c];
    end;
    p[now]:=true;             // mark the end of the segment
  end;
end;

//============================================================================

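procedure get_fail;
var l,r,i,j:longint;
    flag:boolean;
begin
  l:=1; r:=1; q[1]:=0; fail[0]:=-1; fail[-1]:=-1;
  while l<=r do
  begin
    for i:=1 to 4 do
    begin
      // BFS step: enqueue child i and compute its failure pointer
      if trie[q[l],i]<>0 then
      begin
        j:=fail[q[l]];
        inc(r); q[r]:=trie[q[l],i];
        while j<>-1 do
        begin
          if trie[j,i]<>0 then break;
          j:=fail[j];
        end;
        fail[q[r]]:=trie[j,i];    // trie[-1,i]=0, so this falls back to the root
      end;
      if p[q[l]] then continue;   // no transitions out of a forbidden node
      // blocked if any node on the failure chain has a marked child i
      flag:=true; j:=q[l];
      while j<>-1 do
      begin
        if p[trie[j,i]] then
        begin
          flag:=false;
          break;
        end;
        j:=fail[j];
      end;
      if not(flag) then continue;
      // otherwise go to child i of the nearest node on the chain that has one
      j:=q[l]; flag:=true;
      while j<>-1 do
      begin
        if trie[j,i]<>0 then
        begin
          inc(t[q[l],trie[j,i]]);
          flag:=false; break;
        end;
        j:=fail[j];
      end;
      if flag then inc(t[q[l],0]);   // no such node: fall back to the root
    end;
    inc(l);
  end;
end;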

//============================================================================

procedure quick(x:longint);
var i,j,k:longint;
begin
  if x=1 then
  begin
    tt:=t;
    exit;
  end;
  quick(x shr 1);               // tt = t^(x div 2)
  fillchar(tmp,sizeof(tmp),0);
  for k:=0 to tot do
    for i:=0 to tot do
      for j:=0 to tot do
        // parenthesized so that the running sum is reduced as well
        tmp[i,j]:=(tmp[i,j]+tt[i,k]*tt[k,j]) mod 100000;
  tt:=tmp;                      // tt = t^(2*(x div 2))
  if x mod 2=1 then
  begin
    fillchar(tmp,sizeof(tmp),0);
    for k:=0 to tot do
      for i:=0 to tot do
        for j:=0 to tot do
          tmp[i,j]:=(tmp[i,j]+tt[i,k]*t[k,j]) mod 100000;
    tt:=tmp;                    // one more factor of t when x is odd
  end;
end;

//============================================================================

begin
  init;
  get_fail;
  quick(n);
  // sum row 0 of t^n over all nodes that do not end a forbidden segment
  for i:=0 to tot do
    if not(p[i]) then
      ans:=(ans+tt[0,i]) mod 100000;
  writeln(ans);
end.
