A suffix tree is a compressed trie (dictionary tree) that contains all the suffixes of a string.
1. Definition of a suffix tree
- For a string S of length n, the tree has exactly n leaves, numbered from 1 to n. Except for the root, every internal node has at least two children.
- Each edge is labeled with a non-empty substring of S.
- No two edges starting out of the same node can have labels beginning with the same character.
- The string obtained by concatenating all the edge labels on the path from the root to leaf i spells out the suffix S[i..n], for i from 1 to n.
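Given a string S = S1S2..Si..Sn of length n and an integer i with 1 <= i <= n, every substring SiSi+1…Sn is a suffix of S. For example, the suffixes of the string "BANANA", each followed by its starting position i, are: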
BANANA--1
ANANA--2
NANA--3
ANA--4
NA--5
A--6
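The definition and the BANANA example above can be tried out with a small sketch. This is a naive O(n^2) construction that inserts each suffix into a compressed trie, not a linear-time algorithm such as Ukkonen's. The names `Node`, `build_suffix_tree` and `leaves` are hypothetical, and the code assumes a unique terminator character ("$" here) has been appended, so that no suffix is a prefix of another and every suffix ends at its own leaf.

```python
class Node:
    def __init__(self):
        self.children = {}      # first character of edge label -> (label, child)
        self.leaf_index = None  # 1-based start position of the suffix, leaves only


def build_suffix_tree(s):
    """Naive construction: insert every suffix of s into a compressed trie."""
    # Assumption: the last character of s is a terminator that occurs only once.
    assert s and s.count(s[-1]) == 1, "append a unique terminator such as '$'"
    root = Node()
    for i in range(len(s)):
        node, rest = root, s[i:]
        while True:
            first = rest[0]
            if first not in node.children:
                # No edge starts with this character: hang a new leaf here.
                leaf = Node()
                leaf.leaf_index = i + 1
                node.children[first] = (rest, leaf)
                break
            label, child = node.children[first]
            # Length of the common prefix of the edge label and the remaining suffix.
            k = 0
            while k < len(label) and k < len(rest) and label[k] == rest[k]:
                k += 1
            if k < len(label):
                # Mismatch inside the edge: split it with a new internal node.
                mid = Node()
                mid.children[label[k]] = (label[k:], child)
                node.children[first] = (label[:k], mid)
                child = mid
            node, rest = child, rest[k:]
    return root


def leaves(node, prefix=""):
    """Yield (leaf number, root-to-leaf path label) for every leaf below node."""
    if not node.children:
        yield node.leaf_index, prefix
    for label, child in node.children.values():
        yield from leaves(child, prefix + label)


tree = build_suffix_tree("BANANA$")
for index, suffix in sorted(leaves(tree)):
    print(suffix, index)
```

With the terminator included, the string has length 7, so the tree has 7 leaves: the six suffixes listed above, each followed by "$", plus the suffix "$" itself.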
2. Suffix links
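In a complete suffix tree, every internal non-root node has a suffix link to another internal node (possibly the root), shown as dashed lines in the figure. If the path from the root to a node spells the string xα, where x is a single character and α is a (possibly empty) string, its suffix link points to the internal node whose path spells α.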
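To make the suffix-link property concrete, here is a brute-force sketch that does not build the tree at all. It relies on the fact that the internal nodes of a suffix tree correspond to the substrings that are followed by at least two different characters somewhere in the string, and it identifies each node by its path label. The function names are hypothetical, and the "$" terminator is an assumption carried over from the usual construction.

```python
from collections import defaultdict


def internal_node_labels(s):
    """Path labels of the internal nodes of the suffix tree of s (root is "")."""
    followers = defaultdict(set)
    for i in range(len(s)):
        for j in range(i, len(s)):
            # The substring s[i:j] is followed by the character s[j].
            followers[s[i:j]].add(s[j])
    # A substring is a branching (internal) node if two or more
    # different characters can follow it.
    return {sub for sub, nxt in followers.items() if len(nxt) >= 2}


def suffix_links(s):
    """Map each internal non-root node (by path label) to its suffix-link target."""
    nodes = internal_node_labels(s)
    links = {label: label[1:] for label in nodes if label}
    # Every target is itself an internal node, so the link is always defined.
    assert all(target in nodes for target in links.values())
    return links


print(suffix_links("BANANA$"))
# Contains the links 'ANA' -> 'NA', 'NA' -> 'A' and 'A' -> '' (the root)
```

This runs in time far worse than linear and is only meant to check the property on small inputs; real constructions create suffix links as a by-product of building the tree.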
3. Applications of suffix trees
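Suffix trees can solve many complex string problems efficiently. For example, once the tree for a text has been built, checking whether a pattern of length m occurs in the text takes time proportional to m, independent of the length of the text.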
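As an illustration of pattern matching, the sketch below uses an uncompressed suffix trie (nested dictionaries) as a simpler stand-in for a real suffix tree. The lookup logic is the same, but the trie can take O(n^2) space, so it is only suitable for short strings. The function names are hypothetical, and the text is assumed to end with a unique terminator ("$") so that every suffix ends at a leaf.

```python
def build_suffix_trie(s):
    """Insert every suffix of s into a trie of nested dicts (one char per edge)."""
    root = {}
    for i in range(len(s)):
        node = root
        for ch in s[i:]:
            node = node.setdefault(ch, {})
    return root


def contains(trie, pattern):
    """Return True if pattern is a substring of the indexed text."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


def count_occurrences(trie, pattern):
    """Count occurrences of pattern: one per leaf below the node it leads to."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return 0
        node = node[ch]
    stack, leaf_count = [node], 0
    while stack:
        current = stack.pop()
        if not current:          # an empty dict is a leaf
            leaf_count += 1
        stack.extend(current.values())
    return leaf_count


trie = build_suffix_trie("BANANA$")
print(contains(trie, "NAN"))           # True
print(contains(trie, "NAB"))           # False
print(count_occurrences(trie, "ANA"))  # 2
```

Walking down from the root costs one step per pattern character, which is where the bound proportional to the pattern length comes from. Counting occurrences additionally visits the subtree below the matched node.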