A string is a finite sequence of characters over a non-empty finite set Σ.
In this problem, Σ is the set of lowercase letters.
A substring, also called a factor, is a consecutive sequence of characters that occurs at least once in a string.
Your task is simple: given two strings, find the length of their longest common substring.
Here, a common substring means a substring shared by two or more strings.
The input contains exactly two lines. Each line consists of no more than 250000 lowercase letters and represents a string.
Output the length of the longest common substring. If no such string exists, print "0" instead.
Input:
alsdfkjfjkdsal
fdjskalajfkdsla
Output:
3
Problem summary:
Find the longest common substring of two strings and output its length.
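As a baseline for checking small cases, the task can be sketched as a quadratic dynamic program. This is not the suffix-automaton approach this post describes, and at 250000 characters per string it would be far too slow; it is only a reference to compare answers against. The name `lcsBrute` is made up for this illustration.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Quadratic reference solution (an assumption for testing, not the post's method).
// cur[j] holds the length of the longest common suffix of a[0..i) and b[0..j);
// the answer is the largest such value over all i and j.
int lcsBrute(const std::string& a, const std::string& b) {
    std::vector<int> prev(b.size() + 1, 0), cur(b.size() + 1, 0);
    int best = 0;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            // Extend the diagonal match if the characters agree, else reset.
            cur[j] = (a[i - 1] == b[j - 1]) ? prev[j - 1] + 1 : 0;
            best = std::max(best, cur[j]);
        }
        std::swap(prev, cur);  // cur[0] stays 0 because both rows start at 0
    }
    return best;
}
```

On the sample, `lcsBrute("alsdfkjfjkdsal", "fdjskalajfkdsla")` returns 3, matching the expected output (the shared substring is "kds").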
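Analysis:
We introduce the following notation:
max: the stp variable in the code (often named step); it is the length of the longest string the state accepts.
min: the length of the shortest string the state accepts. It equals the stp of the node that the state's par pointer refers to, plus 1.
max-min+1: the number of distinct strings the state accepts.
If you are curious why this property holds, see Fan Haoqiang's blog post: 后缀自动机与线性构造后缀树 (Suffix Automata and Linear-Time Construction of Suffix Trees).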
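One way to see the max-min+1 count in action: every distinct non-empty substring belongs to exactly one non-root state, so summing max-min+1 over those states should give the number of distinct substrings. The sketch below assumes a vector-based automaton with state 0 as the root (the post's code uses state 1), and the names `SuffixAutomaton`, `link`, `len` and `countDistinctSubstrings` are illustrative, not taken from the post.

```cpp
#include <array>
#include <string>
#include <vector>

// Illustrative suffix automaton over lowercase letters; state 0 is the root.
struct SuffixAutomaton {
    struct State {
        std::array<int, 26> next;  // -1 means "no transition"
        int link;                  // parent pointer (par in the post), -1 for the root
        int len;                   // max (stp in the post)
    };
    std::vector<State> st;
    int last;

    explicit SuffixAutomaton(const std::string& s) : last(0) {
        st.push_back(makeState(0, -1));
        for (char c : s) extend(c - 'a');
    }
    static State makeState(int len, int link) {
        State t;
        t.next.fill(-1);
        t.link = link;
        t.len = len;
        return t;
    }
    void extend(int c) {
        int cur = (int)st.size();
        st.push_back(makeState(st[last].len + 1, -1));
        int p = last;
        while (p != -1 && st[p].next[c] == -1) { st[p].next[c] = cur; p = st[p].link; }
        if (p == -1) {
            st[cur].link = 0;
        } else {
            int q = st[p].next[c];
            if (st[q].len == st[p].len + 1) {
                st[cur].link = q;
            } else {
                int clone = (int)st.size();
                State copy = st[q];          // copy first: push_back may reallocate
                copy.len = st[p].len + 1;
                st.push_back(copy);
                while (p != -1 && st[p].next[c] == q) { st[p].next[c] = clone; p = st[p].link; }
                st[q].link = st[cur].link = clone;
            }
        }
        last = cur;
    }
};

// Sums max - min + 1 over all non-root states, with min = len(parent) + 1.
long long countDistinctSubstrings(const std::string& s) {
    SuffixAutomaton sam(s);
    long long total = 0;
    for (std::size_t v = 1; v < sam.st.size(); ++v) {
        int maxLen = sam.st[v].len;
        int minLen = sam.st[sam.st[v].link].len + 1;
        total += maxLen - minLen + 1;
    }
    return total;
}
```

For "abab" the distinct substrings are a, b, ab, ba, aba, bab and abab, so `countDistinctSubstrings("abab")` returns 7.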
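Thinking about this problem from the suffix-tree point of view makes it very simple.
We first match greedily. If a mismatch occurs at some node x, then none of the strings with lengths min_x ~ max_x can be extended by the next character.
So how do we fall back?
The relationship between this node and its parent is: the longest string the parent accepts is a common prefix of all the strings this node accepts. (That is the suffix-tree view, i.e. on the reversed string; in the automaton itself it is the longest suffix of those strings that lies outside this state.)
If you are still curious, put it this way: on the suffix tree, parent is this node's parent node, so to reach this node you must pass through parent.
So we only need to jump upward along the par edges.
The rest is quite similar to KMP, so I won't go into detail.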
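The fallback walk described above can be traced position by position. The sketch below assumes a vector-based automaton with state 0 as the root and uses made-up names (`buildAutomaton`, `matchLengths`, `longestCommonSubstring`); for each position of the second string it records the length of the longest suffix ending there that also occurs in the first string, and the answer is the maximum of those lengths.

```cpp
#include <algorithm>
#include <array>
#include <string>
#include <vector>

struct State {
    std::array<int, 26> next;  // -1 means "no transition"
    int link;                  // parent pointer (par in the post); -1 for the root
    int len;                   // longest accepted length (stp in the post)
};

// Builds a suffix automaton of text (lowercase letters assumed); state 0 is the root.
std::vector<State> buildAutomaton(const std::string& text) {
    std::vector<State> st;
    auto add = [&st](int len) {
        State s;
        s.next.fill(-1);
        s.link = -1;
        s.len = len;
        st.push_back(s);
        return (int)st.size() - 1;
    };
    int last = add(0);
    for (char chr : text) {
        int c = chr - 'a';
        int cur = add(st[last].len + 1);
        int p = last;
        while (p != -1 && st[p].next[c] == -1) { st[p].next[c] = cur; p = st[p].link; }
        if (p == -1) {
            st[cur].link = 0;
        } else {
            int q = st[p].next[c];
            if (st[q].len == st[p].len + 1) {
                st[cur].link = q;
            } else {
                int clone = add(st[p].len + 1);
                st[clone].next = st[q].next;
                st[clone].link = st[q].link;
                while (p != -1 && st[p].next[c] == q) { st[p].next[c] = clone; p = st[p].link; }
                st[q].link = st[cur].link = clone;
            }
        }
        last = cur;
    }
    return st;
}

// out[i] = length of the longest suffix of pattern[0..i] that occurs in text.
std::vector<int> matchLengths(const std::string& text, const std::string& pattern) {
    std::vector<State> st = buildAutomaton(text);
    std::vector<int> out;
    int x = 0, len = 0;
    for (char chr : pattern) {
        int c = chr - 'a';
        // Mismatch: jump to the parent and shrink the match to the parent's max.
        while (x != 0 && st[x].next[c] == -1) {
            x = st[x].link;
            len = st[x].len;
        }
        // Match: follow the transition; otherwise we are at the root with len == 0.
        if (st[x].next[c] != -1) { x = st[x].next[c]; ++len; }
        out.push_back(len);
    }
    return out;
}

int longestCommonSubstring(const std::string& text, const std::string& pattern) {
    std::vector<int> lens = matchLengths(text, pattern);
    return lens.empty() ? 0 : *std::max_element(lens.begin(), lens.end());
}
```

For example, `matchLengths("abcab", "cabx")` yields 1, 2, 3, 0: the match grows while "c", "ca", "cab" are found, then drops to 0 because "x" never occurs. As in KMP, the current state only ever moves forward by one step or back along parent pointers, which keeps the walk linear in the length of the second string.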
The code is not long either:

#include <cstdlib>
#include <cstdio>
#include <algorithm>
#include <cstring>
using namespace std;

// maxn bounds the input length. A suffix automaton over n characters has at
// most 2n - 1 states, so the state arrays are sized by maxs, not maxn.
const int maxn = (int)3e5, maxs = 2 * 250000 + 5, sigma = 26;
char s1[maxn], s2[maxn];

struct Sam {
    int ch[maxs][sigma], par[maxs];  // transitions and parent pointers
    int stp[maxs];                   // max: longest string length of a state
    int sz, last;                    // state 1 is the root; 0 means "none"

    void init() {
        memset(ch, 0, sizeof(ch));
        memset(par, 0, sizeof(par));
        memset(stp, 0, sizeof(stp));
        sz = last = 1;
    }

    // Append character c (0..25) to the automaton.
    void ext(int c) {
        stp[++sz] = stp[last] + 1;
        int p = last, np = sz;
        while (p && !ch[p][c]) ch[p][c] = np, p = par[p];
        if (p == 0) par[np] = 1;
        else {
            int q = ch[p][c];
            if (stp[q] != stp[p] + 1) {
                // Split q: the clone nq takes the strings of length <= stp[p] + 1.
                stp[++sz] = stp[p] + 1;
                int nq = sz;
                memcpy(ch[nq], ch[q], sizeof(ch[q]));
                par[nq] = par[q];
                par[q] = par[np] = nq;
                while (p && ch[p][c] == q) ch[p][c] = nq, p = par[p];
            }
            else par[np] = q;
        }
        last = np;
    }

    void build(char *pt) {
        int i;
        init();
        for (i = 0; pt[i]; ++i) ext(pt[i] - 'a');
    }

    // Walk pt through the automaton; ret is the current matched length.
    int vis(char *pt) {
        int x = 1, i, ret = 0, ans = 0;
        for (i = 0; pt[i]; ++i) {
            if (ch[x][pt[i] - 'a'])
                x = ch[x][pt[i] - 'a'], ans = max(ans, ++ret);
            else  // mismatch: jump to the parent and retry the same character
                x = par[x], ret = stp[x], i -= (x != 0);
            if (x == 0) x = 1;  // fell off the root: restart and skip pt[i]
        }
        return ans;
    }
} sam;

int main()
{
    // File redirection as in the original; drop these two lines to read
    // from standard input and write to standard output instead.
    freopen("substr.in", "r", stdin);
    freopen("substr.out", "w", stdout);
    scanf("%s\n", s1);
    scanf("%s\n", s2);
    sam.build(s1);
    printf("%d\n", sam.vis(s2));
    return 0;
}
Original article: http://www.cnblogs.com/Mr-ren/p/4209556.html