Little jay really hates to deal with string. But moondy likes it very much, and she's so mischievous that she often gives jay some dull problems related to string. And one day, moondy gave jay another problem, poor jay finally broke out and cried, "Who can help me? I'll bg him!"
So what is the problem this time?
First, moondy gave jay a very long string A. Then she gave him a sequence of very short substrings, and asked him to find how many times each substring appeared in string A. What's more, she would denote whether or not found appearances of this substring are allowed to overlap.
At first, jay just read string A from beginning to end to search all appearances of each given substring. But he soon felt exhausted and couldn't go on any more, so he gave up and broke out this time.
I know you're a good guy and will help jay even without bg, won't you?
Input
Input consists of multiple cases( <= 20 ) and terminates with end of file.
For each case, the first line contains string A ( length <= 10^5 ). The second line contains an integer N ( N <= 10^5 ), which denotes the number of queries. The next N lines, each with an integer type and a string a ( length <= 6 ), type = 0 denotes substring a is allowed to overlap and type = 1 denotes not. Note that all input characters are lowercase.
There is a blank line between two consecutive cases.
Output
For each case, output the case number first ( based on 1 , see Samples ).
Then for each query, output an integer in a single line denoting the maximum times you can find the substring under certain rules.
Output an empty line after each case.
Sample Input
ab
2
0 ab
1 ab
abababac
2
0 aba
1 aba
abcdefghijklmnopqrstuvwxyz
3
0 abc
1 def
1 jmn
Sample Output
Case 1
1
1
Case 2
3
2
Case 3
1
1
0
Hint
In Case 2, you can find the first substring starting at positions (indexed from 0) 0, 2, 4, since they're allowed to overlap. The second substring starts at positions 0 and 4, since they're not allowed to overlap.
For C++ users, kindly use scanf to avoid TLE for huge inputs.
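Problem summary: there are multiple test cases. Each case gives a text string followed by n pattern strings; count how many times each pattern occurs in the text. Type 0 means occurrences of the pattern may overlap, type 1 means they may not.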
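To make the two counting rules concrete, here is a minimal brute-force sketch. The helper `countOccurrences` is hypothetical (it is not part of the original post), and it is far too slow for the real limits; it is only meant as a reference to cross-check a faster solution against the samples. For type 1 it assumes that greedily taking the leftmost match and then skipping past it gives the maximum count.

```cpp
#include <cassert>
#include <string>

// Hypothetical reference helper, not from the original post.
// Counts occurrences of pat in text. When allowOverlap is false, it takes
// the leftmost match and resumes the search right after that match.
int countOccurrences(const std::string& text, const std::string& pat, bool allowOverlap) {
    assert(!pat.empty());
    int count = 0;
    std::size_t from = 0;
    while (true) {
        std::size_t at = text.find(pat, from);
        if (at == std::string::npos) break;
        ++count;
        // Overlap allowed: retry one character later; otherwise skip the whole match.
        from = allowOverlap ? at + 1 : at + pat.size();
    }
    return count;
}
```

On the second sample, `countOccurrences("abababac", "aba", true)` finds the matches at 0, 2 and 4, while the non-overlapping call keeps only 0 and 4.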
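Approach: type 0 is just an ordinary Aho-Corasick automaton; type 1 is a bit more trouble... Why on earth did I use pointers...
As before, the ans array holds the counts (so that duplicate patterns are handled), but this time it has to be two-dimensional, because the same pattern may be queried under both rules. In the non-overlapping case, a pattern may be counted again only if (end position of the current match) - (end position of its last counted match) ≥ (depth in the trie of the node where the match ends), that is, the pattern's length.
This is simply an overlap test. For example, searching for aba in ababa (0-indexed): the second match ends at position 4, the previous one ended at 2, and the depth is 3; since 4-2<3, the two overlap.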
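The overlap test can be isolated from the automaton. The sketch below is illustrative only: `countNonOverlapping` is a hypothetical name, and it assumes the end positions of all occurrences of a single pattern are already known in ascending order, which is the order in which a left-to-right scan of the text reports them.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not from the original post.
// ends:  0-based end indices of every occurrence of one pattern, ascending.
// depth: length of the pattern, i.e. the trie depth of its terminal node.
int countNonOverlapping(const std::vector<int>& ends, int depth) {
    assert(depth > 0);
    int last = -1;   // nothing counted yet; same sentinel as memset(last, -1, ...)
    int count = 0;
    for (int q : ends) {
        // The match occupies [q - depth + 1, q]; it is disjoint from the
        // last counted match exactly when q - last >= depth.
        if (q - last >= depth) {
            ++count;
            last = q;
        }
    }
    return count;
}
```

For aba in ababa the end positions are 2 and 4, so only the first match is counted. For aba in abababac the end positions are 2, 4 and 6: the match ending at 4 is rejected and the one ending at 6 is accepted.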
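Each node is given a number, id, to tell nodes apart. The pos array then records the id of the node where each pattern ends, which identifies the pattern when reading the answer out of ans at the end, since each pattern can now be asked for in either form, 0 or 1. The loc array records the depth of each trie node, and the last array records the end position of the last counted match; together with loc it gives the overlap test.
(Sigh. Arrays would have been much more convenient.)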
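For comparison, here is a rough sketch of what an array-based variant might look like. This is not the author's code: `ArrayAC` and `solveCase` are hypothetical names, nodes are plain indices into vectors, and the BFS also fills in missing transitions so the scan needs no inner fail-link loop, which differs from the pointer version. The counting rule itself is the one described above.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical array-based automaton; all names here are illustrative.
struct ArrayAC {
    std::vector<std::vector<int>> next;  // next[v][c]: node reached from v on letter c
    std::vector<int> fail, depth;        // failure link and trie depth per node

    ArrayAC() { newNode(0); }            // node 0 is the root

    int newNode(int d) {
        next.emplace_back(26, -1);
        fail.push_back(0);
        depth.push_back(d);
        return static_cast<int>(next.size()) - 1;
    }

    // Inserts a pattern (call before build) and returns its terminal node.
    int insert(const std::string& p) {
        int v = 0;
        for (char ch : p) {
            int c = ch - 'a';
            if (next[v][c] == -1) {
                int u = newNode(depth[v] + 1);  // allocate first: next may reallocate
                next[v][c] = u;
            }
            v = next[v][c];
        }
        return v;
    }

    // BFS over the trie: sets failure links and fills missing transitions.
    void build() {
        std::queue<int> q;
        for (int c = 0; c < 26; ++c) {
            int u = next[0][c];
            if (u == -1) next[0][c] = 0;
            else q.push(u);              // depth-1 nodes keep fail = 0
        }
        while (!q.empty()) {
            int v = q.front();
            q.pop();
            for (int c = 0; c < 26; ++c) {
                int u = next[v][c];
                if (u == -1) {
                    next[v][c] = next[fail[v]][c];
                } else {
                    fail[u] = next[fail[v]][c];
                    q.push(u);
                }
            }
        }
    }
};

// queries: (type, pattern) pairs; type 0 = overlap allowed, type 1 = not allowed.
std::vector<int> solveCase(const std::string& text,
                           const std::vector<std::pair<int, std::string>>& queries) {
    ArrayAC ac;
    std::vector<int> pos;                // terminal node of each query's pattern
    for (const auto& qr : queries) {
        assert(qr.first == 0 || qr.first == 1);
        pos.push_back(ac.insert(qr.second));
    }
    ac.build();

    std::size_t nodes = ac.next.size();
    std::vector<int> cnt[2] = {std::vector<int>(nodes, 0), std::vector<int>(nodes, 0)};
    std::vector<int> last(nodes, -1);    // end index of the last counted type-1 match
    int v = 0;
    for (int q = 0; q < static_cast<int>(text.size()); ++q) {
        v = ac.next[v][text[q] - 'a'];
        for (int t = v; t != 0; t = ac.fail[t]) {  // at most 6 steps: depth <= 6
            ++cnt[0][t];
            if (q - last[t] >= ac.depth[t]) {
                ++cnt[1][t];
                last[t] = q;
            }
        }
    }

    std::vector<int> out;
    for (std::size_t i = 0; i < queries.size(); ++i)
        out.push_back(cnt[queries[i].first][pos[i]]);
    return out;
}
```

Calling `solveCase("abababac", {{0, "aba"}, {1, "aba"}})` yields 3 and 2, matching Case 2. Because nodes live in vectors, nothing has to be deleted between cases. A flat `int next[N][26]` array would likely use less memory than the nested vectors used here for brevity.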
Code:
#include<bits/stdc++.h>
#define FastIO ios_base::sync_with_stdio(false), cin.tie(NULL), cout.tie(NULL);
#define inf 0x3f3f3f3f
#define rep(i,a,b) for(ll i=a;i<b;i++)
#define repp(i,a,b) for(ll i=a;i<=b;i++)
#define rep1(i,a,b) for(ll i=a;i>=b;i--)
#define mem(gv) memset(gv,0,sizeof(gv))
#define pb push_back
#define mp make_pair
#define fi first
#define se second
#define QAQ 0
#define miaojie
#ifdef miaojie
#define dbg(args...) do {cout << #args << " : "; err(args);} while (0)
#else
#define dbg(...)
#endif
void err() {std::cout << std::endl;}
template<typename T, typename...Args>
void err(T a, Args...args){std::cout << a << ' '; err(args...);}
using namespace std;
typedef long long ll;
typedef pair<int,int> pii;
typedef pair<ll,ll> pLL;
const int mod=1e9+7;
const int maxn=7e5+5;

int idd;                      // next free node id, reset for every case
struct node
{
    node *fail;
    node *child[26];
    int id;
    node()
    {
        fail=NULL;
        id=idd++;
        for(int i=0;i<26;i++)
            child[i]=NULL;
    }
};

int n,op[maxn];
// ans[id][0/1]: overlapping / non-overlapping count for the string at node id
// pos[i]: id of the node where pattern i ends
// last[id]: end position of the last counted non-overlapping match
// loc[id]: depth of node id in the trie
int ans[maxn][2],pos[maxn],last[maxn],loc[maxn];
char s1[maxn][9];
char s2[maxn];
node *rt;

// Insert pattern number num into the trie, recording node depths.
void insert(node *t,char p[],int num)
{
    int index,q=0;
    while(p[q]!='\0')
    {
        index=p[q]-'a';
        if(t->child[index]==NULL)
            t->child[index]=new node();
        t=t->child[index];
        loc[t->id]=q+1;
        q++;
    }
    pos[num]=t->id;
}

// Build the fail pointers with a BFS over the trie.
void ac(node *t)
{
    queue <node*> q;
    t->fail=NULL;
    q.push(t);
    node *temp;
    node *tmp;
    int i;
    while(!q.empty())
    {
        temp=q.front();
        q.pop();
        for(i=0;i<26;i++)
        {
            if(temp->child[i]!=NULL)
            {
                if(temp==t)
                    temp->child[i]->fail=t;
                else
                {
                    tmp=temp->fail;
                    while(tmp!=NULL)
                    {
                        if(tmp->child[i]!=NULL)
                        {
                            temp->child[i]->fail=tmp->child[i];
                            break;
                        }
                        tmp=tmp->fail;
                    }
                    if(tmp==NULL)
                        temp->child[i]->fail=t;
                }
                q.push(temp->child[i]);
            }
        }
    }
}

// Scan the text s2 and count both kinds of matches for every node.
void query(node *t)
{
    memset(last,-1,sizeof(last));
    mem(ans);
    int index,q=0;
    node *p=t;
    while(s2[q])
    {
        index=s2[q]-'a';
        while(p->child[index]==NULL&&p!=rt)
            p=p->fail;
        p=p->child[index];
        if(!p)
            p=rt;
        node *tmp=p;
        while(tmp!=rt)
        {
            ans[tmp->id][0]++;
            // count again only if this match does not overlap the last counted one
            if(q-last[tmp->id]>=loc[tmp->id]){
                ans[tmp->id][1]++;
                last[tmp->id]=q;
            }
            tmp=tmp->fail;
        }
        q++;
    }
}

// Free the trie after each case.
void del(node *root)
{
    for(int i=0;i<26;i++)
        if(root->child[i]!=NULL)
            del(root->child[i]);
    delete root;
    root=NULL;
}

int main(){
    int T=1;
    while(scanf("%s",s2)!=EOF){
        mem(pos);
        mem(loc);
        idd=0;
        rt=new node();
        scanf("%d",&n);
        repp(i,0,n-1){
            scanf("%d",&op[i]);
            scanf("%s",s1[i]);
            insert(rt,s1[i],i);
        }
        ac(rt);
        query(rt);
        printf("Case %d\n",T++);
        repp(i,0,n-1){
            printf("%d\n",ans[pos[i]][op[i]]);
        }
        del(rt);
        printf("\n");
    }
    return QAQ;
}
ZOJ3228 Searching the String (Aho-Corasick automaton)
Original post: https://www.cnblogs.com/miaomiaojie/p/10816397.html