码迷,mamicode.com

HDU--2222--Keywords Search--AC Automaton

Date: 2015-08-17 01:09:16


Keywords Search

Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 131072/131072 K (Java/Others)
Total Submission(s): 44594    Accepted Submission(s): 14056


Problem Description
Nowadays, search engines such as Google and Baidu have become part of everybody's life.
Wiskey also wants to bring this feature to his image retrieval system.
Every image has a long description. When users type some keywords to find an image, the system matches the keywords against the image descriptions and shows the image whose description matches the most keywords.
To simplify the problem, you are given the description of one image and some keywords, and you should tell me how many keywords are matched.
 


 

Input
The first line contains one integer: the number of test cases that follow.
Each case starts with an integer N, the number of keywords, followed by N keywords. (N <= 10000)
Each keyword contains only the characters 'a'-'z', and its length is at most 50.
The last line of each case is the description, and its length is at most 1000000.
 


 

Output
Print how many keywords are contained in the description.
 


 

Sample Input
1
5
she
he
say
shr
her
yasherhs
 


 

Sample Output
3
 

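Problem summary: given N strings followed by a text, count how many of the strings occur in the text. A string that occurs several times in the text is counted only once.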
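As a point of reference for the summary above, here is a minimal brute-force sketch. It is not the approach used in this post, and the function name `countKeywordsNaive` is made up for illustration. It runs one substring search per keyword, which is far too slow for the stated limits in the worst case but is handy for checking small inputs by hand.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not from the original post: counts the keywords that
// occur at least once in text, using one plain substring search per keyword.
int countKeywordsNaive(const std::vector<std::string>& keywords,
                       const std::string& text)
{
    int matched = 0;
    for (const std::string& kw : keywords)
        if (text.find(kw) != std::string::npos)
            ++matched; // a keyword counts once, however often it occurs
    return matched;
}
```

On the sample, the text yasherhs contains she, he and her but neither say nor shr, which is why the expected answer is 3.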

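PS: This is an AC automaton (Aho-Corasick) problem. I only learned the algorithm today and it clicked; it really is quite easy. I have also posted my study notes on the algorithm on my blog. Let's make progress together.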

#include <iostream>
#include <cstdio>
#include <cstring>
using namespace std;

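struct node
{
    // fail: followed on a mismatch; points to the node spelling the longest
    // proper suffix of this node's string that is also a path in the trie
    node *fail, *next[26];
    int x; // number of keywords ending at this node (-1 once counted)
    node() // initialize an empty node
    {
        fail = NULL;
        x = 0;
        for (int i = 0; i < 26; i++) next[i] = NULL;
    }
} *root;
node *q[500010]; // BFS queue; the trie has at most 10000 * 50 + 1 nodes

char str[1000010]; // the description text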
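void setit(char *ss) // insert one keyword into the trie
{
    node *p = root;
    for (int i = 0; ss[i]; i++)
    {
        int k = ss[i] - 'a';
        if (p->next[k] == NULL)      // no child for this character yet
            p->next[k] = new node(); // create it and attach it below p
        p = p->next[k];
    }
    p->x++; // one more keyword ends at this node
}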

void ac() // build the fail links, the trie counterpart of KMP's next array
{
    int tail, head;
    tail = head = 0;
    // BFS visits the trie level by level, so a fail link only ever points
    // to a shallower node whose own fail link is already set.
    q[tail++] = root;

    while (head != tail)
    {
        node *p = q[head++]; // take the next node off the queue
        for (int i = 0; i < 26; i++) // try all 26 letters; i is the offset from 'a'
            if (p->next[i] != NULL) // this child exists
            {
                if (p == root) // direct children of the root
                {
                    p->next[i]->fail = root; // a mismatch on the first character restarts from the root
                    q[tail++] = p->next[i];  // enqueue
                    continue;
                }
                node *cur = p->fail; // start from the parent's fail link and walk up the fail chain
                while (cur != NULL)  // root->fail is NULL, which ends the walk
                {
                    if (cur->next[i] != NULL) // this suffix can be extended by the same letter
                    {
                        p->next[i]->fail = cur->next[i]; // so the child's fail link is that extension
                        break;
                    }
                    cur = cur->fail;
                }
                if (cur == NULL)             // no suffix could be extended
                    p->next[i]->fail = root; // fall back to the root
                q[tail++] = p->next[i];      // enqueue
            }
    }
}

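int query() // run the text through the automaton and count the matched keywords
{
    int sum = 0;
    node *p = root;
    for (int i = 0; str[i]; i++) // scan the text one character at a time
    {
        int cur = str[i] - 'a'; // index of the current letter
        // No transition on this letter: follow fail links until one exists
        // or we are back at the root.
        while (p->next[cur] == NULL && p != root)
            p = p->fail;
        if (p->next[cur] != NULL) p = p->next[cur]; // take the transition; otherwise p is the root and stays there
        node *t = p; // walk the fail chain from p without moving p
        while (t != root && t->x != -1) // stop at the root or at a node that was already counted
        {
            sum += t->x; // add the keywords ending at this node
            t->x = -1;   // mark it so it is never counted again
            t = t->fail;
        }
    }
    return sum;
}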

int main(void)
{
    int t, n;
    scanf("%d", &t);
    while (t-- && scanf("%d", &n) == 1) // stop if the case header cannot be read
    {
        root = new node(); // fresh trie for each test case
        for (int i = 0; i < n; i++)
        {
            char ss[55];
            scanf("%s", ss);
            setit(ss);
        }
        ac();
        scanf("%s", str);
        printf("%d\n", query());
    }
    return 0;
}
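
For comparison with the pointer-based program above, here is a sketch of the same three steps (insert, build the fail links, query) using node indices in vectors instead of `new`. This is an illustrative variant, not the author's code: the names `Automaton`, `insert`, `build` and `countMatches` are made up, and it assumes every character is in 'a'-'z' as the problem states.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical array-based variant; node 0 is the root.
struct Automaton {
    std::vector<std::vector<int>> next; // next[v][c]: child of v on letter c, -1 if absent
    std::vector<int> fail;              // fail[v]: node for the longest proper suffix of v in the trie
    std::vector<int> cnt;               // cnt[v]: keywords ending at v, -1 once counted

    Automaton() { newNode(); }

    int newNode() {
        next.push_back(std::vector<int>(26, -1));
        fail.push_back(0);
        cnt.push_back(0);
        return (int)next.size() - 1;
    }

    void insert(const std::string& kw) {
        int v = 0;
        for (char ch : kw) {
            assert('a' <= ch && ch <= 'z'); // assumed alphabet
            int c = ch - 'a';
            if (next[v][c] == -1) {
                int u = newNode(); // may reallocate, so create the node first
                next[v][c] = u;
            }
            v = next[v][c];
        }
        ++cnt[v]; // duplicate keywords are counted separately
    }

    void build() {
        std::queue<int> bfs;
        for (int c = 0; c < 26; ++c) // depth-1 nodes keep fail = root
            if (next[0][c] != -1) bfs.push(next[0][c]);
        while (!bfs.empty()) {
            int v = bfs.front();
            bfs.pop();
            for (int c = 0; c < 26; ++c) {
                int u = next[v][c];
                if (u == -1) continue;
                int f = fail[v]; // walk the parent's fail chain
                while (f != 0 && next[f][c] == -1) f = fail[f];
                if (next[f][c] != -1) f = next[f][c];
                fail[u] = f;
                bfs.push(u);
            }
        }
    }

    int countMatches(const std::string& text) {
        int v = 0, total = 0;
        for (char ch : text) {
            assert('a' <= ch && ch <= 'z'); // assumed alphabet
            int c = ch - 'a';
            while (v != 0 && next[v][c] == -1) v = fail[v];
            if (next[v][c] != -1) v = next[v][c];
            for (int t = v; t != 0 && cnt[t] != -1; t = fail[t]) {
                total += cnt[t];
                cnt[t] = -1; // mark so later visits stop here
            }
        }
        return total;
    }
};
```

Because nodes live in vectors, each test case can start from a fresh `Automaton` and the previous one is released automatically. Like `query()` above, `countMatches` overwrites the counters, so it is meant to be called once per automaton.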

 

Copyright notice: this is the blogger's original article and may not be reposted without the blogger's permission.


Original article: http://blog.csdn.net/jingdianitnan/article/details/47708811
