Hashing

Date: 2021-06-20 17:45:55

Tags: character, signed, selection, image, record, name, tuple, sys, code

String hashing maps a string to an integer.

Hash methods

Given a string \(s=s_1,s_2,s_3 ... s_n\),
for a letter \(x\) we define \(idx(x)=x-'a'+1\).
(Of course, the ASCII value of \(s_i\) can also be used directly.)

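1. Natural overflow method

\[hash[i]=hash[i-1]*p+idx(s[i]) \]

This relies on the natural overflow of \(ull\) (unsigned long long): the arithmetic wraps around, which is the same as taking every result modulo \(2^{64}\).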
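The recurrence above can be sketched as follows. This is a minimal illustration, not the author's code: the function name `overflow_hash` and the default base `131` are assumptions chosen for the example, and the input is assumed to consist of lowercase letters.

```cpp
#include <cstdint>
#include <string>

// Polynomial hash that uses unsigned 64-bit wraparound as an implicit modulus 2^64.
// The default base 131 is an illustrative assumption, not a value from the article.
std::uint64_t overflow_hash(const std::string& s, std::uint64_t p = 131) {
    std::uint64_t h = 0;
    for (char c : s) {
        // idx(c) = c - 'a' + 1, so no lowercase letter maps to 0
        h = h * p + static_cast<std::uint64_t>(c - 'a' + 1);
    }
    return h;
}
```

Unsigned overflow is well defined in C++, so no explicit `%` is needed here.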

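2. Single-hash method

\(hash\) formula:

\[hash[i]=(hash[i-1]*p+idx(s[i]))\% mod \]

\(p\) and \(mod\) are both primes with \(p<mod\) (choose them as large as practical).

Suppose \(s=abc\), \(p=13\), \(mod=101\). Indexing from 0, so that \(hash[0]=idx(a)=1\):

hash[0]=1;
hash[1]=(hash[0]*13+2)%101=15;
hash[2]=(hash[1]*13+3)%101=97;

So \(97\) is the hash value of \(abc\).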
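The worked example can be reproduced with a few lines of C++. This is a sketch under the assumption that the input contains only lowercase letters; the function name `single_hash` is hypothetical.

```cpp
#include <string>

// Single-modulus polynomial hash: h = (h * p + idx(c)) % mod for every character c.
// Assumes p * mod fits in a signed 64-bit integer so the product cannot overflow.
long long single_hash(const std::string& s, long long p, long long mod) {
    long long h = 0;
    for (char c : s) {
        h = (h * p + (c - 'a' + 1)) % mod;
    }
    return h;
}
```

Calling `single_hash("abc", 13, 101)` follows the same three steps as the hand computation.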

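3. Double-hash method

Hash the string twice with two different values of \(mod\) and represent the two results as a pair, which serves as the \(Hash\) result.

\(Hash\) formula

\[hash1[i]=(hash1[i-1]*p+idx(s[i])) \% mod1 \]

\[hash2[i]=(hash2[i-1]*p+idx(s[i])) \% mod2 \]

Store the result in a \(map\) keyed by the pair \(<hash1[n],hash2[n]>\).

This kind of \(hash\) is very safe: two different strings collide only if both components match at the same time.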
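One possible way to store the pair in a `std::map` is sketched below. The base and both moduli are illustrative assumptions (they are not the article's constants), and the helper names `double_hash` and `count_by_hash` are hypothetical.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

using HashPair = std::pair<long long, long long>;

// Hash s twice with the same base and two different moduli.
// p, mod1 and mod2 are assumed values picked for illustration.
HashPair double_hash(const std::string& s) {
    const long long p = 131;
    const long long mod1 = 1000000007LL, mod2 = 998244353LL;
    long long h1 = 0, h2 = 0;
    for (char c : s) {
        long long v = c - 'a' + 1;  // idx(c)
        h1 = (h1 * p + v) % mod1;
        h2 = (h2 * p + v) % mod2;
    }
    return {h1, h2};
}

// Count occurrences of each word, using the hash pair as the map key.
std::map<HashPair, int> count_by_hash(const std::vector<std::string>& words) {
    std::map<HashPair, int> counts;
    for (const std::string& w : words) {
        ++counts[double_hash(w)];
    }
    return counts;
}
```

`std::pair` already has a lexicographic `operator<`, so it can be used as a `std::map` key without a custom comparator.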

Getting the hash of a substring

Suppose we know the prefix \(hash\) values \(hash[i],1≤i≤n\) of a string with \(|s|=n\).

The \(hash\) value \(f(r-l)\) of its substring
\([S_l..S_r],1≤l≤r≤n\) is:

\[f(r-l)=hash[r]-hash[l-1]*p^{r-l+1} \]

Since each \(hash[i]\) is reduced modulo \(mod\), this becomes:

\[f(r-l)=(hash[r]-hash[l-1]*p^{r-l+1})\%mod \]

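The expression in parentheses is a subtraction, so it may be negative. We therefore correct it as follows:

\[f(r-l)=((hash[r]-hash[l-1]*p^{r-l+1})\%mod+mod)\%mod \]

This is the formula for the \(hash\) value of a substring.

We can precompute the powers \(p^1,p^2,...,p^n\) (each taken modulo \(mod\)) and then evaluate the formula directly.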
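The substring formula together with the precomputed powers might look like this in code. It is a minimal sketch: the struct name `PrefixHash`, the default base `131` and the default modulus `1000000007` are assumptions, and the input is assumed to be lowercase letters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct PrefixHash {
    long long p, mod;
    std::vector<long long> h;   // h[i]: hash of the first i characters, h[0] = 0
    std::vector<long long> pw;  // pw[i] = p^i % mod

    // Default base and modulus are illustrative assumptions.
    PrefixHash(const std::string& s, long long p_ = 131, long long mod_ = 1000000007LL)
        : p(p_), mod(mod_), h(s.size() + 1, 0), pw(s.size() + 1, 1) {
        for (std::size_t i = 1; i <= s.size(); ++i) {
            h[i] = (h[i - 1] * p + (s[i - 1] - 'a' + 1)) % mod;
            pw[i] = pw[i - 1] * p % mod;
        }
    }

    // Hash of the substring s[l..r], 1-based and inclusive (1 <= l <= r <= n).
    long long get(int l, int r) const {
        // The inner % may yield a negative value, hence the "+ mod) % mod" correction.
        return ((h[r] - h[l - 1] * pw[r - l + 1]) % mod + mod) % mod;
    }
};
```

After the linear-time construction, each call to `get` takes constant time, so two substrings can be compared by comparing their hash values.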

Next are some \(hash\) constants to choose from:

[Image: candidate hash constants]

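Example problem

CF1200E Compress Words

Problem:

Given several words, merge them in order into one compressed word: whenever a suffix of the text built so far equals a prefix of the next word, the overlap is written only once. For example, want to becomes wanto.

Solution:

A solution comes to mind quickly: compare suffixes of the previous string with prefixes of the next string. The longest prefix of the next string that equals a suffix of the previous one is not added to the answer again; the rest of the next string is appended.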
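Before adding hashing, the idea can be written as a direct character-by-character comparison. The sketch below is a quadratic reference version, too slow for the full constraints but handy for checking a hashed solution on small inputs; the function name `merge_words` is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Brute-force merge: for each word, find the longest prefix that equals
// a suffix of the answer so far, then append only the remaining characters.
std::string merge_words(const std::vector<std::string>& words) {
    std::string ans;
    for (const std::string& w : words) {
        std::size_t limit = std::min(ans.size(), w.size());
        std::size_t best = 0;
        for (std::size_t i = 1; i <= limit; ++i) {
            // Compare the last i characters of ans with the first i characters of w.
            if (ans.compare(ans.size() - i, i, w, 0, i) == 0) best = i;
        }
        ans.append(w, best, std::string::npos);
    }
    return ans;
}
```

Replacing the inner `compare` with a constant-time hash comparison is what brings the total cost down to linear in the input length.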

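Use double \(hash\). Let \(h1,h2\) be the prefix hashes of the answer sequence, \(cnt\) its current length, and \(hash1,hash2\) the hashes of the first \(i\) characters of the current word.

The suffix of length \(i\) of the answer matches that prefix when:

\[h1[cnt]=(hash1+h1[cnt-i]*P1[i])\%mod1 \]

\[h2[cnt]=(hash2+h2[cnt-i]*P2[i])\%mod2 \]

where \(P1[i]=p1^i\) and \(P2[i]=p2^i\) are the precomputed powers of the two bases.

If the condition holds, record the current \(i\) as \(L=i\).

What is appended to the answer sequence is then \(s[L+1..len]\).

Code: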

#include<bits/stdc++.h>
using namespace std;
#define ll long long
const int N=1e6+6,mod1=19260817,mod2=19491001;
ll p1,p2;                   // bases of the two hashes, chosen at random
int n,cnt=0;                // cnt: current length of the answer
ll P1[N],P2[N],h1[N],h2[N]; // P1,P2: powers of the bases; h1,h2: prefix hashes of the answer
char s[N],ans[N];
void init(){
    srand(time(0));
    p1=rand()%321+233,p2=rand()%233+321;
    P1[0]=P2[0]=1;
    for(int i=1;i<=N-6;i++) P1[i]=P1[i-1]*p1%mod1,P2[i]=P2[i-1]*p2%mod2;
}
int main()
{
    scanf("%d",&n);
    init();
    for(int T=1;T<=n;T++){
        scanf("%s",s+1);
        int len=strlen(s+1);
        ll hash1=0,hash2=0;
        int L=0; // longest overlap found so far
        for(int i=1;i<=len&&i<=cnt;i++){
            // hash of the prefix s[1..i] of the current word
            hash1=(hash1*p1+s[i])%mod1;
            hash2=(hash2*p2+s[i])%mod2;
            // compare it with the hash of the last i characters of the answer
            if(h1[cnt]==(hash1+h1[cnt-i]*P1[i])%mod1&&
               h2[cnt]==(hash2+h2[cnt-i]*P2[i])%mod2)
                L=i;
        }
        // append the non-overlapping part s[L+1..len] and extend the prefix hashes
        for(int i=L+1;i<=len;i++){
            ans[++cnt]=s[i];
            h1[cnt]=(h1[cnt-1]*p1+s[i])%mod1;
            h2[cnt]=(h2[cnt-1]*p2+s[i])%mod2;
        }
    }
    printf("%s",ans+1);
    return 0;
}


Original article: https://www.cnblogs.com/guanlexiangfan/p/14905041.html
