• Hash
• Z-Algorithm
• KMP
• Manacher
• Trie
• AC Automaton
• Suffix Array
• Main-Lorentz

# Hash

## What is hash?

For a hash function $$f$$,

$$x=y \Rightarrow f(x)=f(y)$$

$$x \neq y \Rightarrow f(x) \neq f(y)$$ (very high prob.)

For two strings $$s,t$$, if we want to know whether $$s$$ and $$t$$ are the same, we can hash them and check if $$f(s)=f(t)$$

## Rabin Karp

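Given a string $$s_0...s_{n-1}$$, a base $$p$$ and a modulus $$M$$, define $$a[i]=s_i \cdot p^i$$; the hash of $$s$$ is $$\sum_{i=0}^{n-1} a[i] \bmod M$$. (The solution code in this section uses the mirrored form $$pref[i]=pref[i-1] \cdot p+s_i$$, which works the same way.)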
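A minimal sketch of this definition, assuming the hash is the sum of $$a[i]$$ modulo a prime. The constants are copied from the solution code in this section; the struct `PolyHash` and its members `raw` and `same` are hypothetical names introduced only for illustration. Prefix sums of $$a[i]$$ give the hash of any substring multiplied by a known power of $$p$$, so two substrings can be compared after aligning the powers.

```cpp
#include <string>
#include <vector>
using namespace std;

// Assumed constants: base and modulus copied from the solution code in this section.
const long long P = 127, M = 998244353;

// Hypothetical helper: prefix sums of a[i] = s_i * P^i (mod M).
struct PolyHash {
    vector<long long> pre, pw; // pre[i] = a[0] + ... + a[i-1], pw[i] = P^i
    explicit PolyHash(const string &s) : pre(s.size() + 1, 0), pw(s.size() + 1, 1) {
        for (size_t i = 0; i < s.size(); i++) {
            pre[i + 1] = (pre[i] + (unsigned char)s[i] * pw[i]) % M;
            pw[i + 1] = pw[i] * P % M;
        }
    }
    // Sum of a[j] for l <= j < r; this equals hash(s[l..r)) * P^l (mod M).
    long long raw(int l, int r) const { return (pre[r] - pre[l] + M) % M; }
    // Compare s[l1..l1+len) with s[l2..l2+len) by aligning both sides to P^(l1+l2).
    bool same(int l1, int l2, int len) const {
        return raw(l1, l1 + len) * pw[l2] % M == raw(l2, l2 + len) * pw[l1] % M;
    }
};
```

Equal substrings always compare equal; different substrings compare different only with high probability, as with any hash.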

## Problem

TIOJ 1306

Given a string $$s$$, answer $$q$$ queries:

given a string $$t$$, print the number of occurrences of $$t$$ in $$s$$

$$|s|, |t| \leq 10000$$

$$q \leq 50000$$

$$\sum |t| \leq 350000$$

## Solution

Use prefix sums of the hash; then the substring of length $$|t|$$ starting at any position of $$s$$ can be compared with $$t$$ in $$\Omicron (1)$$, so all positions are checked in $$\Omicron (|s|)$$ per query.

## Solution

#include <bits/stdc++.h>
#define IO ios::sync_with_stdio(0);cin.tie(0);cout.tie(0);
#define int long long
using namespace std;

const int p=127,M=998244353;

int pref[10005],po[10005];

signed main(){//"int" is a macro for long long here, so main needs an explicit signed return type
IO
po[0]=1;
for(int i=1;i<10005;i++){po[i]=po[i-1]*p;po[i]%=M;}
int tc;cin >> tc;
while(tc--){
string T;cin >> T;
pref[0]=T[0];
for(int i=1;i<T.length();i++) {pref[i]=pref[i-1]*p+T[i];pref[i]%=M;}
int q;cin >> q;
while(q--){
int has=0,cnt=0;
string P;cin >> P;
for(int i=0;i<P.length();i++){has=has*p+P[i];has%=M;}
for(int i=P.length()-1;i<T.length();i++){
if(i==P.length()-1){if(has==pref[i]) cnt++;}//first window: no prefix to subtract, and pref[i-P.length()] must not be read
else if(((pref[i]-(pref[i-P.length()]*po[P.length()])%M)+M)%M==has%M) cnt++;
}
cout << cnt << '\n';
}
}
return 0;
}



# Z-algorithm

## Z-algorithm

Given a string $$s_0...s_{n-1}$$, define an array $$z$$:

$$z[i]=$$ the biggest $$k$$ that satisfies

$$s_0...s_{k-1}=s_is_{i+1}...s_{i+k-1}$$

($$k=0$$ if $$s_0 \neq s_i$$)

## Calculate $$z$$

Say we know $$z[0] \sim z[i-1]$$.

First, we try to find the lower bound of $$z[i]$$

let $$l= \argmax_{1 \leq j \leq i-1}(j+z[j]-1), r=l+z[l]-1$$

$$\Rightarrow s_0...s_{r-l}=s_l...s_r$$.

if $$i \leq r$$, we know that $$s_{i-l}...s_{r-l}=s_i...s_r$$,

$$\Rightarrow z[i]$$ is at least $$\min(z[i-l],r-i+1)$$

## Calculate $$z$$

Then, while $$i+z[i]<n$$ and $$s[z[i]]=s[i+z[i]]$$,

we increment $$z[i]$$.

Finally, we can update $$l,r$$ if $$i+z[i]-1 > r$$.

Notice that $$r$$ never decreases, and each successful comparison pushes $$r$$ one step to the right at $$\Omicron (1)$$ cost, so the algorithm is $$\Omicron (n)$$ amortized.

## Implementation

vector<int> z_algo(string &s){
int n=s.size();
vector<int> z(n,0);
for(int i=1,l=0,r=0;i<n;i++){
if(i<=r) z[i]=min(z[i-l],r-i+1);
while(i+z[i]<n&&s[z[i]]==s[i+z[i]]) z[i]++;
if(i+z[i]-1>r) l=i,r=i+z[i]-1;
}
return z;
}

## Problem

CSES Finding Borders

Given a string $$s$$, find the number of strings $$t$$ which satisfy:

$$t$$ is non-empty, is both a prefix and a suffix of $$s$$, and is not $$s$$ itself (a border of $$s$$).

## Solution

Count the indices $$i \geq 1$$ for which $$i+z[i]=n$$, i.e. the match starting at $$i$$ reaches the end of $$s$$.
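The counting step can be sketched as follows. `z_function` repeats the Z-algorithm from the implementation slide so the block is self-contained; `count_borders` is a hypothetical name. With 0-indexed positions, the suffix starting at $$i$$ is a border exactly when its match with the prefix runs to the end of the string, which is the condition $$i+z[i]=n$$.

```cpp
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

// Z-algorithm, same logic as the implementation slide (z[0] is left as 0).
vector<int> z_function(const string &s) {
    int n = s.size();
    vector<int> z(n, 0);
    for (int i = 1, l = 0, r = 0; i < n; i++) {
        if (i <= r) z[i] = min(z[i - l], r - i + 1);
        while (i + z[i] < n && s[z[i]] == s[i + z[i]]) z[i]++;
        if (i + z[i] - 1 > r) l = i, r = i + z[i] - 1;
    }
    return z;
}

// Number of borders: suffixes (other than s itself) that are also prefixes.
int count_borders(const string &s) {
    int n = s.size(), cnt = 0;
    vector<int> z = z_function(s);
    for (int i = 1; i < n; i++)
        if (i + z[i] == n) cnt++; // the match starting at i reaches the end of s
    return cnt;
}
```

For each counted index $$i$$, the border length is $$n-i$$, which is what you would print if the lengths themselves are required.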

# KMP

## KMP

Given a string $$s_0...s_{n-1}$$, define failure function $$p$$:

$$p[i]=$$ the biggest $$k<i+1$$ that satisfies

$$s_0...s_{k-1}=s_{i-k+1}...s_i$$

## Build

Say we know $$p[0] \sim p[i-1]$$.

let $$j=p[i-1]$$; if $$s_j=s_i \Rightarrow p[i]=j+1$$

otherwise, while $$j \neq 0$$, we keep setting $$j=p[j-1]$$ and checking whether the condition is satisfied; if it never is (even for $$j=0$$), $$p[i]=0$$.

## Build

Notice that $$j$$ is incremented at most once per $$i$$, so only $$\Omicron(n)$$ times in total, and every step $$j=p[j-1]$$ decreases it, so the algorithm is $$\Omicron(n)$$ amortized.

## Implementation

vector<int> kmp(string &s){
int n=s.size();
vector<int> pi(n,0);
for(int i=1;i<n;i++){
int j=pi[i-1];
while(j>0&&s[i]!=s[j]) j=pi[j-1];
if(s[i]==s[j]) j++;
pi[i]=j;
}
return pi;
}


## Problem

TIOJ 1306

Given a string $$s$$, answer $$q$$ queries:

given a string $$t$$, print the number of occurrences of $$t$$ in $$s$$

$$|s|, |t| \leq 10000$$

$$q \leq 50000$$

$$\sum |t| \leq 350000$$

## Solution

For every $$t$$, calculate its failure function.

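Maintain $$r$$, the number of characters of $$t$$ currently matched, so that $$t_0...t_{r-1}$$ equals the part of $$s$$ ending at $$s_{i-1}$$.

If $$s_i \neq t_r$$, we set $$r=p[r-1]$$ (while $$r>0$$) and keep matching; if $$s_i=t_r$$, we increment $$r$$. When $$r=|t|$$, we have found an occurrence and continue from $$r=p[r-1]$$.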
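A sketch of this matching loop, under the convention (chosen here for the example) that `r` is the number of characters of $$t$$ currently matched. `prefix_function` repeats the failure-function code from the implementation slide; `count_occurrences` is a hypothetical name.

```cpp
#include <string>
#include <vector>
using namespace std;

// Failure function, same logic as the implementation slide.
vector<int> prefix_function(const string &t) {
    int n = t.size();
    vector<int> pi(n, 0);
    for (int i = 1; i < n; i++) {
        int j = pi[i - 1];
        while (j > 0 && t[i] != t[j]) j = pi[j - 1];
        if (t[i] == t[j]) j++;
        pi[i] = j;
    }
    return pi;
}

// Counts (possibly overlapping) occurrences of t in s.
int count_occurrences(const string &s, const string &t) {
    if (t.empty()) return 0; // assumption: an empty pattern counts as zero matches
    vector<int> p = prefix_function(t);
    int r = 0, cnt = 0; // r = number of characters of t currently matched
    for (char c : s) {
        while (r > 0 && c != t[r]) r = p[r - 1]; // fall back along the failure function
        if (c == t[r]) r++;
        if (r == (int)t.size()) { // full match ending at this character
            cnt++;
            r = p[r - 1]; // keep the longest border so overlapping matches are found
        }
    }
    return cnt;
}
```

Each query costs $$\Omicron(|s|+|t|)$$ with this loop.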

## Why failure function?

The name comes from the fact that, when a match fails, we can instantly fall back to the longest prefix that can still match.

# Manacher's Algorithm

## Problem

CSES - Longest Palindrome

Given a string $$s$$, find the longest palindromic substring.

$$|s| \leq 10^6$$

## First of all

Palindromes include two kinds:

1. odd length, center is a position in the string

2. even length, center is a position between two characters

Hard to deal with...

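Insert '*' between every two characters and at the front and the end of the string (the implementation uses '.' instead); every palindrome of length $$len$$ then becomes an odd-length palindrome of length $$2 \cdot len+1$$!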

## Construct

for a string $$s$$ (after inserting '*'), define an array $$p$$:

$$p[i]=$$ the biggest $$k$$ so that $$s_{i-k+1}...s_{i+k-1}$$ is a palindrome, i.e. $$s_{i-j}=s_{i+j}$$ for all $$0 \leq j < k$$

Then, how can we construct the array?

## Calculate $$p$$

Say we have $$p[0] \sim p[i-1]$$.

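Let $$x= \argmax_{0 \leq j \leq i-1} (j+p[j]-1)$$,

since $$s_{x-p[x]+1}...s_{x+p[x]-1}$$ is a palindrome centered at $$x$$, position $$i$$ mirrors position $$2x-i$$ whenever $$i \leq x+p[x]-1$$,

$$\Rightarrow p[i] \geq \min(p[2x-i], p[x]-(i-x))$$

Same idea as the Z-algorithm!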

## Implementation

vector<int> manacher(string &ss){
string s;
s.resize(ss.size()*2+1,'.');
for(int i=0;i<ss.size();i++){
s[i*2+1]=ss[i];
}
vector<int> p(s.size(),1);
for(int i=0,l=0,r=0;i<s.size();i++){
if(i<=r) p[i]=min(p[l*2-i],r-i+1);//mirror bound; only valid while i<=r, otherwise l*2-i can be negative
while(0<=i-p[i]&&i+p[i]<s.size()&&s[i-p[i]]==s[i+p[i]]){
l=i,r=i+p[i],p[i]++;
}
}
return p;
}
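To go from the array $$p$$ back to an answer for the problem, one possible sketch is below. It repeats the same idea as the implementation above in a self-contained form (with '.' as the separator) and then maps the best center back to the original string; `longest_palindrome` is a hypothetical name, and ties are broken by taking the leftmost center.

```cpp
#include <algorithm>
#include <string>
#include <vector>
using namespace std;

string longest_palindrome(const string &ss) {
    // Interleave separators: "aba" becomes ".a.b.a."
    string s(ss.size() * 2 + 1, '.');
    for (size_t i = 0; i < ss.size(); i++) s[i * 2 + 1] = ss[i];
    int n = s.size();
    vector<int> p(n, 1);
    int best = 0; // center (index in s) with the largest radius so far
    for (int i = 0, l = 0, r = 0; i < n; i++) {
        // [l - p[l] + 1, r] is the palindrome reaching furthest right; mirror i around l.
        if (i <= r) p[i] = min(p[2 * l - i], r - i + 1);
        while (i - p[i] >= 0 && i + p[i] < n && s[i - p[i]] == s[i + p[i]]) p[i]++;
        if (i + p[i] - 1 > r) l = i, r = i + p[i] - 1;
        if (p[i] > p[best]) best = i;
    }
    // A radius of p in the transformed string covers p - 1 original characters,
    // starting at original index (best - p + 1) / 2.
    int len = p[best] - 1;
    int start = (best - p[best] + 1) / 2;
    return ss.substr(start, len);
}
```

The mapping relies on the fact that a maximal palindrome in the transformed string always begins and ends on a separator.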


# Trie

## Implementation

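//not compiled or tested
const int N=5e5+5;//assumed bound on the number of nodes, same value as in the AC Automaton section
int ch[N][26]{0},cnt[N]{0},ptr=0;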
void insert(string &s){
int cur=0;
for(int i=0;i<s.length();i++){
if(!ch[cur][s[i]-'a']) ch[cur][s[i]-'a']=++ptr;
cur=ch[cur][s[i]-'a'];
}
cnt[cur]++;
}

so ez la

2021 Taipei City Contest (北市賽) pB

## Problem

Given an array $$a$$ of length $$n$$, find the pair $$(i,j)$$ for which $$a_i \oplus a_j$$ is the largest among all pairs.

$$n \leq 10^5, a_i \leq 10^9$$
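One common way to use a trie here, sketched under the assumption that $$0 \leq a_i < 2^{30}$$ (which holds for $$a_i \leq 10^9$$): store each number as a 30-bit path from the highest bit down, and for every new number walk the trie greedily, preferring the opposite bit at each level. `max_xor_pair` is a hypothetical name, and this version returns only the maximum value; storing an index at each leaf would recover the pair itself.

```cpp
#include <algorithm>
#include <array>
#include <vector>
using namespace std;

// Assumes a has at least two elements and 0 <= a[i] < 2^30.
int max_xor_pair(const vector<int> &a) {
    vector<array<int, 2>> ch(1, array<int, 2>{0, 0}); // node 0 is the root
    auto insert = [&](int x) {
        int cur = 0;
        for (int b = 29; b >= 0; b--) {
            int bit = (x >> b) & 1;
            if (!ch[cur][bit]) {
                ch.push_back(array<int, 2>{0, 0});
                ch[cur][bit] = (int)ch.size() - 1;
            }
            cur = ch[cur][bit];
        }
    };
    // Best XOR of x with any number already inserted (the trie must be non-empty).
    auto query = [&](int x) {
        int cur = 0, res = 0;
        for (int b = 29; b >= 0; b--) {
            int bit = (x >> b) & 1;
            if (ch[cur][bit ^ 1]) { // the opposite bit exists: this bit of the XOR is 1
                res |= 1 << b;
                cur = ch[cur][bit ^ 1];
            } else {
                cur = ch[cur][bit];
            }
        }
        return res;
    };
    insert(a[0]);
    int best = 0;
    for (size_t i = 1; i < a.size(); i++) {
        best = max(best, query(a[i]));
        insert(a[i]);
    }
    return best;
}
```

Each number costs 30 steps for the query and 30 for the insert, so the whole array is processed in $$\Omicron(30n)$$.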

# AC Automaton

## Problem

CSES - Finding Patterns

CSES - Counting Patterns

(The stronger version of TIOJ 1306)

$$|s| \leq 10^5$$

$$\sum |t| \leq 5 \cdot 10^5$$

Hash, Z, or KMP couldn't get AC here...

## Aho-Corasick Algorithm

Trie with fail link!

A fail link from $$u$$ to $$v$$: $$v$$ represents the longest proper suffix of the string of $$u$$ which exists in the trie.

Remember what failure function is?

## Build

Tree edge: just a trie

Fail link: a simple bfs would work!

## Implementation

const int N=5e5+5;
int ch[N][26]{0},fail[N]{0},ptr=0;

void insert(string &s,int ind){
int cur=0;
for(int i=0;i<s.size();i++){
if(!ch[cur][s[i]-'a']) ch[cur][s[i]-'a']=++ptr;
cur=ch[cur][s[i]-'a'];
}
}

void build(){
queue<int> q;
for(int i=0;i<26;i++) if(ch[0][i]) q.push(ch[0][i]);
while(!q.empty()){
int cur=q.front();q.pop();
for(int i=0;i<26;i++){
if(!ch[cur][i]) ch[cur][i]=ch[fail[cur]][i];
else{
q.push(ch[cur][i]);
int tem=fail[cur];
while(tem&&!ch[tem][i]) tem=fail[tem];
fail[ch[cur][i]]=ch[tem][i];
}
}
}
}

## So how to solve the problem?

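Matching: just walk along tree edges, and if there isn't one, take the fail link.

Maintain the number of visits to each vertex; then a dfs on the fail-link tree is required to add each vertex's count to the vertex its fail link points to, because visiting $$u$$ means every suffix on its fail chain occurred as well.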
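A self-contained sketch of this counting scheme, assuming lowercase patterns and text and non-empty patterns. It follows the same build as the implementation above but uses vectors instead of global arrays, and it replaces the dfs with a pass over the BFS order in reverse, which visits children of the fail tree before their parents. `count_patterns` is a hypothetical name.

```cpp
#include <array>
#include <queue>
#include <string>
#include <vector>
using namespace std;

// Returns the number of occurrences of each pattern in text.
// Assumes characters 'a'..'z' only and non-empty patterns.
vector<long long> count_patterns(const string &text, const vector<string> &pats) {
    vector<array<int, 26>> ch(1); // node 0 is the root, all transitions start at 0
    vector<int> fail(1, 0), node(pats.size());
    for (size_t k = 0; k < pats.size(); k++) {
        int cur = 0;
        for (char c : pats[k]) {
            int x = c - 'a';
            if (!ch[cur][x]) {
                ch.push_back(array<int, 26>{});
                fail.push_back(0);
                ch[cur][x] = (int)ch.size() - 1;
            }
            cur = ch[cur][x];
        }
        node[k] = cur; // vertex where pattern k ends
    }
    vector<int> order; // BFS order: shallower vertices first
    queue<int> q;
    for (int x = 0; x < 26; x++) if (ch[0][x]) q.push(ch[0][x]);
    while (!q.empty()) {
        int cur = q.front(); q.pop();
        order.push_back(cur);
        for (int x = 0; x < 26; x++) {
            int nxt = ch[cur][x];
            if (!nxt) ch[cur][x] = ch[fail[cur]][x]; // missing edge: reuse the fail target's transition
            else { fail[nxt] = ch[fail[cur]][x]; q.push(nxt); }
        }
    }
    vector<long long> visits(ch.size(), 0);
    int cur = 0;
    for (char c : text) { cur = ch[cur][c - 'a']; visits[cur]++; }
    // Push counts up the fail tree, deepest vertices first.
    for (int i = (int)order.size() - 1; i >= 0; i--) visits[fail[order[i]]] += visits[order[i]];
    vector<long long> res(pats.size());
    for (size_t k = 0; k < pats.size(); k++) res[k] = visits[node[k]];
    return res;
}
```

For the yes/no version of the problem, a pattern occurs exactly when its count is positive.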

# Suffix Array

## My implementation

const int N=2e5+5;
string s;
int n,p[N],pn[N],c0[N],c1[N],*c,*cn,cnt[N]{0};

void SA(string s){
//assumes #define int long long and #define mp make_pair
s+='$';//cyclic or not
n=s.length();
c=c0;cn=c1;
//length=1
for(int i=0;i<n;i++) cnt[s[i]]++;
for(int i=1;i<256;i++) cnt[i]+=cnt[i-1];//256: sigma size
for(int i=0;i<n;i++){
p[--cnt[s[i]]]=i;
}
int cl=0;
c[p[0]]=cl;
for(int i=1;i<n;i++){
if(s[p[i]]!=s[p[i-1]]) cl++;
c[p[i]]=cl;
}
for(int k=1;k<n;k*=2){
//sorting mp(c[i-k],c[i]), c[i] already sorted in p[]
for(int i=0;i<=max(256LL,cl);i++) cnt[i]=0;//256
for(int i=0;i<n;i++){
pn[i]=p[i]-k;
if(pn[i]<0) pn[i]+=n;
}
for(int i=0;i<n;i++){
cnt[c[pn[i]]]++;
}
for(int i=1;i<=cl;i++) cnt[i]+=cnt[i-1];
for(int i=n-1;i>=0;i--){
p[--cnt[c[pn[i]]]]=pn[i];
}
cl=0;
cn[p[0]]=cl;
for(int i=1;i<n;i++){
auto prev=mp(c[p[i-1]],c[(p[i-1]+k)%n]);
auto cur=mp(c[p[i]],c[(p[i]+k)%n]);
if(prev!=cur) cl++;
cn[p[i]]=cl;
}
swap(c,cn);
}
//making all rank different
//for(int i=0;i<n;i++) c[p[i]]=i;
}
//p: starting indices after sort
//c: rank of indices, may be same if '$' not added

int lcp[N][20],po[20];

void LCP(){
po[0]=1;
for(int i=1;i<20;i++) po[i]=po[i-1]*2;
int k=0;
for(int i=0;i<n;i++){
if(c[i]==n-1){
k=0;
continue;
}
int j=p[c[i]+1];
while(i+k<n&&j+k<n&&s[i+k]==s[j+k]) k++;//s: the global string that SA() was called on
lcp[c[i]][0]=k;
if(k) k--;
}
for(int j=1;j<20;j++){
for(int i=0;i<n-1;i++){
if(i+po[j-1]<n-1) lcp[i][j]=min(lcp[i][j-1],lcp[i+po[j-1]][j-1]);
}
}
}
//lcp[i][0]: longest common prefix of s.substr(p[i]), s.substr(p[i+1])
//lcp: a sparse table

int qry(int i,int j){
i=c[i],j=c[j];
if(i>j) swap(i,j);
int lg=__lg(j-i);
return min(lcp[i][lg],lcp[j-po[lg]][lg]);
}
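To check what `p` and `lcp[i][0]` are supposed to contain, a slow reference version can help. The sketch below sorts the suffixes by direct comparison and computes adjacent LCPs character by character; `naive_suffix_array` and `naive_lcp` are hypothetical names. It does not append '$', so compared with the implementation above its array lacks the leading entry for the '$' suffix.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>
using namespace std;

// Starting indices of all suffixes of s, sorted lexicographically (slow reference).
vector<int> naive_suffix_array(const string &s) {
    vector<int> p(s.size());
    iota(p.begin(), p.end(), 0);
    sort(p.begin(), p.end(), [&](int x, int y) {
        return s.compare(x, string::npos, s, y, string::npos) < 0;
    });
    return p;
}

// lcp[i] = longest common prefix of the suffixes starting at p[i] and p[i+1].
vector<int> naive_lcp(const string &s, const vector<int> &p) {
    vector<int> lcp(p.empty() ? 0 : p.size() - 1);
    int n = s.size();
    for (size_t i = 0; i + 1 < p.size(); i++) {
        int a = p[i], b = p[i + 1], k = 0;
        while (a + k < n && b + k < n && s[a + k] == s[b + k]) k++;
        lcp[i] = k;
    }
    return lcp;
}
```

For "banana" the sorted suffixes are a, ana, anana, banana, na, nana, which makes a convenient small case for comparing against the fast version.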

(I think someone said last year that Sam would become part of the 建中 (Jianguo High School) training material. Well, here it is.)

TIOJ 1927

ARC 151E

ABC 268 Ex

CF 1562 E

CF 1721 E

CF 985 F

CF 1366 G

CF 1363 F

CF 1313 E

# Main-Lorentz

## Problem

Finding repetitions - Algorithms for Competitive Programming (cp-algorithms.com)

Just learned this algorithm this Tuesday, very cool though.

By peter940324
