For a hash function \(f\),
\(x=y \Rightarrow f(x)=f(y) \)
\(x \neq y \Rightarrow f(x) \neq f(y) \) (very high prob.)
For two strings \(s,t\), to test whether \(s\) and \(t\) are the same, we can hash both and check whether \(f(s)=f(t)\).
Given a string \(s_0...s_{n-1}\), define \(a[i]=s_i \cdot p^i\); the hash of the string is \(\sum_i a[i] \bmod M\) for a fixed base \(p\) and modulus \(M\).
Given a string \(s\), answer \(q\) queries:
given a string \(t\), print the number of occurrences of \(t\) in \(s\)
\(|s|, |t| \leq 10000\)
\(q \leq 50000 \)
\( \sum |t| \leq 350000\)
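Take prefix sums of the hash. The hash of any substring of \(s\) is then available in constant time, so for each query we can compare the substring of length \(|t|\) starting at every position of \(s\) against \(t\) in \( \Omicron (|s|) \) total.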
#include <bits/stdc++.h>
#define IO ios::sync_with_stdio(0);cin.tie(0);cout.tie(0);
#define int long long
using namespace std;
const int p=127,M=998244353;
int pref[10005],po[10005];
signed main(){//"int" is a macro for long long, so main is declared signed
IO
po[0]=1;
for(int i=1;i<10005;i++){po[i]=po[i-1]*p;po[i]%=M;}
int tc;cin >> tc;
while(tc--){
string T;cin >> T;
pref[0]=T[0];
for(int i=1;i<T.length();i++) {pref[i]=pref[i-1]*p+T[i];pref[i]%=M;}
int q;cin >> q;
while(q--){
int has=0,cnt=0;
string P;cin >> P;
for(int i=0;i<P.length();i++){has=has*p+P[i];has%=M;}
for(int i=P.length()-1;i<T.length();i++){
if(i==P.length()-1&&has==pref[i]) cnt++;
else if(((pref[i]-(pref[i-P.length()]*po[P.length()])%M)+M)%M==has%M) cnt++;
}
cout << cnt << '\n';
}
}
return 0;
}
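The program above evaluates the hash Horner-style. As a rough sketch, the earlier definition \(a[i]=s_i \cdot p^i\) combined with prefix sums could look like the following. The names `PrefixHash`, `shifted` and `same_substring` are made up for illustration, and `P`, `M` simply reuse the constants from the program; equal hashes only mean the substrings are equal with high probability.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: prefix sums of a[i] = s[i] * P^i (mod M).
struct PrefixHash {
    static constexpr long long P = 127, M = 998244353; // constants taken from the program above
    std::vector<long long> pw, sum; // pw[i] = P^i, sum[i] = a[0] + ... + a[i-1]

    explicit PrefixHash(const std::string &s)
        : pw(s.size() + 1, 1), sum(s.size() + 1, 0) {
        for (std::size_t i = 0; i < s.size(); i++) {
            pw[i + 1] = pw[i] * P % M;
            sum[i + 1] = (sum[i] + (unsigned char)s[i] * pw[i]) % M;
        }
    }

    // Hash of s[l..r), still multiplied by P^l.
    long long shifted(std::size_t l, std::size_t r) const {
        return (sum[r] - sum[l] + M) % M;
    }
};

// True if s[i..i+len) and s[j..j+len) have equal hashes.
bool same_substring(const PrefixHash &h, std::size_t i, std::size_t j, std::size_t len) {
    // Multiply each side by the other's offset power so both carry P^(i+j).
    return h.shifted(i, i + len) * h.pw[j] % PrefixHash::M ==
           h.shifted(j, j + len) * h.pw[i] % PrefixHash::M;
}
```

Multiplying by the other side's power avoids a modular inverse; the cost is one extra multiplication per comparison.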
Given a string \(s_0...s_{n-1}\), define an array \(z\):
\(z[i]=\) the biggest \(k\) that satisfies
\(s_0...s_{k-1}=s_is_{i+1}...s_{i+k-1}\)
(\(k=0\) if \(s_0 \neq s_i\))
Say we know \(z[0] \sim z[i-1]\).
First, we try to find the lower bound of \(z[i]\)
let \(l= \argmax_{1 \leq j \leq i-1}j+z[j]-1, r=l+z[l]-1\)
\(\Rightarrow s_0...s_{r-l}=s_l...s_r\).
if \(i \leq r\), we know that \(s_{i-l}...s_{r-l}=s_i...s_r\),
\( \Rightarrow z[i]\) is at least \(\min(z[i-l],r-i+1)\)
Then, we can repeatedly check if \(s[z[i]]=s[i+z[i]]\),
and update \(z[i]\).
Finally, we can update \(l,r\) if \(i+z[i]-1 > r\).
Notice that \(r\) never decreases, and each successful comparison in the loop costs \( \Omicron (1) \) and pushes \(i+z[i]-1\) one step past the current \(r\). Since \(r \leq n-1\), the algorithm is \( \Omicron (n) \) amortized.
vector<int> z_algo(string &s){
int n=s.size();
vector<int> z(n,0);
for(int i=1,l=0,r=0;i<n;i++){
if(i<=r) z[i]=min(z[i-l],r-i+1);
while(i+z[i]<n&&s[z[i]]==s[i+z[i]]) z[i]++;
if(i+z[i]-1>r) l=i,r=i+z[i]-1;
}
return z;
}
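Given a string \(s\), find the number of strings \(t\) such that
\(t\) is both a prefix and a suffix of \(s\).
Answer: the number of indices \(i \geq 1\) with \(i+z[i]=n\), i.e. the match starting at \(i\) reaches the end of \(s\). Add one if \(t=s\) itself should be counted.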
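A suffix starting at position \(i\) is also a prefix exactly when the match at \(i\) runs to the end of the string, that is \(i+z[i]=n\). A minimal sketch of the count, assuming only proper prefixes are wanted (the whole string is not counted); `z_function` repeats the Z code so the block compiles on its own, and `count_borders` is a hypothetical name:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Same Z-function as above, repeated so the sketch is self-contained.
std::vector<int> z_function(const std::string &s) {
    int n = s.size();
    std::vector<int> z(n, 0);
    for (int i = 1, l = 0, r = 0; i < n; i++) {
        if (i <= r) z[i] = std::min(z[i - l], r - i + 1);
        while (i + z[i] < n && s[z[i]] == s[i + z[i]]) z[i]++;
        if (i + z[i] - 1 > r) l = i, r = i + z[i] - 1;
    }
    return z;
}

// Counts the non-empty proper prefixes of s that are also suffixes of s.
int count_borders(const std::string &s) {
    int n = s.size(), cnt = 0;
    std::vector<int> z = z_function(s);
    for (int i = 1; i < n; i++)
        if (i + z[i] == n) cnt++; // the match starting at i reaches the end of s
    return cnt;
}
```

For example, `count_borders("abacaba")` counts the borders "a" and "aba".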
Given a string \(s_0...s_{n-1}\), define failure function \(p\):
\(p[i]=\) the biggest \(k<i+1\) that satisfies
\(s_0...s_{k-1}=s_{i-k+1}...s_i\)
Say we know \(p[0] \sim p[i-1]\).
Let \(j=p[i-1]\). If \(s_j=s_i\), then \(p[i]=j+1\).
Otherwise, while \(j \neq 0\), set \(j=p[j-1]\) and check the condition again; if no candidate matches, \(p[i]=0\).
Notice that \(j\) increases by at most one per position \(i\), so it increases at most \(n\) times in total and can decrease no more often than that. The algorithm is therefore \( \Omicron(n) \) amortized.
vector<int> kmp(string &s){
int n=s.size();
vector<int> pi(n,0);
for(int i=1;i<n;i++){
int j=pi[i-1];
while(j>0&&s[i]!=s[j]) j=pi[j-1];
if(s[i]==s[j]) j++;
pi[i]=j;
}
return pi;
}
Given a string \(s\), answer \(q\) queries:
given a string \(t\), print the number of occurrences of \(t\) in \(s\)
\(|s|, |t| \leq 10000\)
\(q \leq 50000 \)
\( \sum |t| \leq 350000\)
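For every \(t\), calculate its failure function.
Maintain \(r\), the number of characters of \(t\) currently matched, so that \(t_0...t_{r-1}\) matches the part of \(s\) ending at \(s_i\).
If \(s_{i+1} \neq t_r\), set \(r=p[r-1]\) (while \(r>0\)) and keep matching; whenever \(r=|t|\), we have found an occurrence.
The name "failure function" comes from this: when a match fails, we can jump straight to the longest prefix of \(t\) that can still match.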
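The matching loop for one query can be sketched as below. This is an illustrative version rather than a reference solution: `failure` and `count_occurrences` are hypothetical names, `r` is the number of characters of `t` currently matched, and overlapping occurrences are counted.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Failure function of t: pi[i] = length of the longest proper border of t[0..i].
std::vector<int> failure(const std::string &t) {
    int n = t.size();
    std::vector<int> pi(n, 0);
    for (int i = 1; i < n; i++) {
        int j = pi[i - 1];
        while (j > 0 && t[i] != t[j]) j = pi[j - 1];
        if (t[i] == t[j]) j++;
        pi[i] = j;
    }
    return pi;
}

// Number of (possibly overlapping) occurrences of t in s.
int count_occurrences(const std::string &s, const std::string &t) {
    if (t.empty() || t.size() > s.size()) return 0;
    std::vector<int> pi = failure(t);
    int r = 0, cnt = 0; // r = number of characters of t matched so far
    for (char ch : s) {
        while (r > 0 && ch != t[r]) r = pi[r - 1]; // mismatch: fall back to a shorter border
        if (ch == t[r]) r++;
        if (r == (int)t.size()) { // full match ending at this character
            cnt++;
            r = pi[r - 1];
        }
    }
    return cnt;
}
```

Each query costs \(O(|s|+|t|)\), so the total is \(O(q|s| + \sum |t|)\).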
Given a string \(s\), find the longest palindromic substring.
\(|s| \leq 10^6 \)
Palindromes include two kinds:
1. odd length, center is a position in the string
2. even length, center is a position between two characters
Hard to deal with...
Insert '*' between every two characters and at the front and the end of the string: every palindrome now has odd length (a palindrome of length len becomes one of length 2*len+1)!
for a string \(s\) (after inserting '*'), define an array \(p\):
\(p[i]=\) the biggest \(k\) so that \(s_{i-k+1}...s_i=s_i...s_{i+k-1}\)
Then, how can we construct the array?
Say we have \(p[0] \sim p[i-1] \).
Let \(x= \argmax_{0 \leq j \leq i-1} j+p[j]-1\),
since \(s_{x-p[x]+1}...s_x=s_x...s_{x+p[x]-1}\),
\( \Rightarrow p[i] \geq \min(p[2x-i], p[x]-(i-x))\) whenever \(i \leq x+p[x]-1\)
Same idea with Z-algorithm!
vector<int> manacher(string &ss){
string s;
s.resize(ss.size()*2+1,'.');
for(int i=0;i<ss.size();i++){
s[i*2+1]=ss[i];
}
vector<int> p(s.size(),1);
for(int i=0,l=0,r=0;i<s.size();i++){
p[i]=(i<r)?min(p[l*2-i],r-i):1;//read the mirror p[l*2-i] only when it lies inside the string
while(0<=i-p[i]&&i+p[i]<s.size()&&s[i-p[i]]==s[i+p[i]]){
l=i,r=i+p[i],p[i]++;
}
}
return p;
}
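To answer the original question, the radius array still has to be turned back into a substring of the input. A self-contained sketch under the same conventions (separator '.', \(p[i]\) counts the center itself); `longest_palindrome` is a hypothetical name, and on ties the leftmost longest palindrome is returned:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Longest palindromic substring of t (leftmost one on ties).
std::string longest_palindrome(const std::string &t) {
    // Interleave a separator so every palindrome has odd length.
    std::string s(t.size() * 2 + 1, '.');
    for (std::size_t i = 0; i < t.size(); i++) s[2 * i + 1] = t[i];
    int n = s.size();
    std::vector<int> p(n, 1);
    int best = 0; // index in s of the best center found so far
    for (int i = 0, l = 0, r = 0; i < n; i++) {
        // s[2l-r..r] is the known palindrome that reaches furthest to the right.
        if (i < r) p[i] = std::min(p[2 * l - i], r - i + 1);
        while (i - p[i] >= 0 && i + p[i] < n && s[i - p[i]] == s[i + p[i]]) p[i]++;
        if (i + p[i] - 1 > r) l = i, r = i + p[i] - 1;
        if (p[i] > p[best]) best = i;
    }
    // A radius p in s covers p-1 characters of t, starting at (best-p+1)/2.
    return t.substr((best - p[best] + 1) / 2, p[best] - 1);
}
```

A maximal palindrome in the padded string always begins and ends on a separator, which is why the start index divides evenly by two.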
//not compiled; N (the maximum number of trie nodes) must be defined first
int ch[N][26]{0},cnt[N]{0},ptr=0;
void insert(string &s){
int cur=0;
for(int i=0;i<s.length();i++){
if(!ch[cur][s[i]-'a']) ch[cur][s[i]-'a']=++ptr;
cur=ch[cur][s[i]-'a'];
}
cnt[cur]++;
}
so ez la
2021 Taipei City Contest, problem B
I forgot the problem statement lol
Given an array \(a\), find the pair \((i,j)\) where \(a_i \oplus a_j\) is the biggest among all pairs.
\(n \leq 10^5, a_i \leq 10^9 \)
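One common way to attack this with a trie is to store every number bit by bit, highest bit first, and for each \(a_i\) walk down greedily preferring the opposite bit. The sketch below assumes non-negative values below \(2^{30}\) and returns only the maximum XOR value, not the pair itself; `max_xor_pair` is a hypothetical name.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <vector>

// Maximum a[i] xor a[j] over all pairs i < j (0 if there are fewer than two elements).
int max_xor_pair(const std::vector<int> &a) {
    using Node = std::array<int, 2>; // child along bit 0 / bit 1, 0 = absent
    std::vector<Node> trie(1, Node{}); // node 0 is the root

    auto insert = [&](int x) {
        int cur = 0;
        for (int bit = 29; bit >= 0; bit--) { // a_i <= 1e9 < 2^30
            int b = (x >> bit) & 1;
            if (!trie[cur][b]) {
                trie[cur][b] = (int)trie.size();
                trie.push_back(Node{});
            }
            cur = trie[cur][b];
        }
    };

    // Best xor of x with any number already inserted (the trie must be non-empty).
    auto best_with = [&](int x) {
        int cur = 0, res = 0;
        for (int bit = 29; bit >= 0; bit--) {
            int b = (x >> bit) & 1;
            if (trie[cur][b ^ 1]) { // the opposite bit exists: this bit of the xor is 1
                res |= 1 << bit;
                cur = trie[cur][b ^ 1];
            } else {
                cur = trie[cur][b];
            }
        }
        return res;
    };

    int best = 0;
    for (std::size_t i = 0; i < a.size(); i++) {
        if (i > 0) best = std::max(best, best_with(a[i]));
        insert(a[i]);
    }
    return best;
}
```

This takes about 30 steps per number, so roughly \(30n\) operations overall.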
(The stronger version of TIOJ 1306)
\(|s| \leq 10^5\)
\( \sum |t| \leq 5*10^5\)
Couldn't AC with hash, Z, or kmp...
Trie with fail link!
A fail link from \(u\) to \(v\): \(v\) represents the longest suffix of \(u\) which exists in the trie.
Remember what failure function is?
Tree edge: just a trie
Fail link: a simple bfs would work!
const int N=5e5+5;
int ch[N][26]{0},fail[N]{0},ptr=0;
void insert(string &s,int ind){
int cur=0;
for(int i=0;i<s.size();i++){
if(!ch[cur][s[i]-'a']) ch[cur][s[i]-'a']=++ptr;
cur=ch[cur][s[i]-'a'];
}
}
void build(){
queue<int> q;
for(int i=0;i<26;i++) if(ch[0][i]) q.push(ch[0][i]);
while(!q.empty()){
int cur=q.front();q.pop();
for(int i=0;i<26;i++){
if(!ch[cur][i]) ch[cur][i]=ch[fail[cur]][i];
else{
q.push(ch[cur][i]);
int tem=fail[cur];
while(tem&&!ch[tem][i]) tem=fail[tem];
fail[ch[cur][i]]=ch[tem][i];
}
}
}
}
Matching: just walk on tree edge, and if there isn't one, take the fail link.
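Keep a visit counter on each vertex while matching. Afterwards, a dfs over the fail links adds each vertex's counter to the vertex its fail link points to, so the vertex where a pattern ends holds that pattern's total number of occurrences.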
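Putting the pieces together, counting the occurrences of every pattern might look like the sketch below. It assumes non-empty patterns and lowercase letters only, and `count_all` is a hypothetical wrapper. Instead of a recursive dfs it walks the BFS order backwards, which visits deeper vertices before the vertices their fail links point to.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// For each pattern, the number of its occurrences in text.
std::vector<long long> count_all(const std::string &text, const std::vector<std::string> &pats) {
    std::vector<std::vector<int>> ch(1, std::vector<int>(26, 0)); // node 0 is the root
    std::vector<int> fail(1, 0), end_node;
    for (const std::string &p : pats) { // build the trie
        int cur = 0;
        for (char c : p) {
            int x = c - 'a';
            if (!ch[cur][x]) {
                ch[cur][x] = (int)ch.size();
                ch.emplace_back(26, 0);
                fail.push_back(0);
            }
            cur = ch[cur][x];
        }
        end_node.push_back(cur);
    }
    // BFS: set fail links and fill in the missing transitions.
    std::vector<int> order; // vertices in BFS order, i.e. by non-decreasing depth
    std::queue<int> q;
    for (int x = 0; x < 26; x++) if (ch[0][x]) q.push(ch[0][x]);
    while (!q.empty()) {
        int cur = q.front(); q.pop();
        order.push_back(cur);
        for (int x = 0; x < 26; x++) {
            int nxt = ch[cur][x];
            if (!nxt) ch[cur][x] = ch[fail[cur]][x];
            else { fail[nxt] = ch[fail[cur]][x]; q.push(nxt); }
        }
    }
    // Walk the text and count how often each vertex is visited.
    std::vector<long long> visits(ch.size(), 0);
    int cur = 0;
    for (char c : text) { cur = ch[cur][c - 'a']; visits[cur]++; }
    // Push the counts along fail links, deepest vertices first.
    for (int i = (int)order.size() - 1; i >= 0; i--) visits[fail[order[i]]] += visits[order[i]];
    std::vector<long long> res;
    for (int v : end_node) res.push_back(visits[v]);
    return res;
}
```

The whole thing runs in \(O(|s| + 26 \sum |t|)\) time, independent of the number of patterns.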
const int N=2e5+5;
string s;
int n,p[N],pn[N],c0[N],c1[N],*c,*cn,cnt[N]{0};
void SA(string s){
s+='$';//cyclic or not
n=s.length();
c=c0;cn=c1;
//length=1
for(int i=0;i<n;i++) cnt[s[i]]++;
for(int i=1;i<256;i++) cnt[i]+=cnt[i-1];//256: sigma size; start at 1 so cnt[-1] is never read
for(int i=0;i<n;i++){
p[--cnt[s[i]]]=i;
}
int cl=0;
c[p[0]]=cl;
for(int i=1;i<n;i++){
if(s[p[i]]!=s[p[i-1]]) cl++;
c[p[i]]=cl;
}
for(int k=1;k<n;k*=2){
//sorting mp(c[i-k],c[i]), c[i] already sorted in p[]
for(int i=0;i<=max(256LL,cl);i++) cnt[i]=0;//256
for(int i=0;i<n;i++){
pn[i]=p[i]-k;
if(pn[i]<0) pn[i]+=n;
}
for(int i=0;i<n;i++){
cnt[c[pn[i]]]++;
}
for(int i=1;i<=cl;i++) cnt[i]+=cnt[i-1];
for(int i=n-1;i>=0;i--){
p[--cnt[c[pn[i]]]]=pn[i];
}
cl=0;
cn[p[0]]=cl;
for(int i=1;i<n;i++){
auto prev=make_pair(c[p[i-1]],c[(p[i-1]+k)%n]);
auto cur=make_pair(c[p[i]],c[(p[i]+k)%n]);
if(prev!=cur) cl++;
cn[p[i]]=cl;
}
swap(c,cn);
}
//making all rank different
//for(int i=0;i<n;i++) c[p[i]]=i;
}
//p: starting indices after sort
//c: rank of indices, may be same if '$' not added
int lcp[N][20],po[20];
void LCP(){
po[0]=1;
for(int i=1;i<20;i++) po[i]=po[i-1]*2;
int k=0;
for(int i=0;i<n;i++){
if(c[i]==n-1){
k=0;
continue;
}
int j=p[c[i]+1];
while(i+k<n&&j+k<n&&s[i+k]==s[j+k]) k++;//s: the string that was passed to SA()
lcp[c[i]][0]=k;
if(k) k--;
}
for(int j=1;j<20;j++){
for(int i=0;i<n-1;i++){
if(i+po[j-1]<n-1) lcp[i][j]=min(lcp[i][j-1],lcp[i+po[j-1]][j-1]);
}
}
}
//lcp[i][0]: longest common prefix of s.substr(p[i]), s.substr(p[i+1])
//lcp: a sparse table
int qry(int i,int j){
i=c[i],j=c[j];
if(i>j) swap(i,j);
int lg=__lg(j-i);
return min(lcp[i][lg],lcp[j-po[lg]][lg]);
}
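When debugging the routines above, a brute-force reference for small inputs can help. The sketch below is only such a checker, not the doubling algorithm: it sorts the suffixes by direct comparison and computes adjacent LCPs character by character. `naive_suffix_array` and `naive_lcp` are hypothetical names, and no '$' is appended here, so the `p` built by `SA` has one extra leading entry for the '$' suffix.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

// Starting indices of the suffixes of s in sorted order, O(n^2 log n).
std::vector<int> naive_suffix_array(const std::string &s) {
    std::vector<int> p(s.size());
    std::iota(p.begin(), p.end(), 0);
    std::sort(p.begin(), p.end(), [&](int a, int b) {
        // Compare the suffix starting at a with the suffix starting at b.
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    return p;
}

// lcp[i] = longest common prefix of the suffixes starting at p[i] and p[i+1].
std::vector<int> naive_lcp(const std::string &s, const std::vector<int> &p) {
    std::vector<int> lcp;
    int n = s.size();
    for (std::size_t i = 0; i + 1 < p.size(); i++) {
        int a = p[i], b = p[i + 1], k = 0;
        while (a + k < n && b + k < n && s[a + k] == s[b + k]) k++;
        lcp.push_back(k);
    }
    return lcp;
}
```

For "banana" the sorted suffixes are "a", "ana", "anana", "banana", "na", "nana".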
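A side reminder to everyone: in the Taipei City Contest, be sure to grab all the partial points you can!
(Last year someone apparently said Sam would become teaching material for the Jianguo High School training. Well, here it is.)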
TIOJ 1927
ARC 151E
ABC 268 Ex
CF 1562 E
CF 1721 E
CF 985 F
CF 1366 G
CF 1363 F
CF 1313 E
Finding repetitions - Algorithms for Competitive Programming (cp-algorithms.com)
Just learned this algorithm this Tuesday, very cool though.