Fast and accurate short read alignment with BWT
X = abracadabra$
Sorted rotations of X, the suffix each rotation starts with, and the BWT (last character of each rotation):
i   sorted rotation   suffix         BWT
0   $abracadabra      $              a
1   a$abracadabr      a$             r
2   abra$abracad      abra$          d
3   abracadabra$      abracadabra$   $
4   acadabra$abr      acadabra$      r
5   adabra$abrac      adabra$        c
6   bra$abracada      bra$           a
7   bracadabra$a      bracadabra$    a
8   cadabra$abra      cadabra$       a
9   dabra$abraca      dabra$         a
10  ra$abracadab      ra$            b
11  racadabra$ab      racadabra$     b
BWT = ard$rcaaaabb
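The construction above can be sketched in a few lines of Python. This is a minimal illustration that sorts all rotations explicitly, which is fine for a slide-sized example but not how a real aligner builds its index; the function name `bwt_from_rotations` is made up for this sketch, and it assumes the text ends in a `$` sentinel that sorts before every other character.

```python
def bwt_from_rotations(text):
    """Return (sorted rotations, BWT string) for a '$'-terminated text."""
    # Assumption of this sketch: '$' terminates the text and sorts first.
    assert text.endswith("$"), "expected a '$' sentinel at the end"
    n = len(text)
    # All cyclic rotations of the text, in lexicographic order.
    rotations = sorted(text[i:] + text[:i] for i in range(n))
    # The BWT is the last column of the sorted rotation matrix.
    bwt = "".join(rot[-1] for rot in rotations)
    return rotations, bwt


rotations, bwt = bwt_from_rotations("abracadabra$")
for i, rot in enumerate(rotations):
    print(i, rot, rot[-1])
print("BWT =", bwt)
```

Sorting whole rotations costs far more memory and time than a suffix-array construction would, so this only serves to reproduce the table for X = abracadabra$.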
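Exact pattern matching in O(m), with m = |W|:
= find the interval of all suffixes for which pattern W is a prefix
C(a):
# of occurrences in X[0,n-2] of characters lexicographically smaller than a
O(a, i):
# of occurrences of a in BWT[0,i]
X = abracadabra$, |X|=n=12
W is processed right to left; extending the match by \(W_i\) maps the previous interval [start, end] to:
\(\text{start}' = C(W_i) + O(W_i, \text{start}-1) + 1\)
\(\text{end}' = C(W_i) + O(W_i, \text{end})\)
\(C(W_i) + 1\): first of the suffixes with \(W_i...\)
\(O(W_i, \text{end})\): suffixes up to the previous end which are preceded by \(W_i\)
\(O(W_i, \text{start}-1)\): suffixes before the previous start which are preceded by \(W_i\)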
W = ra
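The search defined above can be sketched as follows. This is an illustrative Python version, not the data layout of any particular aligner: the helper names `build_fm_tables` and `backward_search` are hypothetical, O is stored as a full table (real indexes sample it), and the sketch assumes the search starts from the interval of all suffixes, [0, n-1], with the count over an empty BWT prefix taken as 0.

```python
def build_fm_tables(text):
    """Build BWT, C and O for a '$'-terminated text (naive, for small examples)."""
    n = len(text)
    # Suffix array by plain sorting; adequate only for toy inputs.
    sa = sorted(range(n), key=lambda i: text[i:])
    # BWT[k] is the character preceding suffix sa[k]; text[-1] is '$' when sa[k] == 0.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text) - {"$"})
    # C[a]: number of characters in X[0, n-2] smaller than a.
    C = {a: sum(1 for ch in text[:-1] if ch < a) for a in alphabet}
    # O[a][i]: number of occurrences of a in BWT[0, i] (inclusive).
    O = {}
    for a in alphabet:
        count, column = 0, []
        for ch in bwt:
            count += ch == a
            column.append(count)
        O[a] = column
    return bwt, C, O


def backward_search(pattern, C, O, n):
    """Return the suffix interval (start, end) of pattern, or None if absent."""
    start, end = 0, n - 1  # interval of the empty pattern: all suffixes
    for a in reversed(pattern):
        if a not in C:
            return None
        # Occurrences of a strictly before the previous start (0 for an empty prefix).
        occ_before = O[a][start - 1] if start > 0 else 0
        start = C[a] + occ_before + 1
        end = C[a] + O[a][end]
        if start > end:
            return None
    return start, end


X = "abracadabra$"
bwt, C, O = build_fm_tables(X)
print(backward_search("ra", C, O, len(X)))
```

Each pattern character costs two table lookups, which is where the O(m) bound comes from once C and O are available.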
Step 1: \(W_i = a\), previous interval \([0,11]\) (all suffixes)
start: \(C(a)=0\), \(O(a,-1)=0\) (empty BWT prefix), so \(0+0+1=1\)
end: \(C(a)=0\), \(O(a,11)=5\), so \(0+5=5\)
interval \(=[1,5]\)
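Step 2: \(W_i = r\), previous interval \([1,5]\)
start: \(C(r)=9\), \(O(r,0)=0\), so \(9+0+1=10\)
end: \(C(r)=9\), \(O(r,5)=2\), so \(9+2=11\)
interval \(=[10,11]\) (suffixes ra$ and racadabra$)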
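As a cross-check of the interval above, the definition "all suffixes for which W is a prefix" can be evaluated directly on the sorted suffix list. This brute-force sketch is only meant to confirm the worked numbers; the function name `naive_interval` is hypothetical, and the scan takes time proportional to the text rather than to the pattern.

```python
def naive_interval(text, pattern):
    """Return (first, last) index of sorted suffixes that start with pattern."""
    suffixes = sorted(text[i:] for i in range(len(text)))
    # Matching suffixes are contiguous because the list is sorted.
    hits = [k for k, s in enumerate(suffixes) if s.startswith(pattern)]
    return (hits[0], hits[-1]) if hits else None


X = "abracadabra$"
print(naive_interval(X, "a"))   # interval after step 1
print(naive_interval(X, "ra"))  # interval after step 2
```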