Fast and accurate short read alignment with BWT
X = abracadabra$
Sorted rotations of X, the suffix each one starts with, and the BWT (last character of each rotation):

 i   sorted rotation   suffix          BWT
 0   $abracadabra      $               a
 1   a$abracadabr      a$              r
 2   abra$abracad      abra$           d
 3   abracadabra$      abracadabra$    $
 4   acadabra$abr      acadabra$       r
 5   adabra$abrac      adabra$         c
 6   bra$abracada      bra$            a
 7   bracadabra$a      bracadabra$     a
 8   cadabra$abra      cadabra$        a
 9   dabra$abraca      dabra$          a
10   ra$abracadab      ra$             b
11   racadabra$ab      racadabra$      b

BWT = ard$rcaaaabb
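The construction above can be sketched in a few lines of Python. This is a minimal, naive illustration (sorting all rotations), not how a production aligner builds its index; the function name `bwt_by_rotations` is hypothetical, and the sketch assumes `$` sorts before every other character, which holds for ASCII letters.

```python
def bwt_by_rotations(x: str):
    """Return (sorted rotations, BWT string) for a text that ends in '$'."""
    assert x.endswith("$"), "sketch assumes a '$' sentinel that sorts first"
    n = len(x)
    # All n cyclic rotations of x, in lexicographic order.
    rotations = sorted(x[i:] + x[:i] for i in range(n))
    # The BWT is the last character of each sorted rotation.
    bwt = "".join(rot[-1] for rot in rotations)
    return rotations, bwt


rotations, bwt = bwt_by_rotations("abracadabra$")
for i, rot in enumerate(rotations):
    print(f"{i:2d}  {rot}  {rot[-1]}")
print(bwt)
```

Sorting full rotations takes quadratic memory, so this is only suitable for toy inputs like the slide example.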
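Exact pattern matching in O(m), m = |W|:
= find the interval of all suffixes for which pattern W is a prefix
X = abracadabra$, |X| = n = 12
C(a):
number of characters in X[0,n-2] that are lexicographically smaller than a
O(a, i):
number of occurrences of a in BWT[0,i]
Interval of suffixes starting with W_i... (W is processed from its last character to its first):
start = C(W_i) + O(W_i, prev start - 1) + 1
end = C(W_i) + O(W_i, prev end)
O(W_i, prev end): suffixes up to prev end which are preceded by W_i
O(W_i, prev start - 1): suffixes before prev start which are preceded by W_i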
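As a rough illustration of the two definitions, the tables C and O for the example text can be computed as below. The helper names `build_c` and `build_o` are hypothetical, and storing O as a full table per character is an assumption made for clarity; real indexes typically sample it to save memory.

```python
from collections import Counter


def build_c(x: str) -> dict:
    """C[a] = number of characters in x[0..n-2] that are smaller than a."""
    counts = Counter(x[:-1])  # drop the final '$'
    c, total = {}, 0
    for ch in sorted(counts):
        c[ch] = total
        total += counts[ch]
    return c


def build_o(bwt: str) -> dict:
    """O[a][i] = number of occurrences of a in bwt[0..i], inclusive."""
    o = {ch: [] for ch in set(bwt) if ch != "$"}
    running = dict.fromkeys(o, 0)
    for ch in bwt:
        if ch in running:
            running[ch] += 1
        for a in o:
            o[a].append(running[a])
    return o


x = "abracadabra$"
bwt = "ard$rcaaaabb"  # BWT of x, taken from the slide
C = build_c(x)
O = build_o(bwt)
print(C)
print(O["a"][11], O["r"][5])
```

With both tables precomputed, each lookup is constant time, which is what makes the O(m) bound for exact matching possible.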
W = ra
Step 1: W_i = a (last character of W), previous interval = [0,11] (all suffixes)
C(a) = 0
O(a,-1) = 0 (nothing lies before position 0), so start = C(a) + O(a,-1) + 1 = 1
O(a,11) = 5, so end = C(a) + O(a,11) = 5
interval = [1,5] (the five suffixes starting with a)
Step 2: W_i = r, previous interval = [1,5]
C(r) = 9
O(r,0) = 0, so start = C(r) + O(r,0) + 1 = 10
O(r,5) = 2, so end = C(r) + O(r,5) = 11
interval = [10,11] (suffixes ra$ and racadabra$)
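The two steps for W = ra can be reproduced with a small backward-search sketch. This is a simplified illustration under stated assumptions, not the aligner's actual code: the function name `backward_search` is hypothetical, the suffix array is built naively, the initial interval is taken as [0, n-1], and `occ` recounts the BWT prefix on every call, so this version does not achieve O(m); a real index would precompute O(a, i).

```python
def backward_search(x: str, w: str):
    """Return the suffix-array interval after each backward step,
    or None if w does not occur in x. x must end with '$'."""
    n = len(x)
    sa = sorted(range(n), key=lambda i: x[i:])  # naive suffix array
    bwt = [x[i - 1] for i in sa]                # x[-1] == '$' when i == 0
    alphabet = sorted(set(x[:-1]))
    # C(a): characters in x[0..n-2] smaller than a
    c = {a: sum(ch < a for ch in x[:-1]) for a in alphabet}

    def occ(a: str, i: int) -> int:
        # O(a, i): occurrences of a in bwt[0..i]; zero when i == -1
        return bwt[: i + 1].count(a)

    start, end = 0, n - 1  # interval of the empty pattern: all suffixes
    steps = []
    for a in reversed(w):  # process W from last character to first
        if a not in c:
            return None
        start = c[a] + occ(a, start - 1) + 1
        end = c[a] + occ(a, end)
        if start > end:
            return None
        steps.append((start, end))
    return steps


print(backward_search("abracadabra$", "ra"))
```

For the slide example this yields the interval after each step, so the final pair is the range of sorted suffixes that have W as a prefix.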