Journal Club:
Fast and accurate short read alignment with BWT





SA and BWT
X = abracadabra$

i    sorted rotation   suffix          BWT[i]
0    $abracadabra      $               a
1    a$abracadabr      a$              r
2    abra$abracad      abra$           d
3    abracadabra$      abracadabra$    $
4    acadabra$abr      acadabra$       r
5    adabra$abrac      adabra$         c
6    bra$abracada      bra$            a
7    bracadabra$a      bracadabra$     a
8    cadabra$abra      cadabra$        a
9    dabra$abraca      dabra$          a
10   ra$abracadab      ra$             b
11   racadabra$ab      racadabra$      b

BWT = ard$rcaaaabb (last column of the sorted rotations)
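The construction on this slide can be reproduced with a few lines of Python. This is only a naive sketch for small inputs (it sorts all rotations explicitly, which is not how a real aligner builds its index); the function name `bwt_via_rotations` is made up for illustration.

```python
def bwt_via_rotations(text):
    """Return (sorted rotations, sorted suffixes, BWT) for a text ending in '$'.

    Naive approach: materialise every rotation and sort them.
    Assumes '$' occurs exactly once and sorts before all other characters
    (true for ASCII letters in Python's default string ordering).
    """
    n = len(text)
    rotations = sorted(text[i:] + text[:i] for i in range(n))
    # Cutting each rotation after its '$' yields the sorted suffixes.
    suffixes = [rot[: rot.index("$") + 1] for rot in rotations]
    # The BWT is the last column of the sorted rotation matrix.
    bwt = "".join(rot[-1] for rot in rotations)
    return rotations, suffixes, bwt


rotations, suffixes, bwt = bwt_via_rotations("abracadabra$")
for i, (rot, suf, ch) in enumerate(zip(rotations, suffixes, bwt)):
    print(f"{i:2d}  {rot}  {suf:<13} {ch}")
print("BWT =", bwt)
```

For X = abracadabra$ this prints the twelve rows of the slide and BWT = ard$rcaaaabb.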
Backward search
Exact pattern matching in O(m), m = |W|:
= find the interval of all suffixes for which pattern W is a prefix
- init: I = [s, e] = [1, n−1]
- read pattern from right to left
- update:
I = [C(W_i) + O(W_i, s−1) + 1, C(W_i) + O(W_i, e)]
  - C(W_i): offset to the block of suffixes starting with W_i
  - O(W_i, s−1): suffixes before the previous start that are preceded by W_i
  - O(W_i, e): suffixes up to the previous end that are preceded by W_i
C(a):
# of characters in X[0,n-2] that are lexicographically smaller than a
O(a, i):
# of occurrences of a in BWT[0,i]
X = abracadabra$, |X|=n=12
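The update rule above can be sketched in Python as follows. The names `build_tables` and `backward_search` are hypothetical, and the tables are plain dicts and lists rather than the compressed structures a real index would use. One deliberate assumption differs from the slide: the search starts from I = [0, n−1] with O(a, −1) = 0 instead of [1, n−1], because in this example BWT[0] = a, so the inclusive count O(a, 0) is already 1 and the start of the interval for "a" would come out one too high.

```python
from collections import Counter


def build_tables(text, bwt):
    """Build C and O for a text ending in '$' and its BWT."""
    # C[a]: number of characters in text[0, n-2] (sentinel excluded) smaller than a.
    counts = Counter(text[:-1])
    C, total = {}, 0
    for ch in sorted(counts):
        C[ch] = total
        total += counts[ch]
    # occ[a][i + 1] = O(a, i) = occurrences of a in bwt[0..i]; occ[a][0] = O(a, -1) = 0.
    occ = {ch: [0] for ch in counts}
    for b in bwt:
        for ch in occ:
            occ[ch].append(occ[ch][-1] + (b == ch))
    return C, occ


def backward_search(pattern, C, occ, n):
    """Return the interval (s, e) of suffixes prefixed by pattern, or None."""
    s, e = 0, n - 1  # assumption: empty pattern matches all rows, O(a, -1) = 0
    for ch in reversed(pattern):  # read pattern from right to left
        if ch not in C:
            return None
        s = C[ch] + occ[ch][s] + 1  # C(ch) + O(ch, s-1) + 1
        e = C[ch] + occ[ch][e + 1]  # C(ch) + O(ch, e)
        if s > e:
            return None  # interval became empty: no match
    return s, e


X = "abracadabra$"
BWT = "ard$rcaaaabb"
C, occ = build_tables(X, BWT)
print(backward_search("a", C, occ, len(X)))
print(backward_search("ra", C, occ, len(X)))
print(backward_search("abc", C, occ, len(X)))
```

Each pattern character costs two table lookups, which is where the O(m) bound comes from once C and O are available.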
W = ra
Step 1: W_i = a, previous I = [s,e] = [1,11]
C(a)=0, O(a,11)=5, so e = 0 + 5 = 5
C(a)=0, count before the start taken as 0, so s = 0 + 0 + 1 = 1
(caution: BWT[0] = a, so the inclusive count O(a,0) is 1, not 0; the 0 used here is O(a,−1), i.e. the empty-pattern interval is effectively [0,n−1])
I = [1,5]
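Step 2: W_i = r, previous I = [s,e] = [1,5]
C(r)=9, O(r,s−1) = O(r,0)=0, so s = 9 + 0 + 1 = 10
C(r)=9, O(r,e) = O(r,5)=2, so e = 9 + 2 = 11
I = [10,11], i.e. the suffixes ra$ and racadabra$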
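To turn the final interval into positions in X, the suffix array (the "SA" of the earlier slide) is looked up for each row in the interval. The following is a minimal sketch under the assumption that the full suffix array is kept in memory; the helper names `suffix_array` and `locate` are made up here.

```python
def suffix_array(text):
    """Naive suffix array: start positions of the suffixes in sorted order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def locate(interval, sa):
    """Map an inclusive suffix-array interval (s, e) to sorted text positions."""
    s, e = interval
    return sorted(sa[i] for i in range(s, e + 1))


X = "abracadabra$"
sa = suffix_array(X)
positions = locate((10, 11), sa)  # interval found for W = ra
print(positions)
```

For the interval [10,11] this gives the 0-based positions 2 and 9, the two occurrences of "ra" in abracadabra.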
By Johannes Köster