# Journal Club: Fast and accurate short read alignment with BWT

### SA and BWT

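`X = abracadabra$`

Sorted rotations of X. Each rotation starts with a suffix of X (up to and including `$`). The start position of that suffix is the suffix array entry SA[i], and the last character of the rotation is BWT[i].

```
 i  SA[i]  sorted rotation  BWT[i]
 0   11    $abracadabra     a
 1   10    a$abracadabr     r
 2    7    abra$abracad     d
 3    0    abracadabra$     $
 4    3    acadabra$abr     r
 5    5    adabra$abrac     c
 6    8    bra$abracada     a
 7    1    bracadabra$a     a
 8    4    cadabra$abra     a
 9    6    dabra$abraca     a
10    9    ra$abracadab     b
11    2    racadabra$ab     b
```

`BWT = ard$rcaaaabb`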
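The construction above can be reproduced with a few lines of Python. This is a minimal sketch, assuming a naive sort of all suffixes (fine for a slide-sized example, far too slow for a genome); the function names `suffix_array` and `bwt_from_sa` are illustrative and not taken from the paper.

```python
def suffix_array(text):
    """Return the start positions of all suffixes of text in sorted order."""
    # Naive construction: sort the positions by the suffix starting there.
    # Because '$' is unique and smaller than every letter, this order is
    # the same as the order of the sorted rotations.
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt_from_sa(text, sa):
    """BWT[i] is the character preceding suffix SA[i] in text."""
    # For SA[i] == 0 the index -1 wraps around to the final '$'.
    return "".join(text[i - 1] for i in sa)


X = "abracadabra$"
SA = suffix_array(X)
BWT = bwt_from_sa(X, SA)

# Print one row per sorted suffix: index, SA[i], suffix, BWT[i].
for i, pos in enumerate(SA):
    print(f"{i:2d}  {pos:2d}  {X[pos:]:<12}  {BWT[i]}")
```

The last column of the printed rows spells `ard$rcaaaabb`.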

### Backward search

Exact pattern matching in O(m), where m = |W|:

= find the interval \(I=[s,e]\) of all sorted suffixes that have the pattern W as a prefix

• init: \(I=[s,e] = [0,n - 1]\) (the empty pattern is a prefix of every suffix)
• read the pattern from right to left
• update for each character \(W_i\):
\(I = [C(W_i)+O(W_i, s-1) + 1,\; C(W_i) + O(W_i, e)]\)
• stop if \(s > e\): W does not occur in X

C(a):

number of characters in X[0,n-2] that are lexicographically smaller than a

O(a, i):

number of occurrences of a in BWT[0,i], with \(O(a,-1) = 0\)

`X = abracadabra$, |X|=n=12`

How to read the update:

• \(C(W_i) + 1\): first row of the block of suffixes that start with \(W_i\)
• \(O(W_i, s-1)\): suffixes before the previous start that are preceded by \(W_i\)
• \(O(W_i, e)\): suffixes up to and including the previous end that are preceded by \(W_i\)

`W = ra`

Step 1, \(W_i = a\), previous interval \([0,11]\):

\(C(a)=0\)

\(O(a,-1)=0\)

\(O(a,11) = 5\)

\(I = [0+0+1,\; 0+5] = [1,5]\)

Step 2, \(W_i = r\), previous interval \([1,5]\):

\(C(r)=9\)

\(O(r,0)=0\)

\(O(r,5) = 2\)

\(I = [9+0+1,\; 9+2] = [10,11]\)

Rows 10 and 11 are the suffixes `ra$` and `racadabra$`, the two occurrences of W in X.
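The backward search for `W = ra` can be checked with a short Python sketch. It is an illustration under stated assumptions, not the BWA implementation: the index is built naively, `build_index` and `backward_search` are hypothetical names, the search starts from the full interval `[0, n - 1]`, and `O(a, -1)` is taken to be 0 (stored as the first entry of each count list).

```python
def build_index(text):
    """Build SA, BWT, C and O for a text that ends with the sentinel '$'."""
    sa = sorted(range(len(text)), key=lambda i: text[i:])  # naive suffix array
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text) - {"$"})
    # C[a]: number of characters in text (without '$') smaller than a.
    C = {a: sum(ch < a for ch in text[:-1]) for a in alphabet}
    # O[a][i + 1]: occurrences of a in bwt[0..i]; O[a][0] stands for O(a, -1) = 0.
    O = {a: [0] for a in alphabet}
    for ch in bwt:
        for a in alphabet:
            O[a].append(O[a][-1] + (ch == a))
    return sa, bwt, C, O


def backward_search(pattern, n, C, O):
    """Return the interval (s, e) of sorted suffixes prefixed by pattern, or None."""
    s, e = 0, n - 1  # assumption: start with all suffixes
    for a in reversed(pattern):  # read the pattern from right to left
        if a not in C:
            return None
        s = C[a] + O[a][s] + 1   # O[a][s] is O(a, s - 1)
        e = C[a] + O[a][e + 1]   # O[a][e + 1] is O(a, e)
        if s > e:
            return None          # pattern does not occur
    return s, e


X = "abracadabra$"
SA, BWT, C, O = build_index(X)
interval = backward_search("ra", len(X), C, O)
print(interval)  # (10, 11)
# Text positions of the matches, read from the suffix array.
positions = sorted(SA[i] for i in range(interval[0], interval[1] + 1))
print(positions)  # [2, 9]
```

Only the loop in `backward_search` runs in O(m); the naive index construction here is much slower and serves just to keep the sketch self-contained.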

By Johannes Köster