Abkhazia tutorial

forced alignment of speech corpora ...

... made easy

Mathieu Bernard - CoML

https://github.com/bootphon/abkhazia

Forced alignment

Input

wav file

s0102a-sent17.wav

annotation file

s0102a-sent17 that's what i <SIL> <NOISE> recall

lexicon file

<SIL> SIL
<NOISE> NSN
that's dh ae t s
what w ah t
i ah
recall r iy k ao l
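
To make the role of the lexicon concrete, here is a minimal Python sketch that expands the annotation into a phone sequence by looking up each word in the lexicon. It assumes both files are whitespace-separated, as in the example of this section. The helper names `parse_lexicon` and `parse_annotation` are hypothetical and are not part of the abkhazia API.

```python
# Illustrative sketch only: these helpers are NOT abkhazia functions.
# Assumption: one lexicon entry per line, "word phone1 phone2 ...".


def parse_lexicon(text):
    """Map each word to its list of phones."""
    lexicon = {}
    for line in text.splitlines():
        fields = line.split()
        if fields:  # skip empty lines
            lexicon[fields[0]] = fields[1:]
    return lexicon


def parse_annotation(line):
    """Split an annotation line into (utterance id, list of words)."""
    utt_id, *words = line.split()
    return utt_id, words


LEXICON = """\
<SIL> SIL
<NOISE> NSN
that's dh ae t s
what w ah t
i ah
recall r iy k ao l
"""

ANNOTATION = "s0102a-sent17 that's what i <SIL> <NOISE> recall"

lexicon = parse_lexicon(LEXICON)
utt_id, words = parse_annotation(ANNOTATION)

# Concatenate the pronunciation of every word, in order.
phones = [phone for word in words for phone in lexicon[word]]
print(utt_id, " ".join(phones))
```

A word missing from the lexicon raises a `KeyError` here, which is a reminder that every word of the annotation needs a lexicon entry.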

Output

alignment file

alignment at phone and/or word level

s0102a-sent17 0.0000 0.3675 SIL
s0102a-sent17 0.3675 0.5675 dh that's
s0102a-sent17 0.5675 0.7675 ae
s0102a-sent17 0.7675 0.7975 t
s0102a-sent17 0.7975 1.9275 s
s0102a-sent17 1.9275 3.0275 SIL
s0102a-sent17 3.0275 3.0575 w what
s0102a-sent17 3.0575 3.0875 ah
s0102a-sent17 3.0875 3.1975 t
s0102a-sent17 3.1975 3.2275 ah i
s0102a-sent17 3.2275 3.3875 SIL <SIL>
s0102a-sent17 3.3875 3.5075 NSN <NOISE>
s0102a-sent17 3.5075 3.7175 r recall
s0102a-sent17 3.7175 3.8475 iy
s0102a-sent17 3.8475 4.2075 k
s0102a-sent17 4.2075 4.3575 ao
s0102a-sent17 4.3575 4.4075 l
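
The alignment example in this section appears to hold one phone per line: utterance id, start time, stop time, phone, and the word label on the first phone of each word. The Python sketch below reads that layout and lists the word onsets. The field layout and the unit (seconds) are assumptions drawn from the example, and `Segment` and `parse_alignment` are hypothetical names, not abkhazia functions.

```python
# Illustrative sketch only: not part of the abkhazia API.
# Assumed line layout: "utt_id start stop phone [word]", times in seconds.
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    utt_id: str
    start: float
    stop: float
    phone: str
    word: Optional[str]  # present only on the first phone of a word


def parse_alignment(text):
    """Parse alignment lines into a list of Segment records."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:  # skip empty lines
            continue
        if len(fields) not in (4, 5):
            raise ValueError(f"unexpected alignment line: {line!r}")
        utt_id, start, stop, phone = fields[:4]
        word = fields[4] if len(fields) == 5 else None
        segments.append(Segment(utt_id, float(start), float(stop), phone, word))
    return segments


# First lines of the example alignment file.
ALIGNMENT = """\
s0102a-sent17 0.0000 0.3675 SIL
s0102a-sent17 0.3675 0.5675 dh that's
s0102a-sent17 0.5675 0.7675 ae
s0102a-sent17 0.7675 0.7975 t
s0102a-sent17 0.7975 1.9275 s
"""

segments = parse_alignment(ALIGNMENT)

# A word starts where a line carries a word label.
word_onsets = [(s.word, s.start) for s in segments if s.word is not None]

for s in segments:
    print(f"{s.phone:>4} {s.stop - s.start:.4f} s")
print(word_onsets)
```

Only word onsets are extracted here. Where a word ends is left open, because phone lines without a word label may be either the continuation of a word or a silence between words.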

For this tutorial

You have
- raw speech files (e.g. wav)
- annotation on them (e.g. TextGrid)

You want
- alignment of the text with the speech
- at word or phone level

Install abkhazia

abkhazia is a Python 3 library
installation guidelines: https://coml.lscp.ens.fr/abkhazia/install.html
or use the Docker image:

docker pull cognitiveml/abkhazia:version-1.0
