Improving STT recognition for the Bundestag Germany

Andreas Unterhuber and Daniel Morandini

KIM Keep In Mind GmbH

25/10/2023 - 14th Workshop Computer Science Research Meets Business

www.keepinmind.info

Solution

Results

Goals

Automatic subtitle generation, player screenshot

Humans in the loop, subtitle editor screenshot

Still too much human intervention!

How to improve speech recognition?

Solution

Bundestag's dataset in numbers (late 2022)

314k total sentence pairs

124k errorful sentences (39%)

New STT engine or NLP?

Is it a translation problem?

Preparing the training data (diagram)

Step 1: align the tokens of each original sentence with its corrected sentence to build the corrections vocabulary

Step 2: using the corrections vocabulary, find a correction tag for each original token

Output: every sentence paired with its correction labels
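The alignment and tagging steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used for Bundebert: the whitespace tokenizer, the `$START` pseudo-token, the tag names (`KEEP`, `DELETE`, `REPLACE_…`, `APPEND_…`) and the helper names `correction_tags` and `build_vocabulary` are all hypothetical, and the alignment relies on `difflib.SequenceMatcher` from the standard library.

```python
from collections import Counter
from difflib import SequenceMatcher

START = "$START"  # hypothetical pseudo-token so insertions always have a left neighbour


def correction_tags(original, corrected):
    """Return (token, tag) pairs: one correction tag per original token."""
    src = [START] + original.split()
    tgt = [START] + corrected.split()
    tags = ["KEEP"] * len(src)
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "delete":
            for i in range(i1, i2):
                tags[i] = "DELETE"
        elif op == "insert":
            # Attach inserted target tokens to the preceding original token.
            tags[i1 - 1] = "APPEND_" + " ".join(tgt[j1:j2])
        elif op == "replace":
            # Pair tokens one to one; surplus original tokens are deleted.
            for i in range(i1, i2):
                j = j1 + (i - i1)
                tags[i] = "REPLACE_" + tgt[j] if j < j2 else "DELETE"
            # Surplus target tokens are folded into the last replacement.
            extra = tgt[j1 + (i2 - i1):j2]
            if extra:
                tags[i2 - 1] += " " + " ".join(extra)
    return list(zip(src, tags))


def build_vocabulary(sentence_pairs):
    """Count how often each correction tag occurs across all sentence pairs."""
    vocabulary = Counter()
    for original, corrected in sentence_pairs:
        vocabulary.update(tag for _, tag in correction_tags(original, corrected))
    return vocabulary
```

Framing the task this way turns correction into per-token classification over a fixed tag vocabulary instead of free-form sequence generation, which is one possible answer to the "is it a translation problem?" question.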

Bundebert: an NLP model for turning German into "Bundestag German"

Release V1

Python
300k samples
trained on on-premise GPU servers
served with TorchServe

How to describe its potential? What does it get right/wrong?

What did we teach?

CATEGORY	IMPACT
R:ORTH	13%
M:PUNCT	10%
R:SPELL	10%
R:NOUN	7%
U:PUNCT	7%
R:PUNCT	4%
TOTAL	51%

Supported ERRANT categories with impact
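For readers unfamiliar with ERRANT codes, the short sketch below decodes the categories in the table and re-adds the impact figures. The operation prefixes (M = missing, U = unnecessary, R = replacement) follow ERRANT's published convention; the dictionary and function names are hypothetical and the percentages are copied from the table above.

```python
# Impact per supported ERRANT category, in percent (values from the table).
IMPACT = {
    "R:ORTH": 13,
    "M:PUNCT": 10,
    "R:SPELL": 10,
    "R:NOUN": 7,
    "U:PUNCT": 7,
    "R:PUNCT": 4,
}

# ERRANT operation prefixes and the error types that appear in the table.
OPERATIONS = {"M": "missing", "U": "unnecessary", "R": "replacement"}
TYPES = {
    "ORTH": "orthography (case or whitespace)",
    "PUNCT": "punctuation",
    "SPELL": "spelling",
    "NOUN": "noun",
}


def describe(category):
    """Turn a code such as 'M:PUNCT' into a readable description."""
    operation, error_type = category.split(":")
    return f"{OPERATIONS[operation]} {TYPES[error_type]}"


def impact_by_operation(impact):
    """Sum the impact percentages per operation prefix (M, U, R)."""
    totals = {}
    for category, percent in impact.items():
        operation = category.split(":")[0]
        totals[operation] = totals.get(operation, 0) + percent
    return totals
```

Grouped this way, replacements account for 34 of the 51 percentage points, with missing and unnecessary punctuation making up the rest.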

Release V2

Python
300k real + 600k synthetic sentences
trained on AWS SageMaker
evaluated with ERRANT
served with TorchServe

Results

=========== Token-Based Detection ============
TP	FP	FN	Prec	Rec	F0.5
46625	24798	60177	0.6528	0.4366	0.594
==============================================
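The precision, recall and F0.5 values in the report follow directly from the TP, FP and FN counts. The sketch below recomputes them using the standard definitions; the function name is hypothetical and the zero-division fallbacks are an assumption, not necessarily what ERRANT does internally.

```python
def precision_recall_fbeta(tp, fp, fn, beta=0.5):
    """Compute precision, recall and F-beta from raw detection counts."""
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    denominator = beta**2 * precision + recall
    # beta < 1 weights precision more heavily than recall.
    fbeta = (1 + beta**2) * precision * recall / denominator if denominator else 0.0
    return precision, recall, fbeta


# Counts taken from the token-based detection report above.
precision, recall, f05 = precision_recall_fbeta(tp=46625, fp=24798, fn=60177)
print(f"Prec={precision:.4f} Rec={recall:.4f} F0.5={f05:.4f}")
```

F0.5 is the usual choice for correction tasks because a wrong edit proposed to the human editor tends to cost more than a missed one.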

Human editors reduced from 3 (pros) to 1 (student)!

UI/UX

Daniel Morandini

Software Developer @KIM

github.com/jecoz
