Improving STT recognition for the Bundestag Germany

Andreas Unterhuber and Daniel Morandini
KIM Keep In Mind GmbH
25/10/2023 - 14th Workshop Computer Science Research Meets Business
www.keepinmind.info

Goals

Automatic subtitle generation, player screenshot

Humans in the loop, subtitle editor screenshot

Still too much human intervention!

How to improve speech recognition?

Solution

Bundestag's Dataset in numbers (late 2022)

  • 314k total sentence pairs
  • 124k errorful sentences (30%)

New STT engine or NLP?

Is it a translation problem?

Labeling pipeline (diagram):

  • Original sentence + corrected sentence: align tokens and corrections, building the corrections vocabulary
  • Original sentence + corrected sentence: find correction tags for each original token, drawn from the corrections vocabulary
  • Output: each sentence paired with its correction labels
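
The alignment and tagging steps above can be sketched in a few lines of Python. This is a minimal illustration, not the Bundebert pipeline: it assumes whitespace tokenization, uses the standard library's `difflib.SequenceMatcher` for the alignment, and the tag names (`$REPLACE_`, `$APPEND_`, `$DELETE`), the `$START` token, and both function names are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical sentence-start token, so that insertions at the very
# beginning of a sentence still have a token to attach to.
START = "$START"


def correction_tags(original, corrected):
    """Align two sentences and return (token, [operations]) per original token.

    An empty operation list means "keep the token unchanged".
    """
    src = [START] + original.split()
    tgt = [START] + corrected.split()
    ops = [[] for _ in src]
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            continue
        new = tgt[j1:j2]
        old_len = i2 - i1
        # Pair old and new tokens one-to-one; surplus old tokens are deleted.
        for k in range(old_len):
            if k < len(new):
                ops[i1 + k].append("$REPLACE_" + new[k])
            else:
                ops[i1 + k].append("$DELETE")
        # Surplus new tokens are appended after the last token before them.
        anchor = i1 + old_len - 1 if old_len else i1 - 1
        for token in new[old_len:]:
            ops[anchor].append("$APPEND_" + token)
    return list(zip(src, ops))


def apply_tags(tagged):
    """Rebuild the corrected sentence from (token, [operations]) pairs."""
    out = []
    for token, token_ops in tagged:
        kept = None if token == START else token
        appended = []
        for op in token_ops:
            if op == "$DELETE":
                kept = None
            elif op.startswith("$REPLACE_"):
                kept = op[len("$REPLACE_"):]
            elif op.startswith("$APPEND_"):
                appended.append(op[len("$APPEND_"):])
        if kept is not None:
            out.append(kept)
        out.extend(appended)
    return " ".join(out)


original = "der bundestag hat heute beschlossen dass"
corrected = "Der Bundestag hat heute beschlossen , dass"
tagged = correction_tags(original, corrected)
for token, token_ops in tagged:
    print(token, token_ops or ["$KEEP"])
```

Collecting every distinct operation seen over the whole dataset would give a corrections vocabulary in the sense of the diagram, and the per-token operation lists are the labels a token classifier could be trained to predict. Applying the tags back with `apply_tags` is a quick sanity check that the labels reproduce the corrected sentence.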

Bundebert: an NLP model for turning German into "Bundestag German"

Release V1

  • Python
  • 300k samples
  • trained on on-premise GPU servers
  • served with TorchServe

How to describe its potential? What does it get right/wrong?

What did we teach?

CATEGORY   IMPACT
R:ORTH     13%
M:PUNCT    10%
R:SPELL    10%
R:NOUN      7%
U:PUNCT     7%
R:PUNCT     4%
TOTAL      51%

Supported ERRANT categories with impact
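
The category codes follow ERRANT's naming: a prefix for the operation (M = missing, U = unnecessary, R = replacement), then the error type. As a small sketch, the table can be regrouped by operation or by error type; the percentages are copied from the slide, while the dictionary and function names are illustrative only.

```python
# ERRANT operation prefixes.
OPERATIONS = {"M": "missing", "U": "unnecessary", "R": "replacement"}

# Impact per category in percent, copied from the table above.
IMPACT = {
    "R:ORTH": 13,
    "M:PUNCT": 10,
    "R:SPELL": 10,
    "R:NOUN": 7,
    "U:PUNCT": 7,
    "R:PUNCT": 4,
}


def by_operation(impact):
    """Sum the impact per operation (missing / unnecessary / replacement)."""
    totals = {}
    for code, pct in impact.items():
        prefix, _, _ = code.partition(":")
        name = OPERATIONS[prefix]
        totals[name] = totals.get(name, 0) + pct
    return totals


def by_error_type(impact):
    """Sum the impact per error type (ORTH, PUNCT, SPELL, NOUN)."""
    totals = {}
    for code, pct in impact.items():
        _, _, error_type = code.partition(":")
        totals[error_type] = totals.get(error_type, 0) + pct
    return totals


print(by_operation(IMPACT))
print(by_error_type(IMPACT))
print(sum(IMPACT.values()))
```

Grouped this way, punctuation (missing, unnecessary and replaced) accounts for the largest share of the supported categories.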

Release V2

  • Python
  • 300k real + 600k synthetic sentences
  • trained on AWS SageMaker
  • evaluated with ERRANT
  • served with TorchServe

Results

=========== Token-Based Detection ============
TP	FP	FN	Prec	Rec	F0.5
46625	24798	60177	0.6528	0.4366	0.594
==============================================
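
The three scores in the table follow from the raw counts. A short sketch using the standard definitions of precision, recall and F-beta (with beta = 0.5, which weights precision more heavily than recall); the function name is illustrative and this is not ERRANT's own code.

```python
def detection_scores(tp, fp, fn, beta=0.5):
    """Return (precision, recall, F-beta) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    b2 = beta ** 2
    denom = b2 * precision + recall
    f_beta = (1 + b2) * precision * recall / denom if denom else 0.0
    return precision, recall, f_beta


# Counts taken from the token-based detection table above.
precision, recall, f05 = detection_scores(tp=46625, fp=24798, fn=60177)
print(f"Prec={precision:.4f} Rec={recall:.4f} F0.5={f05:.4f}")
```

Read this way, roughly two out of three flagged tokens were real errors, and a bit under half of all erroneous tokens were detected.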

Human editors reduced from 3 (pros) to 1 (student)!

UI/UX
