Speech Project

Week-4 Progress Report

B02901054     方為

B02901085     徐瑞陽

Training Flow

Feature Extraction

Monophone 

Equal align

Example: 即日起 簡章 備索 ("starting today", "brochure", "available on request"), words in lexicon.txt
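A minimal sketch of what equal alignment means, as we understand it: before any model exists, the frames of an utterance are simply divided as evenly as possible among the states in order. The function name `equal_align` and the state labels are hypothetical; Kaldi's own tools operate on transition-ids, so this is an illustration of the idea rather than Kaldi's implementation.

```python
def equal_align(num_frames, states):
    """Assign each frame to a state by splitting the frames as evenly as possible."""
    if not states:
        raise ValueError("need at least one state")
    if num_frames < len(states):
        raise ValueError("fewer frames than states")
    base, extra = divmod(num_frames, len(states))
    alignment = []
    for i, state in enumerate(states):
        # The first `extra` states each receive one additional frame.
        count = base + (1 if i < extra else 0)
        alignment.extend([state] * count)
    return alignment


if __name__ == "__main__":
    # Hypothetical state labels, for illustration only.
    print(equal_align(10, ["s1", "s2", "s3"]))
```

Because every state gets roughly the same duration, a long vowel and a short sound are treated identically at this stage; the later GMM alignment is what lets the durations differ.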

Gmm align

In 即 and 起, the '一' (the "i" vowel) is drawn out long, while 日 is a short sound

tree

produced by gmm-init-mono?

Triphone

convert-ali

decision tree <-> model

output is tree and model 

same as gmm-init-mono

 

gmm-mixup

VS

gmm-est mixup

Meaning?
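One possible reading of the question above, offered as a sketch and not a definitive answer: "mixing up" means increasing the number of Gaussian components in the model. In Kaldi, gmm-mixup does this as a standalone step, while gmm-est can do it right after re-estimation through its --mix-up option. The toy code below shows the core idea on a 1-D mixture: repeatedly split the heaviest component and nudge the two copies apart. The function name `mix_up` and the deterministic perturbation are our simplifications; Kaldi's actual allocation of Gaussians across states is not reproduced here.

```python
import math


def mix_up(weights, means, variances, target, perturb_factor=0.01):
    """Grow a 1-D GMM to `target` components by splitting the heaviest one."""
    weights, means, variances = list(weights), list(means), list(variances)
    while len(weights) < target:
        # Pick the component with the largest weight.
        k = max(range(len(weights)), key=lambda i: weights[i])
        shift = perturb_factor * math.sqrt(variances[k])
        # Halve the weight and move the two copies apart so that later
        # re-estimation can separate them.
        weights[k] /= 2.0
        weights.append(weights[k])
        means.append(means[k] - shift)
        means[k] += shift
        variances.append(variances[k])
    return weights, means, variances


if __name__ == "__main__":
    print(mix_up([0.7, 0.3], [0.0, 5.0], [4.0, 1.0], target=3, perturb_factor=0.5))
```

The total weight stays at 1 after each split, so the result is still a valid mixture.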

WFST

Viterbi
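For reference, a minimal sketch of Viterbi decoding on a small HMM, written in the log domain. It is a textbook version under our own assumptions (dense transition matrix, list-based inputs, hypothetical function name `viterbi`), not the decoder used in the experiments.

```python
import math

NEG_INF = float("-inf")


def viterbi(log_init, log_trans, log_emit):
    """Return the most likely state sequence and its log score.

    log_init[s]     : log P(start in state s)
    log_trans[r][s] : log P(next state s | current state r)
    log_emit[t][s]  : log P(frame t | state s)
    """
    num_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(num_states)]
    backptr = []
    for t in range(1, len(log_emit)):
        new_score, ptr = [], []
        for s in range(num_states):
            # Best predecessor state for arriving in s at time t.
            best = max(range(num_states), key=lambda r: score[r] + log_trans[r][s])
            ptr.append(best)
            new_score.append(score[best] + log_trans[best][s] + log_emit[t][s])
        score = new_score
        backptr.append(ptr)
    # Trace back from the best final state.
    last = max(range(num_states), key=lambda s: score[s])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]


if __name__ == "__main__":
    # Toy two-state left-to-right model with made-up probabilities.
    init = [0.0, NEG_INF]
    trans = [[math.log(0.5), math.log(0.5)], [NEG_INF, 0.0]]
    emit = [[math.log(0.9), math.log(0.1)],
            [math.log(0.2), math.log(0.8)],
            [math.log(0.1), math.log(0.9)]]
    print(viterbi(init, trans, emit))
```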

Experiment Report

Viterbi Triphone

change # of Gaussians

# of iters   # of Gaussians   accuracy (%)
40           500              54.96
40           1000             54.64
40           1500             55.79
40           2000             55.10
40           2500             53.98
40           3000             51.22
40           4000             55.90

Viterbi Triphone

change # of iters

# of iters   # of Gaussians   accuracy (%)
40           1000             54.64
70           1000             55.90
100          1000             55.36

Viterbi Monophone

# of iters   # of Gaussians   accuracy (%)
40           1000             47.52
70           1000             45.59

FST Monophone

# of iters   # of Gaussians   accuracy (%)
40           1000             34.32

FST Triphone

# of iters   # of Gaussians   accuracy (%)
40           1000             31.86

FST triphone took 8 hours (monophone took 2 hours), but its accuracy is still lower than monophone!?

Problems we encountered

Some Gaussians have too little data...

The only difference: the scp file read in during feature extraction

Is that OK?
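To check whether two scp files really differ only where expected, a small comparison helper may be useful. This is a sketch that assumes the usual Kaldi scp layout of one "<utterance-id> <path-or-command>" entry per line; the function names `read_scp` and `diff_scp` and the paths in the demo are hypothetical.

```python
def read_scp(lines):
    """Parse scp lines of the form "<utterance-id> <path-or-command>" into a dict."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        parts = line.split(None, 1)
        table[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return table


def diff_scp(old_lines, new_lines):
    """Report utterance ids that were removed, added, or point somewhere else."""
    old, new = read_scp(old_lines), read_scp(new_lines)
    return {
        "only_in_old": sorted(old.keys() - new.keys()),
        "only_in_new": sorted(new.keys() - old.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


if __name__ == "__main__":
    # Made-up entries, for illustration only.
    old = ["utt1 /data/a/1.wav", "utt2 /data/a/2.wav"]
    new = ["utt1 /data/b/1.wav", "utt3 /data/a/3.wav"]
    print(diff_scp(old, new))
```

For real files, the lines can be read with `open(path).readlines()` and passed in directly.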

Something weird happened on the new workstation...
