Problem:
- Laughter is non-lexical but communicative

- It provides cues regarding the emotional state of speakers
- It is a heterogeneous phenomenon and highly variable, even within a single speaker

Related Work:

- Identification and acoustic characterization of laughter
- Using SVMs, GMMs and HMMs
- Acoustic features analyzed: F0, MFCC, PLP

Automatic Laughter Recognition using DNN

Bingyan Hu

Method:
- Pre-processing + silence slicing

- Feature Extraction: MFCC (with delta and delta-delta), pitch and energy, prosodic features
- Training Set: unsupervised pre-training + supervised NN training (kaldi::nnet1)
- Validation Set: determine the number of hidden layers and the number of units per layer
- Test Set: NN Classification
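
The pre-processing and feature steps above can be sketched in a few lines. This is a minimal illustration, not the pipeline used in this work: the function names, the frame length, the dB threshold, and the delta window of 2 are all assumptions, and a real system would compute MFCCs with a toolkit such as Kaldi rather than by hand. It shows energy-based silence slicing and a common regression formula for delta features; applying `deltas` twice gives delta-delta features.

```python
import math

def frame_energies(samples, frame_len, hop):
    """Log energy (dB) of each frame; frame_len and hop are in samples."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-10))  # floor avoids log(0)
    return energies

def non_silent_segments(energies, threshold_db):
    """Return (first_frame, end_frame_exclusive) runs whose energy exceeds the threshold."""
    segments, start = [], None
    for i, e in enumerate(energies):
        if e > threshold_db and start is None:
            start = i
        elif e <= threshold_db and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(energies)))
    return segments

def deltas(features, window=2):
    """Regression-based deltas over per-frame feature vectors (edge frames repeated)."""
    denom = 2 * sum(n * n for n in range(1, window + 1))
    last = len(features) - 1
    out = []
    for t in range(len(features)):
        row = []
        for d in range(len(features[t])):
            acc = 0.0
            for n in range(1, window + 1):
                nxt = features[min(t + n, last)][d]
                prv = features[max(t - n, 0)][d]
                acc += n * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out

# Hypothetical toy signal: silence, a loud burst, silence.
signal = [0.0] * 40 + [1.0] * 40 + [0.0] * 40
segments = non_silent_segments(frame_energies(signal, 10, 10), threshold_db=-40.0)

# Hypothetical one-dimensional feature track (a linear ramp).
ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]
ramp_delta = deltas(ramp)
ramp_delta_delta = deltas(ramp_delta)
```

A fixed energy threshold is the simplest choice for slicing; recordings with varying microphone gain would likely need a threshold relative to each channel's noise floor.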
 

Proposal

Data:
- ICSI Meeting Corpus (LDC2004S02)

- A transcribed corpus of multi-party meeting recordings. It contains human-made annotations of non-lexical vocalized sounds, including laughter, heavy breathing, and coughs.

- 29 meetings, 25 h, 16 subjects, 6.2% laughter
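
As a rough check on what these figures imply, the arithmetic below assumes the 6.2% refers to the share of the 25 hours of recorded time (the slide does not say whether it is time, frames, or segments). Under that assumption it estimates the amount of laughter and the accuracy of a trivial classifier that always predicts "no laughter". The variable names are illustrative only.

```python
# Figures taken from the slide; the interpretation as a share of time is an assumption.
total_hours = 25.0
laughter_fraction = 0.062

laughter_hours = total_hours * laughter_fraction   # about 1.55 hours
laughter_minutes = laughter_hours * 60             # about 93 minutes

# A classifier that never predicts laughter is right on all non-laughter time.
majority_baseline_accuracy = 1.0 - laughter_fraction  # about 0.938
```

Because the classes are this imbalanced, plain accuracy would look high even for a useless model, so measures such as precision and recall on the laughter class are likely to be more informative.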
 

 
