Exploiting multimodal data fusion in robust speech recognition

Document record

Date

19 July 2010

Collection

Archives ouvertes


Related subjects (English)

Talking

Cite this document

Panikos Heracleous et al., « Exploiting multimodal data fusion in robust speech recognition », HAL-SHS : sciences de l'information, de la communication et des bibliothèques, ID : 10670/1.m26abc

Abstract (English)

This article introduces automatic speech recognition based on Electro-Magnetic Articulography (EMA). Movements of the tongue, lips, and jaw are tracked by an EMA device and used as features to train Hidden Markov Models (HMMs) and recognize speech from articulation alone, that is, without any audio information. Automatic phoneme recognition experiments are also conducted to examine the contribution of the EMA parameters to robust speech recognition. Using feature fusion, multi-stream HMM fusion, and late fusion methods, noisy audio speech has been integrated with EMA data, and recognition experiments have been conducted. The results show that integrating the EMA parameters significantly increases an audio speech recognizer's accuracy in noisy environments.
