Speech Alignment and Recognition Experiments for Luxembourgish

Document record

Date

14 May 2014

Collection

Open archives




Cite this document

Martine Adda-Decker et al., « Speech Alignment and Recognition Experiments for Luxembourgish », HAL-SHS : linguistique, ID : 10670/1.5lbv78


Abstract (English)

Luxembourgish, embedded in a multilingual context on the divide between Romance and Germanic cultures, remains one of Europe’s under-described languages. In this paper, we propose to study acoustic similarities between Luxembourgish and major contact languages (German, French, English) with the help of automatic speech alignment and recognition systems. Experiments were run using monolingual acoustic models trained on German, French and English, together with (i) “multilingual” models trained on pooled speech data from these three languages, or with (ii) native Luxembourgish acoustic models built from 1200 hours of untranscribed Luxembourgish audio data using unsupervised methods. We investigated whether Luxembourgish was globally better represented by one of the individual languages, by the multilingual model or by the native (unsupervised) model. While German globally provides the best acoustic match for native Luxembourgish, detailed analyses reveal language-specific preferences; in particular, English and Luxembourgish models are preferred on diphthongs. The first ASR results illustrate the accuracy of the various sets of supervised monolingual and multilingual models versus unsupervised Luxembourgish acoustic models. The ASR word error rate is progressively reduced from 60% to 25% on the development data set by unsupervised training of larger context-dependent models on increasing amounts of audio data.
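
The abstract reports word error rates dropping from about 60% to 25% as more untranscribed audio is used for unsupervised training. As a point of reference for how that metric is defined, the sketch below (a generic illustration, not the paper's evaluation code) computes WER as word-level Levenshtein edit distance divided by the number of reference words; the function name and example strings are hypothetical.

# Minimal sketch: word error rate (WER) as Levenshtein edit distance over word
# sequences, normalised by the reference length. Purely illustrative of the
# metric quoted in the abstract; the example strings below are made up.

def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming table of edit distances between word-sequence prefixes.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i                            # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j                            # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sat on mat"
    # One deletion against six reference words: WER of roughly 16.67%.
    print(f"WER = {word_error_rate(ref, hyp):.2%}")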

