Predicting unseen articulations from multi-speaker articulatory models

Document record

Date

26 September 2010

Collection

Open archives

Cite this document

Gopal Ananthakrishnan et al., « Predicting unseen articulations from multi-speaker articulatory models », HAL-SHS : sciences de l'information, de la communication et des bibliothèques, ID : 10670/1.e3a7sm





Abstract (English)

In order to study inter-speaker variability, this work aims to assess the generalization capabilities of data-based multi-speaker articulatory models. We use various three-mode factor analysis techniques to model the variations of midsagittal vocal tract contours obtained from MRI images of three French speakers articulating 73 vowels and consonants. Articulations of a given speaker for phonemes not present in the training set are then predicted by inverting the models from measurements of these phonemes articulated by the other subjects. On average, the prediction RMSE was 5.25 mm for tongue contours and 3.3 mm for 2D midsagittal vocal tract distances. In addition, this study established a methodology for determining the optimal number of factors for such models.
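The prediction scheme in the abstract (shared phoneme factors, speaker-specific loadings, and inversion from the other speakers' measurements) can be sketched as follows. This is a simplified linear stand-in, not the paper's actual three-mode factor models: the contour length (20 points), the number of factors (4), the synthetic data, and all variable names are assumptions for illustration; only the speaker and phoneme counts come from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (only the speaker and phoneme counts come from
# the abstract): 3 speakers, 73 phonemes, contours sampled at 20 points.
n_speakers, n_phonemes, n_points = 3, 73, 20
k = 4  # number of factors; choosing this number is part of the study

# Synthetic stand-in data: each contour is a speaker-specific linear
# combination of shared phoneme factors, plus measurement noise.
true_factors = rng.normal(size=(n_phonemes, k))
loadings = rng.normal(size=(n_speakers, k, n_points))
contours = np.einsum('pk,skn->spn', true_factors, loadings)
contours += 0.05 * rng.normal(size=contours.shape)

# Hold out one phoneme for the target speaker.
target, unseen = 0, 72
train = np.delete(np.arange(n_phonemes), unseen)

# Estimate shared phoneme factor scores from the pooled training contours
# (SVD of the phoneme-mode unfolding, a much-simplified three-mode analysis).
unfolded = contours[:, train, :].transpose(1, 0, 2).reshape(len(train), -1)
U, S, _ = np.linalg.svd(unfolded, full_matrices=False)
factors = U[:, :k] * S[:k]

# Per-speaker linear map from factor scores to contours.
maps = [np.linalg.lstsq(factors, contours[s, train], rcond=None)[0]
        for s in range(n_speakers)]

# Inversion: recover the unseen phoneme's factor scores from the OTHER
# speakers' measured contours, then map them through the target's model.
A = np.vstack([maps[s].T for s in range(n_speakers) if s != target])
b = np.concatenate([contours[s, unseen] for s in range(n_speakers)
                    if s != target])
f_hat = np.linalg.lstsq(A, b, rcond=None)[0]

predicted = f_hat @ maps[target]
rmse = np.sqrt(np.mean((predicted - contours[target, unseen]) ** 2))
print(f"prediction RMSE: {rmse:.3f}")
```

The paper's 5.25 mm and 3.3 mm figures refer to real MRI data; in this synthetic sketch the RMSE is only meaningful relative to the injected noise level.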
