Developing Resources for Automated Speech Processing of Quebec French

Document record

Date

2020

Collection

Open archives

Licence

info:eu-repo/semantics/OpenAccess



Related subjects

Talking

Cite this document

Mélanie Lancien et al., « Developing Resources for Automated Speech Processing of Quebec French », HAL-SHS : sciences de l'information, de la communication et des bibliothèques, ID : 10670/1.2ldiub


Abstract

The analysis of the structure of speech nearly always rests on the alignment of the speech recording with a phonetic transcription. Several tools can now perform this speech segmentation automatically. However, none of them segments Quebec French (QF hereafter) adequately. Contrary to what could be assumed, the acoustics and phonotactics of QF differ widely from those of France French (FF hereafter). To segment QF adequately, features such as the diphthongization of long vowels and the affrication of coronal stops have to be taken into account. Acoustic models for automatic segmentation must therefore be trained on speech samples exhibiting those phenomena. Dictionaries and lexicons must also be adapted to integrate differences in lexical units (such as very frequent words in QF that are not used in FF) and in the phonology of QF (such as the existence of tense and lax high vowels in QF but not in FF). This paper presents the development of linguistic resources to be included in the SPPAS software tool in order to perform text normalization, phonetization, alignment and syllabification. We adapted the existing French lexicon and developed a QF-specific pronunciation dictionary. We then created an acoustic model from the existing ones and adapted it with 5 minutes of manually time-aligned data. These new resources are all freely distributed with SPPAS version 2.7; they perform the full process of speech segmentation in Quebec French.
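As a rough illustration of why a QF-specific pronunciation dictionary is needed, the affrication of coronal stops mentioned in the abstract can be sketched as a rewrite rule over a phone sequence. This is a minimal sketch, not the authors' method and not the SPPAS dictionary format or API: the function name `affricate`, the SAMPA-like symbols, and the choice of triggering contexts (high front vowels and glides, without the lax variants) are all assumptions made for the example.

```python
# Illustrative sketch only: the symbols and names below are assumptions,
# not the SPPAS dictionary format or API.

# High front vowels and glides assumed to trigger affrication
# (SAMPA-like symbols; "H" stands for the glide in "huit").
TRIGGERS = {"i", "y", "j", "H"}

# Coronal stops and their affricated variants.
AFFRICATES = {"t": "ts", "d": "dz"}


def affricate(phones):
    """Return a copy of `phones` with t/d affricated before a trigger."""
    result = []
    for index, phone in enumerate(phones):
        # Look one phone ahead; the last phone has no following context.
        following = phones[index + 1] if index + 1 < len(phones) else None
        if phone in AFFRICATES and following in TRIGGERS:
            result.append(AFFRICATES[phone])
        else:
            result.append(phone)
    return result


if __name__ == "__main__":
    # "tu dis" as a France French phone sequence: t y d i
    print(affricate(["t", "y", "d", "i"]))  # ['ts', 'y', 'dz', 'i']
```

A rule of this kind only covers one of the phenomena listed; diphthongization and the tense/lax contrast would need their own entries or rules, and the acoustic model still has to be adapted on QF speech.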
