A CRF-Based Approach to Automatic Disfluency Detection in a French Call-Centre Corpus

Document record

Date

14 September 2014

Collection

Open archives




Cite this document

Camille Dutrey et al., "A CRF-Based Approach to Automatic Disfluency Detection in a French Call-Centre Corpus", HAL-SHS: linguistique, ID: 10670/1.6a7gnc



Abstract

In this paper, we present an approach based on Conditional Random Fields (CRFs) for the automatic detection of edit disfluencies in a French conversational telephone corpus. We define disfluency patterns using both linguistic and acoustic features to perform disfluency detection. Two related tasks are considered. The first aims at detecting the disfluent speech portion proper, or reparandum, i.e. the portion to be removed in order to improve the readability of transcribed data. The second also aims at identifying the corrected portion, or repair, which can be useful in follow-up discourse and dialogue analyses or in opinion mining. For both tasks, we present comparative results as a function of the type of features involved (acoustic and/or linguistic). Overall, the best results are obtained by CRF models combining both acoustic and linguistic features.
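As an illustration only, the two tasks and the feature ablation described in the abstract can be sketched as a sequence-labelling setup. Everything below is an assumption made for the example and is not taken from the paper: the feature names (`repeated_soon`, `pause_after`, `duration`), the label set (`RM` for reparandum, `RP` for repair, `O` for fluent tokens), and the invented utterance with its timing values. The sketch only builds per-token feature dictionaries and label sequences, the kind of input a linear-chain CRF toolkit typically consumes; it does not train a model.

```python
from typing import Dict, List

# Hypothetical token record: word form, POS tag, pause after the token (s), duration (s).
Token = Dict[str, object]


def token_features(tokens: List[Token], i: int,
                   use_linguistic: bool = True,
                   use_acoustic: bool = True) -> Dict[str, float]:
    """Build a feature dict for token i (feature names are illustrative)."""
    tok = tokens[i]
    feats: Dict[str, float] = {"bias": 1.0}
    if use_linguistic:
        word = str(tok["word"]).lower()
        feats["word=" + word] = 1.0
        feats["pos=" + str(tok["pos"])] = 1.0
        # A word repeated within the next two tokens is a simple repetition cue.
        if any(str(t["word"]).lower() == word for t in tokens[i + 1:i + 3]):
            feats["repeated_soon"] = 1.0
    if use_acoustic:
        feats["pause_after"] = float(tok["pause_after"])
        feats["duration"] = float(tok["duration"])
    return feats


def task_labels(regions: List[str], task: int) -> List[str]:
    """Map annotated regions to labels.

    Task 1 marks only the reparandum; task 2 also marks the repair.
    """
    labels = []
    for region in regions:
        if region == "reparandum":
            labels.append("RM")
        elif region == "repair" and task == 2:
            labels.append("RP")
        else:
            labels.append("O")
    return labels


# Invented utterance: "je veux / je voudrais un billet" (values are made up).
words = ["je", "veux", "je", "voudrais", "un", "billet"]
pos = ["PRO", "V", "PRO", "V", "DET", "N"]
pauses = [0.0, 0.35, 0.0, 0.0, 0.0, 0.0]
durations = [0.12, 0.30, 0.10, 0.42, 0.08, 0.33]
regions = ["reparandum", "reparandum", "repair", "repair", "fluent", "fluent"]

tokens = [{"word": w, "pos": p, "pause_after": pa, "duration": d}
          for w, p, pa, d in zip(words, pos, pauses, durations)]

# Feature sets for two of the compared conditions.
X_combined = [token_features(tokens, i) for i in range(len(tokens))]
X_acoustic = [token_features(tokens, i, use_linguistic=False)
              for i in range(len(tokens))]

# Label sequences for the two tasks.
y_task1 = task_labels(regions, 1)
y_task2 = task_labels(regions, 2)
```

Switching `use_linguistic` and `use_acoustic` on and off gives the acoustic-only, linguistic-only, and combined conditions that the abstract compares, while `task` selects between reparandum-only and reparandum-plus-repair labelling.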
