Exploiting weak-supervision for detecting and classifying Mandarin Non-Sentential Utterances in Conversations

Document record

Author: Chen Xin-Yi
Date: 2020

Collection: Open archives
Licence: info:eu-repo/semantics/OpenAccess




Cite this document

Chen Xin-Yi, "Exploiting weak-supervision for detecting and classifying Mandarin Non-Sentential Utterances in Conversations", Dépôt Universitaire de Mémoires Après Soutenance, ID: 10670/1.ztxjvn



Abstract

Conversations are full of short utterances that are syntactically incomplete yet pose no difficulty for understanding. These "non-sentential utterances" (NSUs) are the object of this study. They are short but efficient in the flow of conversation. Interpreting NSUs is relevant to linguistic research as well as to potential industrial applications such as dialogue systems. The two main tasks involving NSUs are their detection and their classification, and we aim to build a model that classifies NSUs automatically. In machine learning, one of the difficulties in building a model is the lack of training data. In this study, we therefore adopt a weak supervision tool, Snorkel, to classify NSUs automatically. To frame NSU classification, we review the literature on dialogue acts, because dialogue acts provide important information about an utterance and the computational approaches to classifying them closely resemble NSU classification. We also look at disfluencies, because they are short and can be confused with NSUs. Equipped with descriptive statistics of our data and a qualitative analysis of the NSU categories in our corpus, we attempt both NSU detection and classification within the Snorkel framework. We write labelling rules, based on a set of features drawn from our corpus, that assign labels to the data. We then adjust the model according to its performance measures and an error analysis. Finally, we summarise the results of our experiments and point out directions for future work.
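
To illustrate the idea of writing rules that attribute labels to unlabelled utterances, here is a minimal sketch in plain Python. It is not the thesis's rule set and it does not use the Snorkel library: the three rules, the part-of-speech tags, the length thresholds, and the example utterances are all hypothetical, and a simple majority vote stands in for the statistical label model that a weak supervision framework would normally fit.

```python
from collections import Counter

# Label values: a rule may abstain when it has no opinion.
ABSTAIN, NOT_NSU, NSU = -1, 0, 1


def lf_short(utt):
    # Hypothetical rule: very short utterances are candidate NSUs.
    return NSU if len(utt["tokens"]) <= 2 else ABSTAIN


def lf_has_verb(utt):
    # Hypothetical rule: a verb tag ("VV" is an assumed tag name)
    # suggests a full clause rather than an NSU.
    return NOT_NSU if "VV" in utt["pos"] else ABSTAIN


def lf_long(utt):
    # Hypothetical rule: long utterances are unlikely to be NSUs.
    return NOT_NSU if len(utt["tokens"]) >= 6 else ABSTAIN


LABELING_FUNCTIONS = [lf_short, lf_has_verb, lf_long]


def majority_vote(utt, lfs=LABELING_FUNCTIONS):
    """Combine rule votes; abstain when no rule fires or on a tie."""
    votes = [v for v in (lf(utt) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]


# Invented, pre-segmented and pre-tagged utterances for illustration only.
examples = [
    {"tokens": ["好", "的"], "pos": ["VA", "SP"]},
    {"tokens": ["我", "明天", "要", "去", "北京", "开会"],
     "pos": ["PN", "NT", "VV", "VV", "NR", "VV"]},
    {"tokens": ["去", "吗"], "pos": ["VV", "SP"]},
    {"tokens": ["他", "的", "书"], "pos": ["PN", "DEG", "NN"]},
]

labels = [majority_vote(u) for u in examples]
print(labels)  # [1, 0, -1, -1]
```

The third utterance shows why rules need to be revised after error analysis: two rules disagree, so the vote abstains, and the fourth is covered by no rule at all. Inspecting such conflicts and gaps is one way the rule set could be adjusted.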
