2020
This document is related to:
https://hdl.handle.net/20.500.13089/1chq
https://doi.org/10.4000/books.aaccademia
info:eu-repo/semantics/altIdentifier/isbn/979-12-80136-33-6
info:eu-repo/semantics/openAccess, https://www.openedition.org/12554
Samuel Louvan et al., "Simple Data Augmentation for Multilingual NLU in Task Oriented Dialogue Systems", Accademia University Press
Data augmentation has shown potential in alleviating data scarcity for Natural Language Understanding (e.g., slot filling and intent classification) in task-oriented dialogue systems. As prior work has mostly been evaluated on English datasets, we focus on five different languages and consider a setting where limited data are available. We investigate the effectiveness of non-gradient-based augmentation methods, involving simple text span substitutions and syntactic manipulations. Our experiments show that (i) augmentation is effective in all cases, particularly for slot filling; and (ii) it is beneficial for a joint intent-slot model based on multilingual BERT, both in limited-data settings and when the full training data is used.
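To make "simple text span substitution" concrete, the following is a minimal sketch of one plausible variant: replacing a slot value with another value of the same slot type seen elsewhere in the training data. It assumes utterances are token lists with BIO slot tags; the function names (`extract_spans`, `build_slot_values`, `substitute_slot`) and the toy data are hypothetical illustrations, not the authors' implementation, and the abstract does not specify the exact procedure.

```python
import random
from collections import defaultdict


def extract_spans(tags):
    """Return (start, end, slot) triples for each BIO span; end is exclusive."""
    spans = []
    start, slot = None, None
    for i, tag in enumerate(tags):
        # An I- tag continues the open span only if its slot type matches.
        inside = tag.startswith("I-") and slot == tag[2:]
        if start is not None and not inside:
            spans.append((start, i, slot))
            start, slot = None, None
        if tag.startswith("B-"):
            start, slot = i, tag[2:]
    if start is not None:
        spans.append((start, len(tags), slot))
    return spans


def build_slot_values(dataset):
    """Collect every observed value (as a token tuple) for each slot type."""
    values = defaultdict(set)
    for tokens, tags in dataset:
        for start, end, slot in extract_spans(tags):
            values[slot].add(tuple(tokens[start:end]))
    return values


def substitute_slot(tokens, tags, slot_values, rng):
    """Swap one randomly chosen slot span for a different value of the same slot.

    Returns the inputs unchanged when no substitution is possible.
    """
    spans = extract_spans(tags)
    if not spans:
        return tokens, tags
    start, end, slot = rng.choice(spans)
    original = tuple(tokens[start:end])
    # Sort so that the choice is reproducible for a fixed random seed.
    candidates = sorted(v for v in slot_values.get(slot, ()) if v != original)
    if not candidates:
        return tokens, tags
    new_value = list(rng.choice(candidates))
    # Re-tag the replacement, which may differ in length from the original.
    new_tags = ["B-" + slot] + ["I-" + slot] * (len(new_value) - 1)
    return (tokens[:start] + new_value + tokens[end:],
            tags[:start] + new_tags + tags[end:])


if __name__ == "__main__":
    # Toy, made-up training data (assumption, not from the paper).
    train = [
        (["fly", "to", "new", "york"], ["O", "O", "B-city", "I-city"]),
        (["fly", "to", "rome"], ["O", "O", "B-city"]),
    ]
    slot_values = build_slot_values(train)
    print(substitute_slot(*train[0], slot_values, random.Random(0)))
```

In this toy setup the first utterance has a single slot span and only one alternative value, so the substitution turns "new york" into "rome" and shrinks the tag sequence accordingly. Because the procedure only edits tokens and tags, it requires no model gradients, which is the sense of "non-gradient-based" assumed here.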