2022
This document is linked to:
https://hdl.handle.net/20.500.13089/1chx
https://doi.org/10.4000/books.aaccademia
ISBN: 979-12-80136-94-7
Open access, https://creativecommons.org/licenses/by-nc-nd/4.0/
Samuel Louvan et al., « Investigating Continued Pretraining for Zero-Shot Cross-Lingual Spoken Language Understanding », Accademia University Press
Spoken Language Understanding (SLU) in task-oriented dialogue systems involves both intent classification (IC) and slot filling (SF) tasks. The de facto method for zero-shot cross-lingual SLU consists of fine-tuning a pretrained multilingual model on English labeled data and then evaluating it on unseen languages. However, recent studies show that adding a second pretraining stage (continued pretraining) can improve performance in certain settings. This paper investigates the effectiveness of continued pretraining on unlabeled spoken language data for zero-shot cross-lingual SLU. We demonstrate that this relatively simple approach benefits both the SF and IC tasks across 8 target languages, especially those written in Latin script. We also find that a discrepancy between the languages used during pretraining and fine-tuning may introduce training instability, which can be alleviated through code-switching.
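As a loose illustration of the code-switching augmentation mentioned in the abstract, the sketch below is an assumption-laden example rather than the authors' implementation: it randomly substitutes English tokens in a training utterance with translations drawn from a bilingual lexicon before fine-tuning. The TOY_LEXICON, the code_switch function name, and the replacement ratio are all hypothetical; in practice the lexicon would come from a resource such as the MUSE dictionaries.

```python
import random

# Hypothetical toy English -> German lexicon; a real setup would load a full
# bilingual dictionary for each target language.
TOY_LEXICON = {
    "play": ["spielen"],
    "music": ["Musik"],
    "tomorrow": ["morgen"],
}

def code_switch(tokens, lexicon, ratio=0.3, seed=None):
    """Replace a random fraction of tokens with dictionary translations.

    Substitution is done token by token, so the original token order (and hence
    any slot-label alignment) is preserved.
    """
    rng = random.Random(seed)
    switched = []
    for tok in tokens:
        candidates = lexicon.get(tok.lower())
        if candidates and rng.random() < ratio:
            switched.append(rng.choice(candidates))
        else:
            switched.append(tok)
    return switched

# Example: augment an English SLU utterance before fine-tuning.
print(code_switch(["play", "some", "music", "tomorrow"], TOY_LEXICON, ratio=0.5, seed=0))
```

Mixing target-language tokens into the English fine-tuning data in this way exposes the model to the pretraining languages during fine-tuning, which is one plausible reading of how code-switching reduces the language discrepancy the abstract points to.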