Toward a Test Set of Dislocations in Persian for Neural Machine Translation

Document record

Date

18 December 2022

Collection

Open archives

License

info:eu-repo/semantics/OpenAccess



Cite this document

Behnoosh Namdarzadeh et al., « Toward a Test Set of Dislocations in Persian for Neural Machine Translation », HAL-SHS : linguistique, ID : 10670/1.4byrlr



Abstract

This paper describes a test set designed to analyse how dislocations are translated from Persian, intended for testing neural machine translation models. We first assessed how accurately dislocations can be detected automatically in the two Universal Dependencies treebanks for Persian. We then processed the available Persian treebanks with GREW (Bonfante et al., 2018) to build a dedicated test set containing examples of dislocations. Using aligned data available on OPUS (Tiedemann, 2016), we trained a Persian-to-English translation model with OpenNMT (Klein et al., 2017). We report how several systems (Google Translate, mBART-50 (Tang et al., 2020), Microsoft Bing and our in-house model) translate the test set into English. Finally, we discuss why dislocations in Persian provide an interesting testbed for neural machine translation.
