Using corpora for post-editing neural MT in highly specialised domains: The case of complex noun phrases

Document record

Date

9 September 2021

Collection

Open archives

Licences

http://creativecommons.org/licenses/by/ , info:eu-repo/semantics/OpenAccess



Cite this document

Natalie Kübler et al., "L'exploitation des corpus pour post-éditer la TA neuronale dans des domaines hyper-spécialisés: Le cas des groupes nominaux complexes", HAL SHS (Sciences de l’Homme et de la Société), ID: 10670/1.bf1c12...



Abstract

This study investigates the impact of neural machine translation (MT) on the translation of complex noun phrases (CNPs) in highly specialised domains and examines how specialised comparable corpora can help in the post-editing process. The research aims to assess MT performance on CNPs, evaluate the adequacy of our translation error typology for MT output in relation to post-editing, and determine whether corpus use enhances post-editing efficiency.

Within our established corpus-based translation training framework, we have integrated post-editing as a key component, following the European Master’s in Translation (EMT) Competence Framework (2017). To assess corpus use in post-editing, we compare four translation versions produced by students: (a) a human translation using only general resources, (b) a revised version using corpora, (c) post-edited MT output, and (d) a corpus-enhanced version of the post-edited text. Through error annotation and student feedback, we analyse MT’s influence on translation accuracy and on students’ editing behaviours.

Preliminary results indicate that MT-generated solutions for CNPs often require revision. While some errors involve simple lexical substitutions, others demand structural reconfiguration, which students tend to either over-edit or under-edit depending on word familiarity. Comparable corpora prove useful in guiding post-editing decisions, reinforcing their pedagogical value for LSP translation training. This study contributes to corpus-based translation research and highlights the need for targeted training in post-editing methodologies.
