POS Tagging and Lemmatization of Historical Varieties of Languages. The Challenge of Old Italian

Document record

Date

22 April 2024

Source

IJCoL

Relations

This document is related to:
info:eu-repo/semantics/reference/issn/2499-4553

Organisation

OpenEdition

Licences

info:eu-repo/semantics/openAccess , https://creativecommons.org/licenses/by-nc-nd/4.0/


Related subjects (English)

Tagging

Cite this document

Manuel Favaro et al., "POS Tagging and Lemmatization of Historical Varieties of Languages. The Challenge of Old Italian", IJCoL, ID: 10.4000/ijcol.1325


Abstract

The paper discusses the challenges of POS tagging and lemmatization of historical varieties of Italian. For both tasks, it reports the results of experiments carried out in a classical supervised domain adaptation scenario, using the diachronic and typologically differentiated corpus built for the "Vocabolario Dinamico dell’Italiano Moderno" (VoDIM). For POS tagging, the effectiveness of retrained models is illustrated and substantiated with quantitative data, with particular attention to the annotation results obtained for specific stages of language evolution, domains and textual genres. For lemmatization, several customized models have been developed, including lexicon-assisted ones and models retrained on annotated historical texts. For both tasks, a detailed error analysis is provided.
