A Diachronic Italian Corpus based on “L’Unità”

Document record

Date

3 September 2021

Collection

OpenEdition Books

Organization

OpenEdition

Licenses

https://www.openedition.org/12554 , info:eu-repo/semantics/openAccess



Related subjects (English)

Hours (Time)

Cite this document

Pierpaolo Basile et al., “A Diachronic Italian Corpus based on ‘L’Unità’”, Accademia University Press, ID: 10.4000/books.aaccademia.8245


Abstract

In this paper, we describe the creation of a diachronic corpus for Italian by exploiting the digital archive of the newspaper “L’Unità”. We automatically clean and annotate the corpus with PoS tags, lemmas, named entities and syntactic dependencies. Moreover, we compute frequency-based time series for tokens, lemmas and entities. We show some interesting corpus statistics taking into account the temporal dimension and describe some examples of usage of time series.
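
The abstract does not say how the frequency-based time series are computed. As a rough illustration only, the sketch below assumes one common approach: count each token per year and divide by the total number of tokens published in that year. The function name `frequency_time_series`, the `(year, tokens)` input shape, and the toy documents are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def frequency_time_series(docs):
    """Build per-year relative frequencies for each token.

    `docs` is an iterable of (year, tokens) pairs (a hypothetical input
    shape). Returns {token: {year: relative_frequency}}, where the
    relative frequency is the token count divided by the total number
    of tokens observed in that year.
    """
    counts_by_year = defaultdict(Counter)
    for year, tokens in docs:
        counts_by_year[year].update(tokens)

    series = defaultdict(dict)
    for year, counts in sorted(counts_by_year.items()):
        total = sum(counts.values())
        for token, count in counts.items():
            series[token][year] = count / total
    return dict(series)


# Toy, made-up documents used purely for illustration.
docs = [
    (1948, ["partito", "lavoro", "lavoro"]),
    (1948, ["partito"]),
    (1990, ["lavoro", "europa", "europa", "europa"]),
]
series = frequency_time_series(docs)
```

The same function could be applied to lemmas or named entities by passing those in place of surface tokens. Normalizing by the yearly total keeps years with different amounts of text comparable.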
