The Diachronic Spanish Sonnet Corpus (DISCO): TEI and Linked Open Data Encoding, Data Distribution and Metrical Findings

Document record

Date

26 June 2018

Relations

This document is related to:
info:eu-repo/semantics/altIdentifier/doi/10.5281/zenodo.1012567

This document is related to:
info:eu-repo/grantAgreement//679528/EU/Poetry Standardization and Linked Open Data/POSTDATA

Collection

Open archives

Licence

info:eu-repo/semantics/OpenAccess



Related subjects (English)

Enjambment

Cite this document

Pablo Ruiz et al., "The Diachronic Spanish Sonnet Corpus (DISCO): TEI and Linked Open Data Encoding, Data Distribution and Metrical Findings", HAL-SHS : littérature, ID: 10.5281/zenodo.1012567


Abstract (English)

We present a corpus of 4094 sonnets in Spanish by 1204 authors, from the 15th to the 19th centuries, extracted from HTML sources. The corpus was encoded in TEI. Author metadata that were not available in a standardized format in the sources, e.g. author gender or VIAF IDs, were systematically retrieved or inferred from the sources and added to the corpus. RDFa was used to render TEI semantics in the Linked Open Data paradigm. Scansion was annotated automatically with the ADSO Scansion System, and enjambment with our enjambment detection tool (ANJA). Stanza types were also annotated. The corpus covers both canonical and non-canonical authors, from Europe and Latin America. The range of authors and periods, the use of both TEI and RDFa for interoperability, and the combination of metrical and enjambment annotations go beyond previously available digital resources for the study of poetry in Spanish. The corpus thus contributes to an area where digital resources are scarce. We also present some literary analysis results that illustrate the type of research questions that can be answered with the corpus.
