2024
info:eu-repo/semantics/OpenAccess
Helena Bermúdez Sabel et al., « Procesamiento del lenguaje natural y fijación del texto. Experiencias en torno a la constitución de un corpus diacrónico de sonetos », HAL-SHS : littérature, ID : 10670/1.g2cd0i
We present work carried out during the development of DISCO, the Diachronic Spanish Sonnet Corpus, which comprises 4,530 sonnets in Spanish from Europe, Latin America and the Philippines, dating from the 15th to the 20th centuries. The resource offers versification annotations obtained automatically with tools based on Natural Language Processing (NLP). In this article, we show how the results of automatic annotation can be exploited to detect errors of textual transmission. Drawing on our experience with DISCO, we offer observations towards the creation of workflows assisted by NLP-based tools that help detect possible textual errors, allowing manual correction to focus on specific passages.