16 October 2023
This document is related to:
info:eu-repo/semantics/altIdentifier/doi/10.46298/jdmdh.10416
Matthias Gille Levenson, « Towards a general open dataset and model for late medieval Castilian text recognition (HTR/OCR) », HAL-SHS : histoire, ID : 10.46298/jdmdh.10416
This paper introduces a first open-access HTR/OCR gold corpus for late medieval Spanish sources, based on the allographetic transcription of more than 300 pages from several manuscripts of the Regimiento de los Prínçipes, together with a first set of general transcription and region/line segmentation models trained with Kraken. These models are evaluated on in-domain and out-of-domain data.