The Challenges of HTR Model Training: Feedback from the Project Donner le gout de l'archive a l'ere numerique

Document record

Date

6 December 2023

Document type
Scope
Identifier
Collection

Open archives

Licenses

info:eu-repo/semantics/openAccess




Cite this document

Beatrice Couture et al., « The Challenges of HTR Model Training: Feedback from the Project Donner le gout de l'archive a l'ere numerique », Episciences.org, ID : 10670/1.4p0i5i



Abstract

The arrival of handwriting recognition technologies offers new possibilities for research in heritage studies. However, it is now necessary to reflect on the experiences and the practices developed by research teams. Our use of the Transkribus platform since 2018 has led us to search for the most significant ways to improve the performance of our handwritten text recognition (HTR) models which are made to transcribe French handwriting dating from the 17th century. This article therefore reports on the impacts of creating transcribing protocols, using the language model at full scale and determining the best way to use base models in order to help increase the performance of HTR models. Combining all of these elements can indeed increase the performance of a single model by more than 20% (reaching a Character Error Rate below 5%). This article also discusses some challenges regarding the collaborative nature of HTR platforms such as Transkribus and the way researchers can share their data generated in the process of creating or training handwritten text recognition models.
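
For readers unfamiliar with the metric, the Character Error Rate mentioned above is commonly defined as the character-level edit distance between a reference transcription and the model output, divided by the length of the reference. The sketch below illustrates that common definition only. It is not the Transkribus implementation, and the function names `levenshtein` and `character_error_rate` are hypothetical names chosen for this example.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of character insertions, deletions and
    substitutions needed to turn `reference` into `hypothesis`."""
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalised by the reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Made-up strings for illustration, not data from the project.
    ground_truth = "abcd"
    model_output = "abxd"
    print(character_error_rate(ground_truth, model_output))
```

Under this definition, a Character Error Rate below 5% means fewer than five character edits per hundred reference characters.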
