A Modular and Automated Annotation Platform for Handwritings: Evaluation on Under-Resourced Languages

Document record

Date

2 September 2021

Relations

This document is related to:
info:eu-repo/semantics/altIdentifier/doi/10.1007/978-3-030-86334-0_33

Collection

Open archives




Cite this document

Chahan Vidal-Gorène et al., "A Modular and Automated Annotation Platform for Handwritings: Evaluation on Under-Resourced Languages", HAL-SHS: History, ID: 10.1007/978-3-030-86334-0_33



Abstract (EN)

There are today several approaches to automatic handwritten document analysis. HTR systems in particular achieve convincing results both in layout analysis and text recognition, and also in more recent tasks such as named-entity recognition, script identification, and manuscript dating. These systems are trained and evaluated on large open and specialized databases. Manual annotation and proofreading of handwritten documents is a key step in training such systems. However, it is a time-consuming task, especially when the formats required by the systems vary considerably, or when the interfaces do not handle several levels of information. We propose a new modular and collaborative online interface, ready to use, for multilevel annotation and quick viewing of handwritten and printed documents, including those in right-to-left languages. This interface supports the creation of customized projects, as well as the management, conversion, and export of data in the different state-of-the-art formats and standards. It includes automated tasks for layout analysis and text-line extraction, with extensive fine-tuning capabilities. We present this new interface through a case study: the creation of a database for Armenian, an under-resourced language with specific paleographical issues.
