Hierarchical Text Segmentation for Medieval Manuscripts

Fiche du document

Date

8 décembre 2020

Discipline
Type de document
Périmètre
Langue
Identifiants
Collection

Archives ouvertes

Licences

http://creativecommons.org/licenses/by/ , info:eu-repo/semantics/OpenAccess




Citer ce document

Amir Hazem et al., « Hierarchical Text Segmentation for Medieval Manuscripts », HAL-SHS : histoire, ID : 10670/1.22gxbb


Métriques


Partage / Export

Résumé En

In this paper, we address the segmentation of books of hours, Latin devotional manuscripts of the late Middle Ages, that exhibit challenging issues: a complex hierarchical entangled structure, variable content, noisy transcriptions with no sentence markers, and strong correlations between sections for which topical information is no longer sufficient to draw segmentation boundaries. We show that the main state-of-the-art segmentation methods are either inefficient or inapplicable for books of hours and propose a bottom-up greedy approach that considerably enhances the segmentation results. We stress the importance of such hierarchical segmentation of books of hours for historians to explore their overarching differences underlying conception about Church.

document thumbnail

Par les mêmes auteurs

Sur les mêmes sujets

Sur les mêmes disciplines

Exporter en