Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers

Document record

Date

19 January 2021

Collection

Open archives

Licences

info:eu-repo/semantics/openAccess




Cite this document

Raphaël Barman et al., "Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers", Episciences.org, ID: 10.46298/jdmdh.6107



Abstract

The massive amounts of digitized historical documents acquired over the last decades naturally lend themselves to automatic processing and exploration. Research work seeking to automatically process facsimiles and extract information thereby are multiplying with, as a first essential step, document layout analysis. If the identification and categorization of segments of interest in document images have seen significant progress over the last years thanks to deep learning techniques, many challenges remain with, among others, the use of finer-grained segmentation typologies and the consideration of complex, heterogeneous documents such as historical newspapers. Besides, most approaches consider visual features only, ignoring textual signal. In this context, we introduce a multimodal approach for the semantic segmentation of historical newspapers that combines visual and textual features. Based on a series of experiments on diachronic Swiss and Luxembourgish newspapers, we investigate, among others, the predictive power of visual and textual features and their capacity to generalize across time and sources. Results show consistent improvement of multimodal models in comparison to a strong visual baseline, as well as better robustness to high material variance.
