Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers

Computer Science - Computer Vision and Pattern Recognition; Computer Science - Computation and Language; Computer Science - Information Retrieval; Computer Science - Machine Learning

Related subjects

computerization

Cite this document

Raphaël Barman et al., "Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers", Episciences.org, ID: 10.46298/jdmdh.6107

Abstract

The massive amounts of digitized historical documents acquired over the last decades naturally lend themselves to automatic processing and exploration. Research work seeking to automatically process facsimiles and extract information thereby are multiplying with, as a first essential step, document layout analysis. If the identification and categorization of segments of interest in document images have seen significant progress over the last years thanks to deep learning techniques, many challenges remain with, among others, the use of finer-grained segmentation typologies and the consideration of complex, heterogeneous documents such as historical newspapers. Besides, most approaches consider visual features only, ignoring textual signal. In this context, we introduce a multimodal approach for the semantic segmentation of historical newspapers that combines visual and textual features. Based on a series of experiments on diachronic Swiss and Luxembourgish newspapers, we investigate, among others, the predictive power of visual and textual features and their capacity to generalize across time and sources. Results show consistent improvement of multimodal models in comparison to a strong visual baseline, as well as better robustness to high material variance.
