An accessible and transparent pipeline for publishing historical egodocuments

Résumé En

The automatization of the processing of documents oriented towards online publication and exploration by the humanities increases the rapidity of treatments like the transcription, but they should also be an opportunity to make the experimentation and the resulting corpora sustainable and reusable. The DAHN project (Dispositif de soutien à l’Archivistique et aux Humanités Numériques) relies on a joint interdisciplinary collaboration between Inria, the EHESS and the University of Le Mans. By taking the example of egodocuments, the project aims to create a ready-to-use digital and scientific publishing pipeline going from the material archive to an online publication. In this presentation, we introduce our method and guidelines for the processing of non-digital-native textual documents using open-source and easily hackable tools that guarantee visibility across an accessible pipeline, thus challenging the notions of a black box or scattered tools which tend to be hard to maintain in the long run.

document thumbnail

Par les mêmes auteurs

Sur les mêmes sujets

Sur les mêmes disciplines

Exporter en