OpenMethods introduction to: An end-to-end approach for extracting and segmenting high-variance references from pdf documents

Document record

Document type
Scope
Language
Identifiers
  • handle:  10670/1.rh0mow
  • https://openmethods.dariah.eu/2020/11/09/an-end-to-end-approach-for-extracting-and-segmenting-high-variance-references-from-pdf-documents-proceedings-of-the-18th-joint-conference-on-digital-libraries/
Organisation

DARIAH




Cite this document

Stefan Karcher, "OpenMethods introduction to: An end-to-end approach for extracting and segmenting high-variance references from pdf documents", OpenMethods: Highlighting Digital Humanities Methods and Tools, ID: 10670/1.rh0mow


Abstract

Introduction: Digital text analysis depends on one important thing: text that can be processed with little effort. Working with PDFs often leads to great difficulties, as Zeyd Boukhers, Shriharsh Ambhore, and Steffen Staab describe in their paper. Their goal is to extract references from PDF documents. A highlight of the workflow they describe is its very impressive precision rates. The paper thereby encourages further development of the process and its application as a "method" in the humanities.
