23 March 2014
This document is related to:
info:eu-repo/semantics/reference/issn/2197-7682
info:eu-repo/semantics/openAccess
Christof Schöch, "Enrichment by Elimination, or: How to turn HTML into simple TEI using Python", The Dragonfly's Gaze, ID: 10.58079/nwf0
There are lots of full-text repositories of literary works out there, be it the venerable Project Gutenberg (founded in 1971, when the internet was just a few dozen computers), a pioneer like Gallica (with increasing amounts of plain text in the 90-95% correct OCR range), or a crowdsourced effort like Wikisource (with nifty quality indicators). Closer to my geographical location are initiatives like TextGrid's Digitale Bibliothek and the Deutsches Textarchiv (both very professional and acad...