The textometric concept of active corpus: Illustration by an analysis scenario based on annotation then projection

Document record


6 July 2022

Document type

Open archives

Licenses: info:eu-repo/semantics/OpenAccess

Cite this document

Bénédicte Pincemin et al., "The textometric concept of active corpus: Illustration by an analysis scenario based on annotation then projection", HAL-SHS : histoire, ID: 10670/1.n1nrrc


Abstract

An active corpus makes it possible to run searches and statistical computations as if the corpus were reduced to a selection of words, while the full text remains visible in context displays. This is mainly implemented in paradigmatic processing, yet it may also concern syntagmatic processing or text display. Here we experiment with the active corpus in syntagmatic processing. A projection generates a new corpus in which the words are semantic tags that were automatically assigned to the original data in a first step. This new corpus makes it easy to explore tag sequences with any generic textometric tool available, however sparse the original annotation may be. This methodological path was applied to film grammar analysis on 10,000 archival descriptions of news reports. Nineteen camera shot and angle types were identified through queries and tagged. This annotation became the lexicon of the projected corpus, which was used to study shot sequences. The annotation and projection tools we ran are available as utilities in the TXM open-source software and should usefully serve many research projects.
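
To make the annotate-then-project path concrete, here is a minimal Python sketch. It does not use TXM or reproduce its utilities: the three tag patterns, the function names (`annotate`, `project`, `tag_bigrams`) and the two sample descriptions are all hypothetical, chosen only to illustrate how a sparse annotation can become the lexicon of a projected corpus whose tag sequences are then counted.

```python
import re
from collections import Counter

# Hypothetical tag lexicon (illustrative only, not the 19 types of the study).
TAG_PATTERNS = {
    "CLOSE_UP": re.compile(r"\bclose[- ]up\b", re.IGNORECASE),
    "WIDE_SHOT": re.compile(r"\bwide shot\b", re.IGNORECASE),
    "LOW_ANGLE": re.compile(r"\blow[- ]angle\b", re.IGNORECASE),
}


def annotate(text):
    """Step 1: run each query on the text and return (offset, tag) pairs in text order."""
    hits = []
    for tag, pattern in TAG_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), tag))
    return sorted(hits)


def project(text):
    """Step 2: build the projected text, whose 'words' are only the tags."""
    return [tag for _, tag in annotate(text)]


def tag_bigrams(projected_corpus):
    """Count adjacent tag pairs across all projected texts."""
    counts = Counter()
    for tags in projected_corpus:
        counts.update(zip(tags, tags[1:]))
    return counts


# Made-up sample descriptions, standing in for archival records.
descriptions = [
    "Wide shot of the harbour, then a close-up of the captain speaking to the crew.",
    "Low-angle view of the tower. Wide shot of the square. Close up on the clock.",
]

projected = [project(d) for d in descriptions]
bigrams = tag_bigrams(projected)

for pair, count in bigrams.most_common():
    print(pair, count)
```

In the projected corpus the untagged words no longer separate the tags, so two shots that are far apart in the original description become adjacent and can be studied as a sequence with ordinary n-gram or co-occurrence tools.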
