6 juillet 2022
http://creativecommons.org/licenses/by/ , info:eu-repo/semantics/OpenAccess
Bénédicte Pincemin et al., « The textometric concept of active corpus: Illustration by an analysis scenario based on annotation then projection », HAL-SHS : linguistique, ID : 10670/1.vow4gs
Active corpus provides the possibility to apply searching and statistical computing as if corpus were reduced to selected words, whereas full text still remains visible in context display. This is mainly implemented in paradigmatic processing, yet it may concern syntagmatic processing or text display too. Here we experiment active corpus in syntagmatic processing. A projection generates a new corpus, in which words are semantic tags that were automatically assigned in a first step to the original data. This new corpus makes it easy to explore tag sequences, with any generic textometric tool available, however sparse the original annotation may be. This methodological path was applied to film grammar analysis on 10,000 archival descriptions of news reports. 19 camera shot and angle types were ed through queries and tagged. This annotation became the lexicon of the projected corpus that was used to study shot sequences. The annotation and projection tools we have run are available as utilities in TXM open-sourcesoftware and should usefully serve many research projects.