September 12, 2017
Nicolas Ballier et al., « R-based strategies for DH in English Linguistics: a case study », Hyper Article en Ligne - Sciences de l'Homme et de la Société, ID : 10670/1.34gts2
This paper is a position statement advocating the adoption of the programming language R in a curriculum of English Linguistics. It illustrates one possible strategy for meeting the requirements of Natural Language Processing (NLP) for Digital Humanities (DH) studies within an established curriculum. R plays the role of a Trojan Horse for NLP and statistics, while promoting the acquisition of a programming language. We give an overview of existing practices implemented in an MA and PhD programme at the University of Paris Diderot in recent years. We emphasize completed aspects of the curriculum and detail existing teaching strategies rather than work in progress, but our last section alludes to work still under way, such as getting PhD students to write their own R packages. We describe our strategy, discuss better practices and teaching concepts, and present experiments in a curriculum. We express the needs of an initially limited NLP environment and provide directions for future DH curricular developments. We detail the challenges of teaching a non-CL audience, showing that some software suites can be integrated into a curriculum and outlining how some specific R packages contribute to the acquisition of NLP-based techniques and foster awareness of the need for statistical modelling.