Discovery of usage patterns in digital library web logs using Markov modeling

Document record

Date

12 July 2019

Collection

Open archives

Licence

info:eu-repo/semantics/OpenAccess



Cite this document

Adrien Nouvellet et al., "Discovery of usage patterns in digital library web logs using Markov modeling", HAL-SHS: sociology, ID: 10670/1.rdc3lv



Abstract

This paper proposes a family of tools based on Markov modeling to quantitatively analyze how people access the digital collections of the Bibliothèque nationale de France (BnF, the national library of France) through its web platform, Gallica. The aim is to provide the BnF with relevant information about the various usage patterns, to help it better understand its users and improve both its mediation efforts and the design of the website, in order to increase the general public's use of the collection of 4 million documents. To that end, the study focuses on the access logs retrieved from Gallica's Apache HTTP servers, which are converted into sequences of actions. To study user navigation behaviors, we propose to model the access log data using Markov models: Markov chains when considering sequences of actions without duration, and Markov processes when taking duration into account. Our models are used either to capture an average behavior through meaningful statistics or to cluster the data and exhibit various interpretable types of usage. The numerical results bring new insights into the way users interact with the platform, highlighting the mean duration of some actions such as interacting with the search engine or consulting documents. Although our approach requires additional information in order to properly interpret the models and the correlations they highlight, it is able to discover all types of behaviors, including the stealthiest and the most difficult to capture in traditional surveys, giving them their fair weight in terms of audience. We also show how this approach fits into a broader body of work combining data mining and ethnography.
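
To make the idea of modeling sequences of actions with a Markov chain more concrete, here is a minimal Python sketch. It is not the authors' implementation: the action names ("home", "search", "view_document", "download"), the toy sessions, and the function name estimate_transition_matrix are all hypothetical and chosen only for illustration. It assumes the access logs have already been converted into one list of actions per session, and it estimates first-order transition probabilities by simple counting.

```python
from collections import Counter, defaultdict


def estimate_transition_matrix(sessions):
    """Estimate first-order Markov chain transition probabilities.

    `sessions` is a list of action sequences (lists of strings).
    Returns a nested dict: matrix[current][following] = probability.
    """
    # Count how often each action is immediately followed by each other action.
    counts = defaultdict(Counter)
    for actions in sessions:
        for current, following in zip(actions, actions[1:]):
            counts[current][following] += 1

    # Normalize each row of counts so that it sums to 1.
    matrix = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        matrix[current] = {nxt: n / total for nxt, n in followers.items()}
    return matrix


# Hypothetical sessions; these are NOT real Gallica data.
sessions = [
    ["home", "search", "view_document", "view_document"],
    ["search", "view_document", "download"],
    ["home", "search", "search", "view_document"],
]

matrix = estimate_transition_matrix(sessions)
for current, row in matrix.items():
    print(current, row)
```

In this toy example, three of the four transitions out of "search" lead to "view_document", so the estimated probability of that transition is 0.75. Taking durations into account, as the abstract describes for Markov processes, would additionally require the time spent in each action, which this sketch does not cover.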
