An Online Community of Data Enthusiasts Collaborates to Seek, Share, and Make Sense of Data. Stvilia, B., & Gibradze, L. (2022). Seeking and sharing datasets in an online community of data enthusiasts. Library & Information Science Research 44(3). https://doi.org/10.1016/j.lisr.2022.101160

Document record

Date

2023

Relations

This document is related to:
Evidence Based Library and Information Practice ; vol. 18 no. 1 (2023)

Collection

Erudit

Organization

Consortium Érudit

License

© 2023 Jordan Patterson



Cite this document

Jordan Patterson, "An Online Community of Data Enthusiasts Collaborates to Seek, Share, and Make Sense of Data. Stvilia, B., & Gibradze, L. (2022). Seeking and sharing datasets in an online community of data enthusiasts. Library & Information Science Research 44(3). https://doi.org/10.1016/j.lisr.2022.101160", Evidence Based Library and Information Practice, ID: 10.18438/eblip30280



Abstract

Objective – To understand the major activities, tools, sources, and challenges of online communities focused on datasets.

Design – Content analysis informed by activity theory.

Setting – The r/Datasets subreddit, a web forum for sharing, seeking, and discussing datasets.

Subjects – 1232 “hot” or “top” discussion threads (1232 original posts and 6813 responding comments) first posted between 2010 and 2020.

Methods – The researchers used Reddit’s API to collect their sample of threads. Using a random subset of the sample, the researchers developed a coding scheme for content analysis, which identified major themes in the data. To control for quality during this process, each researcher coded half the subset independently; the two then evaluated their intercoder reliability together and discussed and resolved disagreements. The researchers also employed labelled latent Dirichlet allocation to construct topic models corresponding to the themes of the manual content analysis, producing for each topic a profile of the 100 terms most likely to appear in it. Finally, the researchers extracted URLs from threads in the sample to ascertain the types of information and data sources used by the community. Presenting their findings, the researchers discussed notable themes and proposed a metadata model for describing datasets, the Data Q&A metadata (DQAM) model.

Main Results – The r/Datasets community engages in three distinct activities: asking and answering questions, disseminating information, and community building. The closely related Q&A and dissemination activities shared themes of obtaining and aggregating data, sensemaking, collaborating and crowdsourcing, and data evaluation. Community members frequently discussed tools, competencies, and sources for data work. Major challenges for members of the community related to the general themes of data quality, accessibility, ethics, and legality. A proposed 16-element metadata schema should meet the needs of data enthusiasts.

Conclusion – The content analysis reveals a dedicated community engaged in an array of data-seeking and data-sharing activities. Data producers should be mindful of how their data can be accessed and used outside of their original professional or scholarly contexts.
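The URL-extraction step mentioned under Methods can be pictured with a small sketch. The abstract does not say how the researchers implemented it, so everything below is an assumption made for illustration: the regular expression, the function names `extract_domains` and `tally_sources`, and the example texts are all hypothetical and are not drawn from the study.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumed pattern: an http(s) URL runs until whitespace, a quote, or a closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_domains(text):
    """Return the host name of every URL found in one post or comment."""
    domains = []
    for raw_url in URL_PATTERN.findall(text):
        url = raw_url.rstrip(".,;:!?")  # drop trailing sentence punctuation
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]  # treat www.example.org and example.org as one source
        if host:
            domains.append(host)
    return domains


def tally_sources(texts):
    """Count how often each host name is linked across a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(extract_domains(text))
    return counts


# Made-up example texts, not taken from the study's sample.
sample_texts = [
    "See https://www.kaggle.com/datasets and http://data.gov.",
    "Another copy is here (mirror: https://kaggle.com/x)",
]
source_counts = tally_sources(sample_texts)
```

Grouping links by host name is only one possible way to characterize sources; classifying them into source types, as the summary describes, would still require a manual or rule-based step on top of counts like these.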
