FAIR Vocabularies in Population Research: report of the IUSSP-CODATA Working Group on FAIR Vocabularies

George Alter; Arofan Gregory; Steven McEachern; Darren Bell; Derek Burke; Robert Chen; Alessio Cardacino; Nada Chaya; David Barraclough; Rowan Brownlee; Tom Emery; Patrick Gerland; Cristina Giudici; Abdulla Gozalov; Edgardo Greising; Sanda Ionescu; Taina Jääskeläinen; Chifundo Kanjala; Vladimira Kantorova; Joseph Larmarange; Pablo Lattes; Jared Lyle; Diana Magnuson; Melissa Meinhart; Santosh Kumar Mishra; Romesh Silva; Thomas Spoorenberg; Philipp Ueffing; Jay Winkler

Document record

Authors

Date

January 2023

Document type

Reports

Scope

Publications

Language

English

Identifiers

Source

HALSHS: open archive in the human and social sciences (archive ouverte en Sciences de l’Homme et de la Société) - records without full text

Relations

This document is related to:
info:eu-repo/semantics/altIdentifier/doi/10.5281/zenodo.7818156

Collection

Open archives

Organization

Centre pour la communication scientifique directe

Licenses

http://creativecommons.org/licenses/by/ , info:eu-repo/semantics/OpenAccess

Related subjects (En)

Conation Volition Cetanā

Cite this document

George Alter et al., "FAIR Vocabularies in Population Research: report of the IUSSP-CODATA Working Group on FAIR Vocabularies", HALSHS: archive ouverte en Sciences de l’Homme et de la Société - records without full text, ID: 10.5281/zenodo.7818156

Abstract

This report describes the role of controlled vocabularies in the documentation and dissemination of demographic data in the light of the FAIR principles that all data should be “Findable, Accessible, Interoperable, and Reusable” by both humans and machines (Wilkinson et al., 2016). Population research is an empirically focused field with a long tradition of widely shared, easily accessible data collections. The FAIR Principles point to ways that this tradition can be enhanced by taking advantage of emerging standards and technologies. Our work builds on the “Ten Simple Rules for making a vocabulary FAIR” (Cox et al., 2021), prepared by a group formed at a workshop convened by CODATA and DDI to describe how a FAIR vocabulary will work with international standards for documenting and sharing social science data.

Controlled vocabularies play a central role in data sharing by associating data with concepts and by defining which categories or codes may be applied. FAIR vocabularies specify globally accessible persistent identifiers to distinguish data items that are the same from those that are different. Consider the most basic variable in demographic analysis: age. The Organization for Economic Cooperation and Development (OECD) has a list of 643 age categories, while the UN Population Division copes with more than 1100 age groups. If the meanings of variables in a dataset are only available through human-readable documentation, like a PDF, harmonizing data from two providers will remain a tedious manual process. However, if the age categories are linked to persistent identifiers in machine-actionable metadata, software can be programmed to harmonize age groupings. If these operations are performed across dozens of variables in hundreds of data sources, enormous amounts of human time will be saved.

Construction of the infrastructure for FAIR data has begun. Demographic concepts are already included in vocabularies developed by other disciplines, like medicine, with definitions that conflict with usage in population research. Therefore, there is a need for a FAIR vocabulary of demographic concepts endorsed by an authoritative institution in the field of population science.

IUSSP has a long history of working with the UN and other agencies to define demographic concepts (International Union for the Scientific Study of Population, 1954; Vincent, 1953). Those efforts currently exist in electronic forms (Demopædia and Demovoc) that provide a base for a multilingual FAIR Vocabulary of Demography. We argue that a FAIR Vocabulary of Demography will have important benefits for the population research community represented by IUSSP, and we conclude with recommendations for IUSSP and other important organizations.

In addition to summarizing the activities of the Working Group, this report is intended to serve as an introduction to the standards and infrastructure used to share social science data. Most demographers have never heard of URIs, SDMX, or DDI, even though they use services from the UN, ILO, OECD, CESSDA, IPUMS, and other organizations that depend on these standards. Understanding key features of the international data infrastructure will help IUSSP leadership to influence its development.
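
The age-category example in the abstract can be illustrated with a minimal Python sketch. It is not taken from the report: the identifiers under `https://example.org/` are placeholders, and the provider codes, the `BROADER` mapping, and the `harmonize` function are hypothetical names chosen for illustration. The sketch assumes each provider publishes metadata linking its local age codes to shared persistent identifiers, and that the vocabulary records which broader age group each identifier belongs to.

```python
from collections import defaultdict

# Hypothetical persistent identifiers (example.org is a placeholder namespace,
# not a real vocabulary service).
AGE_0_4 = "https://example.org/age/0-4"
AGE_5_9 = "https://example.org/age/5-9"
AGE_0_9 = "https://example.org/age/0-9"

# Assumed machine-actionable metadata: each provider maps its own local
# codes to the shared identifiers.
PROVIDER_A_CODES = {"Y0T4": AGE_0_4, "Y5T9": AGE_5_9}
PROVIDER_B_CODES = {"1": AGE_0_4, "2": AGE_5_9}

# Assumed vocabulary relation: the broader age group for each identifier.
BROADER = {AGE_0_4: AGE_0_9, AGE_5_9: AGE_0_9}


def harmonize(records, code_map, broader=BROADER):
    """Sum counts per broader age group, keyed by persistent identifier."""
    totals = defaultdict(int)
    for code, count in records:
        uri = code_map[code]  # resolve the local code to its shared identifier
        totals[broader.get(uri, uri)] += count  # roll up when a broader group exists
    return dict(totals)


# Two providers with different local codes end up on the same categories.
a = harmonize([("Y0T4", 120), ("Y5T9", 80)], PROVIDER_A_CODES)
b = harmonize([("1", 30), ("2", 70)], PROVIDER_B_CODES)
```

Because both results are keyed by the same identifiers, they can be compared or merged without anyone reading either provider's documentation by hand, which is the time saving the abstract describes.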
