Procode: A Machine-Learning Tool to Support (Re-)coding of Free-Texts of Occupations and Industries.

Document record

Date

7 January 2022

Relations

This document is related to:
info:eu-repo/semantics/altIdentifier/doi/10.1093/annweh/wxab037

This document is related to:
info:eu-repo/semantics/altIdentifier/pmid/34145882

This document is related to:
info:eu-repo/semantics/altIdentifier/eissn/2398-7316

This document is related to:
info:eu-repo/semantics/altIdentifier/urn/urn:nbn:ch:serval-BIB_24E830FACC567

Licences

info:eu-repo/semantics/openAccess , Copying allowed only for non-profit organizations , https://serval.unil.ch/disclaimer




Cite this document

N. Savic et al., "Procode: A Machine-Learning Tool to Support (Re-)coding of Free-Texts of Occupations and Industries.", Serveur académique Lausannois, ID: 10.1093/annweh/wxab037



Abstract

Procode is a free web tool that automatically codes occupational data (free texts) using Complement Naïve Bayes (CNB), a machine-learning technique. The paper describes the algorithm, its performance evaluation, and future goals for the tool's development. Almost 30 000 free texts with manually assigned codes from the French classification of occupations (PCS) and the French classification of activities (NAF) were used to train CNB. A 5-fold cross-validation found that Procode predicts the correct classification code in 57-81% of cases for PCS and 63-83% of cases for NAF. Procode also integrates recoding between two classifications. In the first version of Procode, however, this operation is only a simple lookup of recoding links in existing crosswalks. Future work will focus on collecting data to support automatic coding to other classifications and on establishing a more advanced method for recoding.
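To make the three ideas in the abstract concrete (CNB coding of free texts, k-fold cross-validation, and crosswalk lookup), here is a minimal Python sketch. It is not Procode's implementation: the whitespace tokenizer, the class and function names, the four toy training texts, and the labels `HEALTH`, `BUILD`, and `TARGET_*` are all hypothetical placeholders, not real PCS or NAF codes. The classifier follows the standard unnormalized Complement Naive Bayes formulation, in which each class is scored using token counts from all the other classes.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class ComplementNB:
    """Complement Naive Bayes over bag-of-words counts (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive (Laplace) smoothing

    def fit(self, texts, codes):
        per_class = defaultdict(Counter)  # code -> token counts in that class
        total = Counter()                 # token counts over all classes
        for text, code in zip(texts, codes):
            tokens = tokenize(text)
            per_class[code].update(tokens)
            total.update(tokens)
        vocab = sorted(total)
        n_total = sum(total.values())
        self.weights = {}
        for code, counts in per_class.items():
            # Complement statistics: token occurrences outside this class.
            n_comp = n_total - sum(counts.values())
            denom = n_comp + self.alpha * len(vocab)
            self.weights[code] = {
                tok: -math.log((total[tok] - counts[tok] + self.alpha) / denom)
                for tok in vocab
            }
        return self

    def predict(self, text):
        tokens = Counter(tokenize(text))

        def score(code):
            w = self.weights[code]
            # Tokens unseen in training contribute nothing to any class.
            return sum(n * w.get(tok, 0.0) for tok, n in tokens.items())

        return max(self.weights, key=score)


def k_fold_accuracy(texts, codes, k=5):
    """Mean accuracy over k interleaved folds (no shuffling, for brevity)."""
    accuracies = []
    for fold in range(k):
        train = [i for i in range(len(texts)) if i % k != fold]
        test = [i for i in range(len(texts)) if i % k == fold]
        model = ComplementNB().fit([texts[i] for i in train],
                                   [codes[i] for i in train])
        hits = sum(model.predict(texts[i]) == codes[i] for i in test)
        accuracies.append(hits / len(test))
    return sum(accuracies) / k


def recode(code, crosswalk):
    """Look up recoding links for a code; unmapped codes give an empty list."""
    return crosswalk.get(code, [])


# Hypothetical toy data: placeholder labels, not real classification codes.
TEXTS = ["infirmière hôpital", "infirmier de nuit hôpital",
         "maçon bâtiment", "ouvrier maçon chantier"]
CODES = ["HEALTH", "HEALTH", "BUILD", "BUILD"]
CROSSWALK = {"HEALTH": ["TARGET_H1", "TARGET_H2"], "BUILD": ["TARGET_B1"]}

model = ComplementNB().fit(TEXTS, CODES)
predicted = model.predict("infirmière de nuit")
links = recode(predicted, CROSSWALK)
```

With real data, `k_fold_accuracy(texts, codes, k=5)` would mirror the 5-fold evaluation mentioned in the abstract, although a practical version would shuffle and stratify the folds. A crosswalk lookup may return several target codes for one source code, which is one reason a plain lookup does not fully resolve recoding.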
