ISSA Pipeline

Document record

Date

27 November 2024

Relations

This document is related to:
https://doi.org/10.5281/zenodo.10381606

This document is related to:
https://hal.science/hal-03807744v1

This document is related to:
info:eu-repo/semantics/altIdentifier/doi/10.5281/zenodo.1423046

Collection

Open archives

License

http://www.apache.org/licenses/LICENSE-2.0




Cite this document

Anna Bobasheva et al., "ISSA Pipeline", HAL SHS (Sciences de l’Homme et de la Société), ID: 10.5281/zenodo.1423046



Abstract

The ISSA pipeline was developed by the ISSA project (https://issa.cirad.fr/). It orchestrates the automatic indexing of a scientific archive by extracting thematic descriptors and named entities from the full text of the articles, and linking them with terminological resources in Semantic Web format.

The repository consists of the tools, scripts and configuration files involved in each step of the pipeline:
- retrieve the articles' metadata from the archive's API;
- download and pre-process the PDF files of the articles;
- process the output to extract thematic descriptors and named entities;
- translate the output of each processing step into a unified, consistent RDF dataset;
- retrieve additional metadata from OpenAlex: topics, Sustainable Development Goals (SDGs), authorship with institutions;
- upload the resulting dataset to a triple store equipped with a SPARQL endpoint.
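As an illustration of the RDF translation step mentioned in the abstract, here is a minimal sketch of how an article's extracted descriptors and named entities could be serialized as N-Triples. It is not the ISSA implementation: the function names, the example.org URIs, and the choice of the dcterms:title, dcterms:subject and schema:mentions properties are all assumptions made for this example, and the actual ISSA data model may differ.

```python
from typing import Iterable, List

# Assumed vocabulary choices for this sketch (not taken from the ISSA repository).
DCT_TITLE = "http://purl.org/dc/terms/title"
DCT_SUBJECT = "http://purl.org/dc/terms/subject"
SCHEMA_MENTIONS = "http://schema.org/mentions"


def escape_literal(text: str) -> str:
    """Escape backslashes, double quotes and line breaks for an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def article_to_ntriples(
    article_uri: str,
    title: str,
    descriptor_uris: Iterable[str],
    entity_uris: Iterable[str],
) -> List[str]:
    """Return one N-Triples line per statement about the article (hypothetical helper)."""
    triples = [f'<{article_uri}> <{DCT_TITLE}> "{escape_literal(title)}" .']
    # Thematic descriptors are linked as subjects of the article.
    for uri in descriptor_uris:
        triples.append(f"<{article_uri}> <{DCT_SUBJECT}> <{uri}> .")
    # Named entities are linked as things the article mentions.
    for uri in entity_uris:
        triples.append(f"<{article_uri}> <{SCHEMA_MENTIONS}> <{uri}> .")
    return triples


if __name__ == "__main__":
    # All URIs below are placeholders, not identifiers from a real archive.
    lines = article_to_ntriples(
        "http://example.org/article/1",
        "Rice cultivation in river deltas",
        ["http://example.org/descriptor/rice"],
        ["http://example.org/entity/delta"],
    )
    print("\n".join(lines))
```

A file of such lines could then be loaded into a triple store and queried through its SPARQL endpoint. The loading mechanism depends on the store and is not shown here.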
