A semi-supervised Learning Approach to find equivalent long-string Organization Names

Document record

Date

11 October 2016

Collection

Open archives

License

info:eu-repo/semantics/OpenAccess



Cite this document

Frédérique Bordignon et al., « A semi-supervised Learning Approach to find equivalent long-string Organization Names », HAL-SHS : linguistique, ID : 10670/1.3auydy



Abstract

Background: A platform called Opalia has been built to offer free access to all publications from a laboratory over a given range of years. The platform indexes the corpus of scientific articles of a given lab. In the French research system, however, a lab brings together researchers from different organizations in a single unit, generally called a UMR, and authors can write the name of their laboratory in different ways. Aim: Sorting a noisy set of labels can be seen as a binary classification of strings into positives and negatives. We propose a cascade process in which some positive strings are tagged in order to build a relevant feature space that helps classify labels as good ones.
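To make the idea of building a feature space from a few tagged positive strings more concrete, here is a minimal Python sketch. It is not the authors' method: the abstract does not say which features or classifier are used, so character trigrams, the overlap score, and the threshold are all assumptions chosen for illustration. The laboratory names and the function names (`char_ngrams`, `build_feature_space`, `score`, `classify`) are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return the set of character n-grams of a lowercased, space-normalized string."""
    s = " ".join(text.lower().split())
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def build_feature_space(positives, n=3, min_count=2):
    """Keep the n-grams that occur in at least `min_count` tagged positive strings."""
    counts = Counter(g for p in positives for g in char_ngrams(p, n))
    return {g for g, c in counts.items() if c >= min_count}


def score(label, features, n=3):
    """Fraction of the label's n-grams that belong to the feature space."""
    grams = char_ngrams(label, n)
    if not grams:
        return 0.0
    return len(grams & features) / len(grams)


def classify(labels, positives, threshold=0.5, n=3):
    """Mark each label as positive (True) or negative (False). The threshold is an assumption."""
    features = build_feature_space(positives, n)
    return {lab: score(lab, features, n) >= threshold for lab in labels}


# Hypothetical tagged positives: variants of one invented laboratory name.
positives = [
    "Laboratoire Exemple de Physique, UMR 0000, CNRS, Univ. A",
    "Lab. Exemple de Physique (UMR 0000), CNRS",
    "Laboratoire Exemple de Physique UMR0000",
]

# Hypothetical noisy labels to sort.
candidates = [
    "Laboratoire Exemple de Physique, Univ. A, CNRS UMR 0000",
    "Department of Chemistry, Some Other Institute",
]

result = classify(candidates, positives)
for label, is_positive in result.items():
    print(is_positive, label)
```

In this sketch only the positive strings are tagged, and the feature space is limited to trigrams shared by several of them, so noise specific to a single variant is ignored. A real cascade would likely refine the decision in further stages.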
