Efficient bilingual lexicon extraction from comparable corpora based on formal concepts analysis

This document is related to:
info:eu-repo/semantics/altIdentifier/doi/10.1017/S135132492100022X

Collection

Open archives




Cite this document

Mohamed Chebel et al., "Efficient bilingual lexicon extraction from comparable corpora based on formal concepts analysis", HAL-SHS : linguistique, ID: 10.1017/S135132492100022X


Abstract (English)

Bilingual corpora are an essential resource used to cross the language barrier in multilingual natural language processing tasks. Among bilingual corpora, comparable corpora have been the subject of many studies as they are both frequent and easily available. In this paper, we propose to make use of formal concept analysis to first construct concept vectors which can be used to enhance comparable corpora through clustering techniques. We then show how one can extract bilingual lexicons of improved quality from these enhanced corpora. We finally show that the bilingual lexicons obtained can complement existing bilingual dictionaries and improve cross-language information retrieval systems.
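
The abstract does not say how the concept vectors are built, so the following is only a minimal sketch of what a formal concept is, not the authors' method. It assumes a toy context in which documents are the objects and terms are the attributes; the data and the function names (`common_terms`, `docs_with`, `formal_concepts`) are hypothetical. The enumeration is the naive one that closes every subset of documents, which is only practical for very small contexts.

```python
from itertools import combinations

# Toy formal context (hypothetical data): document -> set of terms it contains.
context = {
    "d1": {"bank", "money"},
    "d2": {"bank", "river"},
    "d3": {"bank", "money", "loan"},
}

def common_terms(docs, context):
    """Terms shared by every document in docs (all terms if docs is empty)."""
    result = set().union(*context.values())
    for d in docs:
        result &= context[d]
    return result

def docs_with(terms, context):
    """Documents that contain every term in terms."""
    return {d for d, doc_terms in context.items() if terms <= doc_terms}

def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of documents."""
    concepts = set()
    names = sorted(context)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_terms(subset, context)   # terms common to the subset
            extent = docs_with(intent, context)      # all documents sharing them
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts

if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(context), key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
```

Each resulting pair groups a maximal set of documents with the maximal set of terms they all share. How such concepts are turned into vectors and used for clustering is left to the paper itself.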
