Lexicon induction and part-of-speech tagging of non-resourced languages without any bilingual resources

Document record

Date

13 September 2013

Collection

Open archives

License

info:eu-repo/semantics/OpenAccess



Cite this document

Yves Scherrer et al., "Lexicon induction and part-of-speech tagging of non-resourced languages without any bilingual resources", HAL-SHS : linguistique, ID: 10670/1.zr02w9



Abstract (English)

We introduce a generic approach for transferring part-of-speech annotations from a resourced language to a non-resourced but etymologically close language. We first infer a bilingual lexicon between the two languages using methods based on character similarity, frequency similarity, and context similarity. We then assign part-of-speech tags to these bilingual lexicon entries and annotate the remaining words on the basis of suffix analogy. We evaluate our approach on five language pairs of the Iberian Peninsula, reaching up to 95% precision on the lexicon induction task and up to 85% tagging accuracy.
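
The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It uses only character similarity (approximated here with `difflib.SequenceMatcher`) and leaves out the frequency and context similarity measures. The function names, the 0.7 threshold, the suffix length limit, and the toy Spanish/Catalan word lists are all assumptions made for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher


def char_similarity(a, b):
    """Character-level similarity in [0, 1]; difflib's ratio is an illustrative stand-in."""
    return SequenceMatcher(None, a, b).ratio()


def induce_lexicon(target_words, source_words, threshold=0.7):
    """Pair each target word with its most similar source word, if similar enough."""
    lexicon = {}
    for t in target_words:
        best = max(source_words, key=lambda s: char_similarity(t, s))
        if char_similarity(t, best) >= threshold:
            lexicon[t] = best
    return lexicon


def tag_by_suffix(word, tagged_words, max_suffix=4):
    """Guess a tag from already-tagged words that share the longest possible suffix."""
    for n in range(min(max_suffix, len(word) - 1), 0, -1):
        suffix = word[-n:]
        votes = Counter(tag for w, tag in tagged_words.items() if w.endswith(suffix))
        if votes:
            return votes.most_common(1)[0][0]
    return None  # no tagged word shares any suffix with this word


def transfer_tags(target_words, source_tags, threshold=0.7):
    """Stage 1: tag target words via the induced lexicon. Stage 2: suffix analogy."""
    lexicon = induce_lexicon(target_words, list(source_tags), threshold)
    seed = {t: source_tags[s] for t, s in lexicon.items()}
    tagged = dict(seed)
    for t in target_words:
        if t not in tagged:
            tagged[t] = tag_by_suffix(t, seed)
    return tagged


# Toy data (hypothetical): a tagged Spanish lexicon and untagged Catalan words.
source_tags = {"nación": "NOUN", "cantar": "VERB", "rápido": "ADJ"}
target_words = ["nació", "cantar", "ràpid", "parlar"]
print(transfer_tags(target_words, source_tags))
```

In this toy run, "parlar" has no sufficiently similar Spanish counterpart in the list, so it receives its tag from "cantar" through the shared suffix "-ar".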
