ParCoLab: A Parallel Corpus for Serbian, French and English

Document record

Date

29 July 2017

Collection

Open archives




Cite this document

Aleksandra Miletic et al., "ParCoLab: A Parallel Corpus for Serbian, French and English", HAL-SHS : linguistique, ID: 10670/1.ee1e8x



Abstract

ParCoLab is a trilingual parallel corpus containing texts in Serbian, French and English. It is developed at the CLLE-ERSS research unit (UMR 5263 CNRS) at the University of Toulouse, France, in collaboration with the Department of Romance Studies at the University of Belgrade, Serbia. Serbian being one of the less-resourced European languages, this is an important step towards the creation of freely accessible corpora and NLP tools for this language. Our main goal is to provide the scientific community with a high-quality resource that can be used in a wide range of applications, such as contrastive linguistic studies, NLP research, machine and computer-assisted translation, translation studies, second language learning and teaching, and applied lexicography. The corpus currently contains 7.1M tokens, mainly from literary works, but corpus extension and diversification efforts are ongoing. ParCoLab can be queried online, and a part of it is available for download.
