Hate Speech Detection with Machine-Translated Data: The Role of Annotation Scheme, Class Imbalance and Undersampling

Computational Linguistics; Fine-grained sentiment analysis; Distributional Semantics; Quantitative Linguistic Investigations; Gender Bias; Depression from Social Media; Online Hate Speech; Automatic Sarcasm Detection; TrAVaSI; AriEmozione; AEREST; COVID-19; Linguistic Ostracism in Social Networks; Multilingual NLU; E3C Project; DistilBERT; Twitter during Pandemic

Related subjects (En)

Skills training; Speech, Hate; Group libel; Group defamation; Defamation against groups; Racist speech

Cite this document

Camilla Casula et al., « Hate Speech Detection with Machine-Translated Data: The Role of Annotation Scheme, Class Imbalance and Undersampling », Accademia University Press, ID : 10.4000/books.aaccademia.8345

Abstract

While using machine-translated data for supervised training can alleviate data sparseness when dealing with less-resourced languages, it is important that the source data are not only correctly translated but also follow the same annotation scheme, and possibly the same class balance, as the smaller dataset in the target language. We therefore present an evaluation of hate speech detection in Italian using data machine-translated from English, comparing three settings in order to understand the impact of training size, class distribution and annotation scheme.
