PoliTeam @ AMI: Improving Sentence Embedding Similarity with Misogyny Lexicons for Automatic Misogyny Identification in Italian Tweets

Document record

Date

11 May 2021

Collection

OpenEdition Books

Organisation

OpenEdition

Licences

https://creativecommons.org/licenses/by-nc-nd/4.0/ , info:eu-repo/semantics/openAccess



Related subjects

Women-hating

Cite this document

Giuseppe Attanasio et al., "PoliTeam @ AMI: Improving Sentence Embedding Similarity with Misogyny Lexicons for Automatic Misogyny Identification in Italian Tweets", Accademia University Press, ID: 10.4000/books.aaccademia.6807


Abstract (En, It)

We present a multi-agent classification solution for identifying misogynous and aggressive content in Italian tweets. A first agent uses modern sentence embedding techniques to encode tweets and an SVM classifier to produce initial labels. A second agent, based on TF-IDF and Italian misogyny lexicons, is used alongside it to refine the first agent's uncertain predictions. We evaluate our approach in the Automatic Misogyny Identification (AMI) shared task of the EVALITA 2020 campaign. Results show that TF-IDF and lexicons effectively improve the supervised agent trained on sentence embeddings.

Presentiamo un classificatore multi-agente per identificare tweet italiani misogini e aggressivi. Un primo agente codifica i tweet con Sentence Embedding e una SVM per produrre le etichette iniziali. Un secondo agente, basato su TF-IDF e lessici misogini, è usato per coadiuvare il primo agente nelle predizioni incerte. Applichiamo la soluzione al task AMI della campagna EVALITA 2020. I risultati mostrano che TF-IDF e i lessici migliorano le performance del primo agente addestrato su sentence embedding.
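The two-agent idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the first agent (sentence embeddings plus SVM) is represented only by a probability `p_first` supplied by the caller, and the uncertainty band (`low`, `high`), the lexicon threshold (`lex_threshold`), the TF-IDF formula, and the lexicon scoring rule are all assumptions chosen for illustration. The lexicon entries are neutral placeholders, not terms from any real lexicon.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {term: tf-idf weight} dict per tokenised document.

    Assumed formula: relative term frequency times (log(N / df) + 1).
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * (math.log(n_docs / doc_freq[term]) + 1.0)
            for term, count in counts.items()
        })
    return weights


def lexicon_score(weights, lexicon):
    """Share of a document's TF-IDF mass carried by lexicon terms (0.0 to 1.0)."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    return sum(w for term, w in weights.items() if term in lexicon) / total


def final_label(p_first, lex_score, low=0.4, high=0.6, lex_threshold=0.2):
    """Combine both agents (hypothetical rule).

    If the first agent's probability falls inside the uncertainty band
    [low, high], defer to the lexicon-based second agent; otherwise keep
    the first agent's decision.
    """
    if low <= p_first <= high:
        return lex_score >= lex_threshold
    return p_first > 0.5


# Placeholder data: "termine_a" and "termine_b" stand in for lexicon entries.
lexicon = {"termine_a", "termine_b"}
docs = [
    ["ciao", "termine_a"],
    ["buongiorno", "a", "tutti"],
    ["termine_b", "termine_a", "ciao"],
]
weights = tfidf_weights(docs)
scores = [lexicon_score(w, lexicon) for w in weights]

# Hypothetical first-agent probabilities for the three tweets above.
first_agent_probs = [0.55, 0.45, 0.90]
labels = [final_label(p, s) for p, s in zip(first_agent_probs, scores)]
```

In this sketch the second agent only changes the outcome for tweets whose first-agent probability lies in the uncertainty band, which mirrors the abstract's description of improving the first agent "on uncertain predictions"; how the actual system measures uncertainty is not stated in this record.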
