Domain Adaptation for Text Classification with Weird Embeddings

Keywords: Computational Linguistics; Fine-grained sentiment analysis; Distributional Semantics; Quantitative Linguistic Investigations; Gender Bias; Depression from Social Media; Online Hate Speech; Automatic Sarcasm Detection; TrAVaSI; AriEmozione; AEREST; COVID-19; Linguistic Ostracism in Social Networks; Multilingual NLU; E3C Project; DistilBERT; Twitter during Pandemic

Related subjects

Knowledge, Classification of

Cite this document

Valerio Basile, "Domain Adaptation for Text Classification with Weird Embeddings", Accademia University Press, ID: 10.4000/books.aaccademia.8250

Abstract

Pre-trained word embeddings are often used to initialize deep learning models for text classification, as a way to inject precomputed lexical knowledge and boost the learning process. However, such embeddings are usually trained on generic corpora, whereas text classification tasks are often domain-specific. We propose a fully automated method for adapting pre-trained word embeddings to any given classification task, requiring no resources beyond the original training set. The method builds on the concept of word weirdness, which we extend to score the words in the training set according to how characteristic they are with respect to the labels of a text classification dataset. These polarized weirdness scores are then used to update the word embeddings so that they reflect task-specific semantic shifts. Our experiments show that the method improves performance on several text classification tasks in different languages.
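The scoring step described in the abstract can be illustrated with a minimal sketch. It assumes the usual reading of weirdness as a ratio of relative frequencies, here taken between the texts carrying one label and all other texts in the training set. The function name `polarized_weirdness`, the whitespace tokenization, and the add-one smoothing are assumptions made for illustration and are not taken from the paper. The sketch covers only the scoring, not the embedding update.

```python
from collections import Counter


def polarized_weirdness(texts, labels, target_label, smoothing=1.0):
    """Score each word by how characteristic it is of `target_label`.

    Assumed definition: relative frequency of the word in texts with
    `target_label`, divided by its relative frequency in all other texts.
    Additive smoothing (an assumption) avoids division by zero.
    """
    in_counts, out_counts = Counter(), Counter()
    for text, label in zip(texts, labels):
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        if label == target_label:
            in_counts.update(tokens)
        else:
            out_counts.update(tokens)

    vocab = set(in_counts) | set(out_counts)
    in_total = sum(in_counts.values()) + smoothing * len(vocab)
    out_total = sum(out_counts.values()) + smoothing * len(vocab)

    return {
        word: ((in_counts[word] + smoothing) / in_total)
        / ((out_counts[word] + smoothing) / out_total)
        for word in vocab
    }


# Toy training set with binary labels (invented data, for illustration only).
texts = ["great movie great cast", "awful movie", "great fun", "awful boring plot"]
labels = [1, 0, 1, 0]
scores = polarized_weirdness(texts, labels, target_label=1)

# Words sorted from most to least characteristic of label 1.
ranking = sorted(scores, key=scores.get, reverse=True)
```

Under this reading, a score above 1 marks a word as more typical of the target label than of the rest of the training set, and a score below 1 marks the opposite. A word such as "movie", which occurs under both labels, stays close to 1.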
