From GRAPPA to RoBERTa, a huge step forward in inferring sentiments and opinions from Natural Language in Marketing, Applications to BigData from a Covid19 Tweets Collection

Document record

Date

18 January 2024

Collection

Open archives

Licence

info:eu-repo/semantics/OpenAccess



Cite this document

Michel Calciu et al., “From GRAPPA to RoBERTa, a huge step forward in inferring sentiments and opinions from Natural Language in Marketing, Applications to BigData from a Covid19 Tweets Collection”, HALSHS : archive ouverte en Sciences de l’Homme et de la Société, ID: 10670/1.ef756f...



Abstract (English)

GRAPPA and RoBERTa are acronyms that may sound funny. The first recalls a well-known Italian drink, while the second might bring to mind “Dicke Bertha”, the famous WW1 cannon designed to blow up 3-meter-thick concrete walls. In fact, they stand for two ways of inferring sentiments and opinions from natural language. GRAPPA is the name we gave to our GeneRal Approach for Parallel Processing Annotation, a method that allowed us in a previous paper (Calciu et al., 2021) to significantly reduce computing time for lexicon-based sentiment annotation of tweets. RoBERTa is a variant of BERT, a Transformer-based deep learning technology that blew up the “walls” of Natural Language Processing (NLP). The key point of this research is therefore to demonstrate the huge step forward made in inferring sentiments and opinions when moving from lexicon-based annotations to AI Transformer-based Deep Learning (DL) approaches like BERT. Besides testing the advantages of DL-based contextual word embeddings over context-ignoring methods that treat text as a “bag of words”, we review sentiment analysis of COVID-19-related Twitter data as a specialized field, owing to the imposed conciseness of tweets and the “disaster” represented by the pandemic. The potential of our collection of 2.6 billion tweets for transfer learning is discussed in light of the numerous state-of-the-art pre-trained DL models and labeled datasets on the subject that are freely available in specialized repositories on the Web.
