18 January 2024
info:eu-repo/semantics/OpenAccess
Michel Calciu et al., « From GRAPPA to RoBERTa, a huge step forward in inferring sentiments and opinions from Natural Language in Marketing, Applications to BigData from a Covid19 Tweets Collection », HALSHS : archive ouverte en Sciences de l’Homme et de la Société, ID : 10670/1.ef756f...
GRAPPA and RoBERTa are acronyms that may sound funny. The first recalls a well-known Italian drink, while the second may bring to mind “Dicke Bertha”, the famous WW1 cannon designed to blow up concrete walls 3 meters thick. In fact, they stand for two ways of inferring sentiments and opinions from natural language. GRAPPA is the name we gave to our GeneRal Approach for Parallel Processing Annotation, a method that allowed us, in a previous paper (Calciu et al., 2021), to significantly reduce computing time for lexicon-based sentiment annotation of tweets. RoBERTa is a variant of BERT, a Transformer-based deep learning technology that blew up the “walls” of Natural Language Processing (NLP). The key point of this research is therefore to demonstrate the huge step forward made when moving from lexicon-based annotations to Transformer-based Deep Learning (DL) approaches like BERT for inferring sentiments and opinions. Besides testing the advantages of DL-based contextual word embeddings over context-ignoring methods that treat text as a “bag of words”, we review sentiment analysis of COVID-19-related Twitter data as a specialized field, owing to the imposed conciseness of tweets and the “disaster” represented by the pandemic. The potential of our collection of 2.6 billion tweets for transfer learning is discussed in light of the numerous state-of-the-art pre-trained DL models and labeled datasets on the subject that are freely available in specialized repositories on the Web.
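To make the “bag of words” limitation concrete, the following minimal Python sketch scores two tweets with a toy sentiment lexicon. It is an illustration only and not the GRAPPA implementation: the lexicon `TOY_LEXICON`, the helper names `bag_of_words` and `lexicon_score`, and the example sentences are all hypothetical. Because word order is discarded, two sentences with opposite meanings receive the same score, which is the kind of case contextual embeddings are meant to handle.

```python
from collections import Counter

# Hypothetical toy lexicon (assumption for illustration);
# real sentiment lexicons contain thousands of scored entries.
TOY_LEXICON = {"good": 1, "great": 2, "bad": -1, "terrible": -2}


def bag_of_words(text):
    """Reduce a text to unordered word counts, discarding word order."""
    return Counter(text.lower().split())


def lexicon_score(text):
    """Sum the lexicon polarity of every word occurrence in the text."""
    counts = bag_of_words(text)
    return sum(TOY_LEXICON.get(word, 0) * n for word, n in counts.items())


# Two hypothetical tweets with opposite meanings but identical word counts.
tweet_a = "the vaccine is not bad it is good"
tweet_b = "the vaccine is not good it is bad"

print(bag_of_words(tweet_a) == bag_of_words(tweet_b))
print(lexicon_score(tweet_a), lexicon_score(tweet_b))
```

Both tweets map to the same word counts, so any scorer built only on those counts cannot tell them apart.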