Why Can Computers Understand Natural Language?: The Structuralist Image of Language Behind Word Embeddings

This document is related to:
info:eu-repo/grantAgreement//839730/EU/Towards a Theory of Mathematical Signs Based on the Automatic Treatment of Mathematical Corpora/SemioMaths

Collection

Open archives

Organization

Centre pour la communication scientifique directe

Licenses

http://creativecommons.org/licenses/by/ , info:eu-repo/semantics/OpenAccess

Keywords

Word Embeddings; Natural Language Processing; word2vec; Neural Networks; Philosophy of Language; Matrix Models; Distributional Hypothesis; Structuralism

Related subjects

Language (New words, slang, etc.)

Cite this document

Juan Luis Gastaldi, "Why Can Computers Understand Natural Language?: The Structuralist Image of Language Behind Word Embeddings", HAL-SHS: philosophie, ID: 10.1007/s13347-020-00393-9

Abstract

The present paper intends to draw out the conception of language implied in the technique of word embeddings that supported the recent development of deep neural network models in computational linguistics. After a preliminary presentation of the basic functioning of elementary artificial neural networks, we introduce the motivations and capabilities of word embeddings through one of their pioneering models, word2vec. To assess the remarkable results of the latter, we inspect the nature of its underlying mechanisms, which have been characterized as the implicit factorization of a word-context matrix. We then discuss the ordinary association of the "distributional hypothesis" with a "use theory of meaning", often invoked to justify the theoretical basis of word embeddings, and contrast it with the theory of meaning stemming from those mechanisms through the lens of matrix models (such as VSMs and DSMs). Finally, we trace back the principles of their possible consistency through Harris's original distributionalism up to the structuralist conception of language of Saussure and Hjelmslev. Beyond giving non-specialist readers access to the technical literature and state of the art in the field of Natural Language Processing, the paper seeks to reveal the conceptual and philosophical stakes involved in the recent application of new neural network techniques to the computational treatment of language.
