Measuring Language Development From Child-centered Recordings

Document record

Date

20 August 2023

Relations

This document is related to:
info:eu-repo/semantics/altIdentifier/doi/10.21437/Interspeech.2023-1569

This document is related to:
info:eu-repo/grantAgreement//101001095/EU/Experience Effects on early language acquisition/ExELang

Collection

Open archives

License

info:eu-repo/semantics/OpenAccess




Cite this document

Yaya Sy et al., "Measuring Language Development From Child-centered Recordings", HAL-SHS: linguistique, ID: 10.21437/Interspeech.2023-1569



Abstract

Standard ways to measure child language development from spontaneous corpora rely on detailed linguistic descriptions of a language as well as exhaustive transcriptions of the child's speech, which today can only be done through costly human labor. We tackle both issues by proposing (1) a new language development metric (based on entropy) that does not require linguistic knowledge other than having a corpus of text in the language in question to train a language model, and (2) a method to derive this metric directly from speech based on a smaller text-speech parallel corpus. Here, we present descriptive results on an open archive including data from six English-learning children as a proof of concept. We show that our entropy metric documents a gradual convergence of children's speech towards adults' speech as a function of age, and that it also correlates moderately with lexical and morphosyntactic measures derived from morphologically parsed transcriptions. The source code of the experiments is released at https://github.com/yaya-sy/EntropyBasedCLDMetrics
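
To make the idea of an entropy-based metric concrete, here is a minimal sketch in Python. It is not the authors' implementation, which is in the repository linked above. The abstract does not specify the language model or the exact formula, so this sketch assumes a toy character-bigram model with add-one smoothing, trained on a few invented adult-like sentences. The function names `train_bigram_model` and `utterance_entropy` are hypothetical. The score is the average negative log2-probability per symbol, so a lower value means the utterance looks more like the training text.

```python
import math
from collections import Counter

def train_bigram_model(corpus):
    """Count character bigrams and their left contexts in a list of sentences."""
    bigrams, contexts, vocab = Counter(), Counter(), set()
    for sentence in corpus:
        chars = ["<s>"] + list(sentence) + ["</s>"]
        vocab.update(chars)
        for prev, cur in zip(chars, chars[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, vocab

def utterance_entropy(utterance, model):
    """Average negative log2-probability per symbol (in bits), add-one smoothed."""
    bigrams, contexts, vocab = model
    chars = ["<s>"] + list(utterance) + ["</s>"]
    v = len(vocab) + 1  # one extra slot shared by all symbols unseen in training
    total = 0.0
    for prev, cur in zip(chars, chars[1:]):
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + v)
        total -= math.log2(p)
    return total / (len(chars) - 1)

# Invented toy data (an assumption for illustration, not from the paper).
adult_text = ["the dog is running", "the cat is sleeping", "i want the ball"]
model = train_bigram_model(adult_text)

adult_like = utterance_entropy("the dog is sleeping", model)
child_like = utterance_entropy("goggy wun", model)
print(f"adult-like: {adult_like:.2f} bits, child-like: {child_like:.2f} bits")
```

Under these assumptions the adult-like utterance receives the lower score, which mirrors the kind of convergence towards adult speech that the abstract describes, although on a far simpler model and on made-up data.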
