1997
This document is related to:
info:eu-repo/semantics/altIdentifier/doi/10.1109/ASRU.1997.659133
info:eu-repo/semantics/OpenAccess
Brigitte Bigi et al., "Combined models for topic spotting and topic-dependent language modeling", HAL-SHS: information, communication and library sciences, ID: 10.1109/ASRU.1997.659133
A new statistical method for language modeling and spoken document classification is proposed. It is based on a mixture of topic-dependent probabilities. Each topic-dependent probability is in turn a mixture of n-gram probabilities and the probability of Kullback-Leibler (KL) distances between keyword unigrams and the distribution obtained from the content of a cache memory. Experimental results on topic classification, using a corpus of 60 million words from the French newspaper Le Monde, show the excellent performance of the cache memory and its complementary role in providing different statistics for the decision process.
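
The abstract gives no formulas, so the following is only a minimal sketch of the KL-based topic-spotting idea it describes: compare the keyword distribution estimated from a cache of recent words with each topic's keyword unigram distribution, and pick the topic with the smallest divergence. The function names (`cache_distribution`, `kl_divergence`, `spot_topic`), the additive smoothing constant, the direction of the divergence (cache relative to topic), and the toy probabilities are all assumptions made for illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def cache_distribution(cache_words, vocabulary, smoothing=1e-3):
    """Estimate a keyword distribution from the cache (hypothetical helper).

    Words outside the keyword vocabulary are ignored. Additive smoothing
    (an assumption, not from the paper) keeps every probability non-zero.
    """
    counts = Counter(w for w in cache_words if w in vocabulary)
    total = sum(counts.values()) + smoothing * len(vocabulary)
    return {w: (counts[w] + smoothing) / total for w in vocabulary}


def kl_divergence(p, q, floor=1e-9):
    """KL divergence D(p || q) in nats between two keyword distributions."""
    return sum(
        p[w] * math.log(p[w] / max(q.get(w, 0.0), floor))
        for w in p
        if p[w] > 0
    )


def spot_topic(cache_words, topic_unigrams):
    """Return the topic whose keyword unigrams are closest to the cache."""
    vocabulary = set()
    for unigrams in topic_unigrams.values():
        vocabulary.update(unigrams)
    cache_dist = cache_distribution(cache_words, vocabulary)
    return min(
        topic_unigrams,
        key=lambda topic: kl_divergence(cache_dist, topic_unigrams[topic]),
    )


# Toy data, invented for illustration only.
topic_unigrams = {
    "sports": {"match": 0.5, "team": 0.3, "vote": 0.1, "budget": 0.1},
    "politics": {"match": 0.1, "team": 0.1, "vote": 0.5, "budget": 0.3},
}
cache = ["the", "team", "won", "the", "match", "match"]
best_topic = spot_topic(cache, topic_unigrams)
```

With this toy cache, `best_topic` is `"sports"`, since the cache is dominated by "match" and "team". In a combined model of the kind the abstract describes, a score like this would be one component mixed with topic-dependent n-gram probabilities; the mixing weights and the way a distance is turned into a probability are not specified in the abstract and are left out here.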