Modeling Noun-Phrases Dynamics in Specialized Text Collections

Document record

Date

2010

Relations

This document is related to:
info:eu-repo/semantics/altIdentifier/doi/10.1080/09296174.2010.485447

Collection

Open archives



Cite this document

Nicolas Turenne, "Modeling Noun-Phrases Dynamics in Specialized Text Collections", HAL-SHS: sociologie, ID: 10.1080/09296174.2010.485447


Abstract

Biology has entered a new era, driven by new information-processing frameworks and high-throughput experiments. This has led to a high publication rate and to the emergence of large, accessible English-language databases, which make it possible to build text collections for any specialized domain. Processing such text data benefits from a systematic analysis of language properties, and in particular from a description of how those properties are distributed. Because scientific publications are time-stamped, this article analyses the distribution profiles of noun-phrases (i.e. "content-words") over time. This time-dependency analysis, which takes the sequential occurrence of features into account, reveals specific behaviour: the distributions of single content-words appear to be linearly shaped, whereas associations of content-words are distributed differently over time, following a mixed beta distribution.
