Corpus analysis of coreference chains: annotation and good-enough representations (in a French corpus)
Original French title: Analyse en corpus de chaînes de coréférence : annotation et représentations good-enough (dans un corpus en français)

Document record

Date

17 February 2021

Collection

Open archives




Cite this document

Marine Delaborde, « Analyse en corpus de chaînes de coréférence : annotation et représentations good-enough (dans un corpus en français) », HAL-SHS : linguistique, ID : 10670/1.7rl5ob



Abstract

A coreference chain is the set of linguistic expressions that refer to the same entity. The coreference relation between the elements of a chain implies that the referent must be strictly the same for every expression in it. However, the referent of an expression is sometimes difficult to identify, so the coreference relation between several expressions cannot always be established as strict beyond doubt. For a reader, this lack of precision does not necessarily present difficulties. Nevertheless, the task of annotating a corpus for coreference consists in unequivocally identifying the referent of each expression. Non-strict coreference phenomena can therefore cause annotation difficulties. The members of the ANR Democrat project have produced a corpus annotated for coreference. The annotation targets exact coreference. However, the analysis of the annotation of the French pronoun « on » in this corpus reveals great variability in how these phenomena are annotated, from which we derive a classification. To avoid the annotation difficulties related to these phenomena, a more precise framework for the annotation of fuzzy coreference could be considered.
