A Gold Anaphora Annotation Layer on an Eye Movement Corpus

Document record

Date

7 May 2018

Collection

Open archives

Licence

info:eu-repo/semantics/OpenAccess



Related subjects (English)

Pronouns

Cite this document

Olga Seminck et al., « A Gold Anaphora Annotation Layer on an Eye Movement Corpus », HAL-SHS : linguistique, ID : 10670/1.j0jg53



Abstract (English)

Anaphora resolution is a complex process in which multiple linguistic factors play a role, as a large psycholinguistic literature attests. This literature is based on experiments with hand-constructed items, which have the advantage of filtering out influences outside the scope of the study but, as a downside, make the experimental data artificial. Our goal is to provide a first resource that makes it possible to study human anaphora resolution on natural data. We annotated anaphoric pronouns in the Dundee Corpus: a corpus of ∼50k words from newspaper articles, read by participants whose eye movements were recorded. We identified all anaphoric pronouns, as opposed to non-referential, cataphoric and deictic uses, and identified the closest antecedent for each of them. Inter-annotator agreement was high both for identifying anaphoricity and for identifying the antecedents of the pronouns. We used our resource to model the reading times of pronouns in order to study several factors that influence anaphora resolution simultaneously. Although the influence of the anaphoric relation on the reading time of the pronoun is subtle, psycholinguistic findings from studies using experimental items were confirmed. In this way, our resource provides a new means to study anaphora.
