Automatic Assessment of English CEFR Levels Using BERT Embeddings

Document record

Date

20 October 2022

Collection

OpenEdition Books

Organisation

OpenEdition

Licences

https://creativecommons.org/licenses/by-nc-nd/4.0/, info:eu-repo/semantics/openAccess

Cite this document

Veronica Juliana Schmalz et al., « Automatic Assessment of English CEFR Levels Using BERT Embeddings », Accademia University Press, ID : 10.4000/books.aaccademia.10828


Abstract

The automatic assessment of language learners’ competences has become an increasingly promising task thanks to recent developments in NLP and deep learning. In this paper, we propose the use of neural models for classifying English written exams into one of the CEFR competence levels. We employ pre-trained BERT models, which provide efficient and rapid language processing thanks to their attention mechanisms and their capacity to capture long-range sequence features. In particular, we investigate augmenting the original learner’s text with corrections provided by an automatic tool or by human evaluators. We consider different architectures in which the texts and corrections are combined either at an early stage, via concatenation before the BERT network, or as a late fusion of the BERT embeddings. The proposed approach is evaluated on two open-source datasets: EFCAMDAT and CLC-FCE. The experimental results show that the proposed approach can predict the learner’s competence level with remarkably high accuracy, in particular when large labelled corpora are available. In addition, we observed that augmenting the input text with corrections yields further improvement in the automatic language assessment task.
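As a rough illustration of the two fusion strategies mentioned in the abstract, the sketch below contrasts early fusion (learner text and correction concatenated into a single BERT input) with late fusion (the two texts encoded separately and their pooled embeddings concatenated before a linear CEFR classifier). The model checkpoint, the six-way A1–C2 label set, and all variable names are illustrative assumptions, not the authors’ exact configuration.

# Hypothetical sketch: early vs. late fusion of a learner text and its correction
# for CEFR-level classification. Checkpoint, class count, and fusion details are
# assumptions for illustration, not the paper's exact setup.
import torch
import torch.nn as nn
from transformers import BertTokenizer, BertModel

NUM_CEFR_LEVELS = 6  # A1, A2, B1, B2, C1, C2 (assumed label set)

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")


class EarlyFusionClassifier(nn.Module):
    """Concatenate learner text and correction before the BERT encoder."""

    def __init__(self):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-cased")
        self.classifier = nn.Linear(self.bert.config.hidden_size, NUM_CEFR_LEVELS)

    def forward(self, texts, corrections):
        # The two inputs share one sequence, separated by [SEP].
        enc = tokenizer(texts, corrections, return_tensors="pt",
                        truncation=True, padding=True)
        pooled = self.bert(**enc).pooler_output          # (batch, hidden)
        return self.classifier(pooled)                   # (batch, 6) logits


class LateFusionClassifier(nn.Module):
    """Encode text and correction separately, then fuse the pooled embeddings."""

    def __init__(self):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-cased")
        hidden = self.bert.config.hidden_size
        self.classifier = nn.Linear(2 * hidden, NUM_CEFR_LEVELS)

    def forward(self, texts, corrections):
        enc_t = tokenizer(texts, return_tensors="pt", truncation=True, padding=True)
        enc_c = tokenizer(corrections, return_tensors="pt", truncation=True, padding=True)
        emb_t = self.bert(**enc_t).pooler_output
        emb_c = self.bert(**enc_c).pooler_output
        fused = torch.cat([emb_t, emb_c], dim=-1)        # late fusion by concatenation
        return self.classifier(fused)


if __name__ == "__main__":
    texts = ["I goes to school every day."]
    corrections = ["I go to school every day."]
    logits = EarlyFusionClassifier()(texts, corrections)
    print(logits.shape)  # torch.Size([1, 6])

In practice the classifier head would be trained on labelled exams such as those in EFCAMDAT or CLC-FCE; the sketch only shows where the text and its correction enter the network in each fusion scheme.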
