A survey on machine learning methods for churn prediction

Fiche du document

Date

2022

Type de document
Périmètre
Langue
Identifiants
Relations

Ce document est lié à :
info:eu-repo/semantics/altIdentifier/doi/10.1007/s41060-022-00312-5

Collection

Archives ouvertes

Licence

info:eu-repo/semantics/OpenAccess




Citer ce document

Louis Geiler et al., « A survey on machine learning methods for churn prediction », HAL-SHS : droit et gestion, ID : 10.1007/s41060-022-00312-5


Métriques


Partage / Export

Résumé En

The diversity and specificities of today's businesses have leveraged a wide range of prediction techniques. In particular, churn prediction is a major economic concern for many companies. The purpose of this study is to draw general guidelines from a benchmark of supervised machine learning techniques in association with widely used data sampling approaches on publicly available datasets in the context of churn prediction. Choosing a priori the most appropriate sampling method as well as the most suitable classification model is not trivial, as it strongly depends on the data intrinsic characteristics. In this paper we study the behavior of eleven supervised and semi-supervised learning methods and seven sampling approaches on sixteen diverse and publicly available churn-like datasets. Our evaluations, reported in terms of the Area Under the Curve (AUC) metric, explore the influence of sampling approaches and data characteristics on the performance of the studied learning methods. Besides, we propose Nemenyi test and Correspondence Analysis as means of comparison and visualization of the association between classification algorithms, sampling methods and datasets. Most importantly, our experiments lead to a practical recommendation for a prediction pipeline based on an ensemble approach. Our proposal can be successfully applied to a wide range of churn-like datasets.

document thumbnail

Par les mêmes auteurs

Sur les mêmes sujets

Exporter en