Peer-reviewed article
Abstract: Imbalanced learning jeopardizes the accuracy of traditional classification models, particularly on the minority class, which is often the class of interest. This paper addresses imbalanced learning in credit card fraud detection by introducing a novel approach that models fraudulent behavior as a time-dependent process. The main contribution is the design and assessment of an oversampling strategy, called 'Adversary-based Oversampling' (ADVO), which relies on modeling the temporal relationship among frauds. The strategy is implemented through two learning approaches. The first is an innovative regression-based oversampling model that predicts subsequent fraudulent activities from the features of previous frauds. The second is an adaptation of the state-of-the-art TimeGAN oversampling algorithm to credit card fraud detection, which treats the sequence of frauds from the same card as a time series and generates artificial fraud time series from it. Experiments were conducted on real credit card transaction data from our industrial partner, Worldline S.A., and on a synthetic dataset generated by a transaction simulator for reproducibility purposes. Our findings show that an oversampling approach incorporating time-dependent modeling of frauds provides competitive results, measured against common fraud detection metrics, compared to traditional oversampling algorithms.
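To make the regression-based idea concrete, the sketch below pairs each fraud with the next fraud on the same card, fits a regressor from the earlier fraud's features to the later one's, and applies it to known frauds to produce synthetic minority samples. It is only an illustration under stated assumptions, not the ADVO implementation: the abstract does not specify the regressor, so ordinary linear least squares is used as a stand-in, and the function names, the two toy features, and the toy data are all hypothetical.

```python
import numpy as np

def consecutive_fraud_pairs(card_ids, times, features):
    """Pair each fraud with the next fraud observed on the same card."""
    order = np.lexsort((times, card_ids))  # sort by card, then by time
    cards, feats = card_ids[order], features[order]
    same_card = cards[:-1] == cards[1:]    # neighbours that share a card
    return feats[:-1][same_card], feats[1:][same_card]

def fit_next_fraud_regressor(prev, nxt):
    """Least-squares linear map (with intercept) from a fraud to its successor.

    Assumption: a linear model stands in for whatever regressor is actually used.
    """
    design = np.hstack([prev, np.ones((len(prev), 1))])
    weights, *_ = np.linalg.lstsq(design, nxt, rcond=None)
    return weights

def generate_synthetic_frauds(weights, seeds):
    """Predict a plausible 'next fraud' for each seed fraud."""
    design = np.hstack([seeds, np.ones((len(seeds), 1))])
    return design @ weights

# Hypothetical toy data: 3 cards, 4 frauds each, 2 features per fraud.
# Frauds follow an exact linear recurrence so the fit can be checked.
A = np.array([[0.5, 0.0], [0.1, 0.9]])
b = np.array([10.0, 1.0])
card_list, time_list, feat_list = [], [], []
for card, start in enumerate([[20.0, 3.0], [80.0, 10.0], [150.0, 7.0]]):
    x = np.array(start)
    for t in range(4):
        card_list.append(card)
        time_list.append(t)
        feat_list.append(x)
        x = x @ A + b

card_ids = np.array(card_list)
times = np.array(time_list)
features = np.vstack(feat_list)

prev, nxt = consecutive_fraud_pairs(card_ids, times, features)
weights = fit_next_fraud_regressor(prev, nxt)
synthetic = generate_synthetic_frauds(weights, features)

# Oversampled minority class: original frauds plus generated ones.
oversampled_frauds = np.vstack([features, synthetic])
```

In this sketch the synthetic rows would be appended to the training set with the fraud label before fitting a classifier; how many samples to generate and which frauds to use as seeds are choices the abstract leaves open.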