Zainab Rubaidi PhD thesis defense

Title: Intelligent Data Augmentation Techniques Towards Multidomains Fraud Detection through Data Mining.
Supervisors: Dr. Mohamed Ben Aouicha & Dr. Boulbaba Ben Ammar
Defense Date: 28 December 2024
Fraud, meaning deceptive or dishonest behavior typically carried out for financial gain, poses a significant threat across industries, leading to substantial economic losses and an erosion of trust. Effective fraud detection mechanisms are therefore paramount. This thesis explores techniques for improving fraud detection on imbalanced binary classification datasets, a task that is crucial for preventing financial losses and maintaining consumer trust. The focus lies on developing and evaluating intelligent data augmentation methods within advanced data mining frameworks.
Central to this research is the Regressive Hybrid Data Oversampling (RHDO) approach, which integrates several oversampling techniques: SMOTE, ADASYN, Borderline-SMOTE, and SVM-SMOTE. These techniques are combined with machine learning classifiers (Logistic Regression, Naïve Bayes, Random Forest, and XGBoost) and evaluated on multiple datasets, including Vehicle Insurance Fraud, Predictive Maintenance, and synthetic data. The experimental results indicate that RHDO achieves promising performance across all datasets. On the Vehicle Insurance Fraud dataset, RHDO achieves an F1-score of 0.517722 and an AUC of 0.516764. On the Predictive Maintenance dataset, it achieves an F1-score of 0.562996 and an AUC of 0.545739. On the synthetic dataset, it achieves an F1-score of 0.889132 and an AUC of 0.884716. These outcomes position RHDO as a potent solution for mitigating class imbalance in binary classification tasks.
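For readers unfamiliar with the oversamplers named above, the following is a minimal sketch of the interpolation idea that SMOTE-style methods share: a synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. It is not the RHDO method of the thesis; the function name `smote_sketch`, its parameters, and the toy data are assumptions made for illustration only.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by SMOTE-style interpolation.

    minority: list of equal-length numeric tuples (minority-class samples).
    This is an illustrative sketch, not the thesis's RHDO procedure.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (excluding itself).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Interpolate at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy usage: four minority points on the corners of the unit square.
minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(minority_points, n_new=6, k=2)
```

Because each synthetic point lies between two existing minority points, it stays inside the region the minority class already occupies. Variants such as Borderline-SMOTE and ADASYN differ mainly in which base points they favour.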
Moreover, comparative analyses with other studies underscore the efficacy of combining data oversampling techniques with machine learning algorithms to improve fraud detection accuracy across various domains. Notably, on the STEG dataset, Random Oversampling coupled with the Random Forest classifier achieves an accuracy of 99.3%. This comparative assessment highlights the efficiency of the proposed methodology in improving fraud detection accuracy and its potential for broader application.
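Random Oversampling, mentioned above, is the simplest of these techniques: minority-class rows are duplicated at random until every class is as large as the majority class. The sketch below illustrates that idea with the standard library only; the function name `random_oversample` and the toy data are hypothetical and do not reproduce the experimental setup of the thesis.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows until all classes match the largest one.

    X: list of feature rows, y: list of class labels (same length).
    Illustrative sketch only; not the pipeline used in the thesis.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        # Rows belonging to this class, from which duplicates are drawn.
        pool = [row for row, lab in zip(X, y) if lab == label]
        for _ in range(target - count):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


# Toy usage: 8 legitimate rows (label 0) and 2 fraudulent rows (label 1).
X_toy = [[i] for i in range(10)]
y_toy = [0] * 8 + [1] * 2
X_bal, y_bal = random_oversample(X_toy, y_toy)
```

In practice, oversampling of this kind is usually applied to the training split only, so that duplicated rows do not leak into the evaluation data and inflate the reported accuracy.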
Mohamed Ben Aouicha
Professor

My research interests concern information retrieval, semantic technologies, social media analytics, knowledge representation, Big Data and graph embedding.

Boulbaba Ben Ammar
Assistant Professor

Boulbaba Ben Ammar is an Assistant Professor at the Computer Science Department, University of Sfax, Tunisia.