Aaditya Bansal

Credit Card Fraud Detection

Credit card fraud is a significant issue that costs the financial industry billions of dollars every year and causes a lot of inconvenience for users. Detecting fraudulent transactions is essential to protect banks and their customers from financial losses and to provide a better overall experience.


In this project, I aimed to build a model that could detect fraudulent transactions with high accuracy. When creating the machine learning model for this task, I took special care that it detects most of the frauds without interrupting normal transactions, so users can carry out their day-to-day transactions with no inconvenience.


Here's how I completed the Credit Card Fraud Detection Project:

 

Data Collection and Cleaning


I used the Credit Card Transactions dataset available on Kaggle. The dataset contained transaction data from European credit cardholders. There were 284,807 transactions, out of which only 492 (about 0.17%) were fraudulent. The dataset was highly imbalanced, which means that the number of non-fraudulent transactions was much higher than the number of fraudulent ones.
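To make the imbalance concrete, here is a small arithmetic sketch using only the two counts quoted above. The "always predict legitimate" baseline is a hypothetical illustration, not a model from this project; it shows why accuracy alone says little on data like this.

```python
# Class counts quoted above for the Kaggle dataset.
total_transactions = 284_807
fraud_transactions = 492

fraud_rate = fraud_transactions / total_transactions

# Hypothetical baseline: label every transaction as legitimate.
# It is right on every non-fraudulent row and wrong on every fraud.
baseline_accuracy = (total_transactions - fraud_transactions) / total_transactions
baseline_recall = 0 / fraud_transactions  # it catches no fraud at all

print(f"fraud rate:        {fraud_rate:.4%}")
print(f"baseline accuracy: {baseline_accuracy:.4%}")
print(f"baseline recall:   {baseline_recall:.0%}")
```

The baseline scores above 99.8% accuracy while detecting nothing, which is why metrics such as precision and recall matter more for this problem.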


The next step was to clean the data. I removed duplicate records and handled missing values from the dataset.

 

Understanding the Data


In this step I got an understanding of the data I would be working with. I examined the feature statistics and the distributions of features like Amount and Time, and looked for patterns in transactions over time. I also looked at the class distribution in the dataset.

 

Data Preprocessing and Feature Engineering


I created a new feature by extracting the hour of the day from the Time feature. I then scaled the features with Min-Max Scaler, Standard Scaler and Robust Scaler separately, to bring all the features to a common scale, and compared the model performance obtained with each scaler. I also removed the unnecessary features.
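The sketch below illustrates the ideas in this step with the Python standard library only. It assumes that Time holds the seconds elapsed since the first transaction in the dataset (as described on the Kaggle page), and the function names and sample amounts are made up for illustration. The project itself most likely used library scalers, so treat these as hand-written equivalents of the three formulas rather than the actual code.

```python
import statistics


def hour_of_day(seconds_elapsed):
    """Map seconds since the first transaction to an hour in 0..23."""
    return int(seconds_elapsed // 3600) % 24


def min_max_scale(values):
    """Rescale to [0, 1]. Assumes the feature is not constant."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standard_scale(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def robust_scale(values):
    """Subtract the median and divide by the interquartile range."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


# Made-up amounts with one large outlier, to show how the scalers differ.
amounts = [1.0, 2.0, 3.0, 4.0, 1000.0]

print(hour_of_day(90_000))      # 25 hours after the start of the data
print(min_max_scale(amounts))   # the outlier squeezes the rest towards 0
print(standard_scale(amounts))
print(robust_scale(amounts))    # typical values stay spread out
```

Because the median and interquartile range ignore extreme values, the robust version is less affected by very large amounts, which is one reason to compare scalers on a feature like Amount.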

 

Model Building and Evaluation


In this step I prepared the data for model training by:

  • Separating the input features and the target variable.

  • Splitting the data into train and test sets.

  • Handling the imbalanced data using SMOTE.
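To show what SMOTE does conceptually, here is a simplified, hand-written sketch of its core idea: each synthetic row is placed at a random point on the line between a minority-class row and one of its nearest minority-class neighbours. The function name `smote_sketch` and the toy rows are hypothetical; in practice a library implementation would be used. A common precaution, which is an assumption here rather than something stated in this post, is to oversample only the training split so the test set keeps its real class distribution.

```python
import math
import random


def smote_sketch(minority_rows, n_new, k=5, seed=0):
    """Create n_new synthetic rows from a list of minority-class rows.

    Simplified illustration of the SMOTE idea, not a full implementation.
    Requires at least two minority rows.
    """
    rng = random.Random(seed)
    k = min(k, len(minority_rows) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_rows)
        # The k nearest minority rows to the base row, excluding itself.
        neighbours = sorted(
            (row for row in minority_rows if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            [b + gap * (n - b) for b, n in zip(base, neighbour)]
        )
    return synthetic


# Toy fraud rows (two made-up features) taken from the training split only.
fraud_train_rows = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic_rows = smote_sketch(fraud_train_rows, n_new=6, k=2)

for row in synthetic_rows:
    print(row)
```

Every synthetic row lies between two real fraud rows, so the minority class grows without simply duplicating existing records.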

Next, I defined a function to evaluate model performance using metrics such as precision, recall, F1 score and the confusion matrix.
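An evaluation function of this kind could look roughly like the sketch below. It is a minimal standard-library version, assuming labels where 1 means fraud and 0 means legitimate; the function name `evaluate` and the sample labels are illustrative and not taken from the project code.

```python
def evaluate(y_true, y_pred):
    """Return the confusion matrix, precision, recall and F1 for class 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # frauds caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # frauds missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, untouched

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        # Rows are actual classes, columns are predicted classes.
        "confusion_matrix": [[tn, fp], [fn, tp]],
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels: 3 frauds among 8 transactions.
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 1, 1, 0, 0]

report = evaluate(y_true, y_pred)
print(report)
```

Recall reflects how many frauds are caught, while precision reflects how rarely a normal transaction is flagged, which maps directly onto the two goals described at the start of this post.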


I used cross-validation to build and evaluate multiple machine learning models, including Isolation Forest, Logistic Regression, K Neighbours Classifier and Decision Tree models.


After that, I took a closer look at each model and fine-tuned its performance by testing different hyperparameters.


After building the models, I evaluated their generalization performance using metrics like precision, recall, F1 score and the confusion matrix.


I then performed model selection by comparing the models' performance.


Two of the models performed best:

  • K Neighbours classifier, although it takes a little longer to make predictions

  • Logistic Regression

 

Export the Trained Model for Deployment


I trained the final selected model on the complete dataset for better performance on future predictions.


I then saved the model objects as byte (Pickle) files so they can be used to make predictions in the future.
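Saving and reloading an object with Pickle can be sketched as follows. The dictionary below is a hypothetical stand-in for a trained model, and the file name `fraud_model.pkl` is made up; a temporary directory is used so the example leaves nothing behind.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-in for a trained model object.
stand_in_model = {
    "name": "logistic_regression",
    "coefficients": [0.4, -1.2],
    "intercept": 0.1,
}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "fraud_model.pkl"  # illustrative file name

    # Serialize the object to bytes on disk.
    with open(model_path, "wb") as f:
        pickle.dump(stand_in_model, f)

    # Later (for example in a deployment script), load it back.
    with open(model_path, "rb") as f:
        restored_model = pickle.load(f)

print(restored_model == stand_in_model)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code. Any fitted scaler also needs to be saved alongside the model so that new transactions are transformed the same way as the training data.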

 

Conclusion


Building a credit card fraud detection model is a challenging task due to the highly imbalanced nature of the dataset. However, with careful data preprocessing and model building, I achieved high accuracy and precision in detecting fraudulent transactions.


In this project, I used various machine learning algorithms to build the model and evaluated their performance using different metrics. Hyperparameter tuning was used to optimize the models.


I was able to develop a successful fraud detection machine learning project using publicly available datasets. By following these steps, I was able to create a robust fraud detection system that can help prevent financial losses and protect customers from fraud.


The two final models, K Neighbours and Logistic Regression, outperformed the other models.


Overall, this was a challenging and rewarding project that helped me develop my skills in data preprocessing and machine learning. I'm excited to continue building and improving my machine learning projects in the future!

 

Thank you for your time 😍🤗
