Customer Churn
Customer Churn Prediction
Project Overview
This project aims to predict customer churn using machine learning models based on customer demographics, account details, and service usage patterns. The goal is to help businesses identify high-risk customers and take proactive retention measures.
Data Overview
- The dataset includes customer demographics, account information, and service usage details.
- The target variable is "Churn Label", indicating whether a customer has churned (Yes/No).
Exploratory Data Analysis (EDA)
Key insights discovered during EDA:
- Customers with higher monthly charges are more likely to churn.
- Long-tenured customers have lower churn rates.
- Month-to-month contract holders have the highest churn rate compared to other contract types.
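A comparison like the contract-type finding above can be sketched with a pandas group-by. The column name `Contract` and the toy rows are assumptions for illustration; only "Churn Label" with Yes/No values comes from the dataset description.

```python
import pandas as pd


def churn_rate_by(df: pd.DataFrame, column: str, target: str = "Churn Label") -> pd.Series:
    """Return the share of churned customers per category, highest first."""
    churned = df[target].eq("Yes")  # boolean Series: True where the customer churned
    return churned.groupby(df[column]).mean().sort_values(ascending=False)


# Toy rows standing in for the real dataset (hypothetical values).
sample = pd.DataFrame(
    {
        "Contract": ["Month-to-month", "Month-to-month", "One year", "Two year", "Two year"],
        "Churn Label": ["Yes", "No", "No", "No", "No"],
    }
)
rates = churn_rate_by(sample, "Contract")
print(rates)
```

The same helper can be pointed at any categorical column, for example a tenure bucket, to check the other EDA observations.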
Data Preprocessing
- Converted the "Total Charges" column to a numerical format (float64).
- Handled missing values in the "Total Charges" column by replacing empty values with the column mean.
- Applied SMOTE (Synthetic Minority Over-sampling Technique) to address class imbalance.
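The preprocessing steps above could look roughly like the following sketch. It is not the project's actual code: `clean_total_charges` and `smote_like_oversample` are hypothetical helpers, and the second one is a simplified NumPy illustration of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), not the imbalanced-learn implementation. Oversampling is normally applied to the training split only, so that synthetic rows do not leak into evaluation.

```python
import numpy as np
import pandas as pd


def clean_total_charges(df: pd.DataFrame, column: str = "Total Charges") -> pd.DataFrame:
    """Convert the column to float64 and fill unparseable entries with the mean."""
    out = df.copy()
    # Blank strings such as " " cannot be parsed, so they become NaN.
    out[column] = pd.to_numeric(out[column], errors="coerce")
    out[column] = out[column].fillna(out[column].mean())
    return out


def smote_like_oversample(X, y, minority_label, k=5, seed=0):
    """Balance two classes by interpolating between minority-class neighbours."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    minority = X[y == minority_label]
    n_new = int((y != minority_label).sum() - len(minority))
    if n_new <= 0 or len(minority) < 2:
        return X, y  # already balanced, or too few samples to interpolate
    k = min(k, len(minority) - 1)
    # Pairwise distances inside the minority class, ignoring self-matches.
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)
    neighbours = np.argsort(dists, axis=1)[:, :k]
    base = rng.integers(0, len(minority), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # position along the segment, in [0, 1)
    synthetic = minority[base] + gap * (minority[picked] - minority[base])
    X_out = np.vstack([X, synthetic])
    y_out = np.concatenate([y, np.full(n_new, minority_label, dtype=y.dtype)])
    return X_out, y_out
```

In practice the `SMOTE` class from the imbalanced-learn package would replace the hand-written function; the sketch only shows what the technique does to the minority class.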
Feature Engineering
Created new features to improve model performance:
- Tenure Buckets: Categorized customers based on tenure duration.
- Monthly Charges & Total Charges Features: Derived additional insights from spending patterns.
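As an illustration of the two feature ideas above, the sketch below buckets tenure with `pd.cut` and derives an average charge per month of tenure. The bucket edges, the column names `Tenure Months` and `Monthly Charges`, and the derived feature itself are assumptions chosen for the example, not the features used in the project.

```python
import pandas as pd


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add a tenure bucket and an average-charge feature (illustrative only)."""
    out = df.copy()
    # Hypothetical bucket edges, in months of tenure.
    out["Tenure Bucket"] = pd.cut(
        out["Tenure Months"],
        bins=[0, 12, 24, 48, float("inf")],
        labels=["0-12", "13-24", "25-48", "49+"],
        include_lowest=True,
    )
    # Average spend per month; brand-new customers (tenure 0) fall back to
    # their current monthly charge to avoid dividing by zero.
    tenure = out["Tenure Months"].where(out["Tenure Months"] > 0)
    out["Avg Charge Per Month"] = (out["Total Charges"] / tenure).fillna(
        out["Monthly Charges"]
    )
    return out


customers = pd.DataFrame(
    {
        "Tenure Months": [0, 12, 13, 60],
        "Monthly Charges": [50.0, 50.0, 100.0, 50.0],
        "Total Charges": [0.0, 600.0, 1300.0, 3000.0],
    }
)
featured = add_features(customers)
print(featured[["Tenure Months", "Tenure Bucket", "Avg Charge Per Month"]])
```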
Model Development & Evaluation
Tested multiple machine learning models, including:
- Logistic Regression
- Random Forest Classifier
- Gradient Boosting Models (XGBoost, LightGBM)
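A comparison of these model families might be set up as in the sketch below, which scores each candidate by cross-validated F1. Two assumptions are worth flagging: the data is synthetic (`make_classification`), and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost and LightGBM so that the example depends on a single library. The hyperparameters are library defaults, not the project's settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score


def compare_models(X, y, cv=5, seed=42):
    """Return the mean cross-validated F1 score for each candidate model."""
    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        # Stand-in for XGBoost / LightGBM (assumption made for this sketch).
        "Gradient Boosting": GradientBoostingClassifier(random_state=seed),
    }
    return {
        name: float(cross_val_score(model, X, y, cv=cv, scoring="f1").mean())
        for name, model in models.items()
    }


# Synthetic stand-in data; the real features would come from the churn dataset.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
scores = compare_models(X, y, cv=3)
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: F1 = {score:.3f}")
```

Scoring on F1 rather than accuracy keeps the comparison sensitive to the minority (churned) class.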
Why Random Forest?
- Achieved the highest F1-Score among the models tested (0.832) while maintaining robustness.
- Captures non-linear relationships between features, and averaging many decision trees reduces overfitting compared with a single tree.
- Performs well even with limited feature engineering.
Model Performance (Random Forest)
| Metric | Score |
|---|---|
| Precision | 0.857 |
| Recall | 0.808 |
| F1-Score | 0.832 |
| Accuracy | 0.823 |
Precision (0.857) and recall (0.808) are close to each other, which indicates that the model keeps both false positives and false negatives in check rather than trading one for the other.
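As a quick consistency check on the reported metrics, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the table's precision and recall; the helper names and the confusion-matrix counts in the second call are hypothetical and only show how the three metrics relate.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Derive all three metrics from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_from(precision, recall)


# Values taken from the Random Forest results table.
print(round(f1_from(0.857, 0.808), 3))  # 0.832

# Hypothetical counts, for illustration only.
print(precision_recall_f1(tp=8, fp=2, fn=2))
```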
Future Improvements
- Deploy the model as an interactive web app for real-time predictions.
- Improve interpretability using SHAP (SHapley Additive Explanations) to explain feature importance.
- Test deep learning models (e.g., Neural Networks) for potential improvements.
GitHub Repository
This post is licensed under CC BY 4.0 by the author.