This project aims to predict customer churn using machine learning algorithms. The goal is to identify customers who are likely to stop using a service or product, allowing businesses to take proactive measures to retain them. The project includes the following key components:
- Data Preprocessing: The project starts by loading and preprocessing a dataset containing customer information and behavior. This involves handling missing values, encoding categorical variables, and scaling numerical features.
- Model Evaluation: The project evaluates the performance of different machine learning algorithms, including Decision Trees, Random Forests, Support Vector Machines, and more. The models are trained and tested using a variety of metrics, including accuracy, precision, recall, and F1-score.
- Model Selection: The project selects the best-performing model based on the evaluation metrics and uses it to make predictions on new, unseen data.
- Python 3.11 or higher
- Required libraries:
pandas
,numpy
,scikit-learn
,matplotlib
,seaborn
,PyYAML
- Clone the repository:
git clone https://github.com/your-username/Churn-Prediction.git
- Install required libraries:
pip install -r requirements.txt
- Run the main script:
python main.py
- Use command-line arguments to customize the project:
-c
or--classifier
: specify the classifier to use (default:AdaBoost
)-t
or--test-size
: specify the test size for train-test split (default:0.2
)-p
or--preprocessing-method
: specify the preprocessing method to use (default:standardization
)
python main.py
: run the project with default settingspython main.py -c KNN -t 0.5 -p robust-scaling
: run the project with KNN classifier, 50% test size, and robust scaling preprocessing
main.py
: main entry point for the projectconfig.py
: configuration file for the projectdataset.py
: data loading and preprocessing moduleclassification.py
: classification moduleplot_confusion_matrix.py
: confusion matrix plotting modulerequirements.txt
: required libraries for the project