Exploratory Data Analysis (EDA) using Jupyter -- 3

Job Description:

The key is to identify customers at risk of churn early, before they get too far down the path, and to take
preemptive action to retain them and improve the customer relationship.

This is where Advanced Analytics (Machine Learning and Deep Learning modeling) comes into play.

Data Science and AI approaches are used to predict which customers will stay loyal and which will lapse.

Your ML model(s) will help the company's executives understand what makes a good or bad customer
so they can take action.

Teaching machines to predict customer behavior, and to communicate which customer attributes predict
specific behaviors, helps management build an organizational playbook for acquiring and keeping
happy customers. For example, the client success organization can reach out before there is a problem, the
marketing department can target new customers who are less likely to churn, and the sales organization
can bring these better customers on board.

This problem is a typical classification task. You must build Machine Learning models to predict whether
a customer will churn or not.

You are asked to perform two (2) stages of analysis, based on different distributions of the data:

1- Fit your models on the original (given) dataset

2- Fit your models on the modified dataset, after applying the SMOTE technique.
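
As a rough illustration of what stage 2 changes, SMOTE creates synthetic minority-class rows by interpolating between an existing minority sample and one of its nearest minority neighbours. The sketch below is a simplified, dependency-free version of that idea, not the reference implementation; the function name `smote_sketch` and the toy `churners` data are hypothetical. In a notebook you would more likely use the `SMOTE` class from the imbalanced-learn package and call its `fit_resample` method.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    minority: list of numeric feature vectors (lists of floats), all from
    the minority class. This is a simplified illustration of SMOTE.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy data: three churners described by two numeric features.
churners = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4]]
new_points = smote_sketch(churners, n_new=5, k=2)
```

Every synthetic point lies on a line segment between two real churners, so the minority class grows without exact duplicates. Note that this only works on numeric features, which is one reason categorical encoding comes before SMOTE.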

The Machine Learning models to be used are:

• Naïve Bayes
• Logistic Regression
• Random Forests
• XGBoost

The performance metrics for the chosen models should be the following, with strong emphasis on recall:

• Accuracy
• Precision
• Recall
• F1-Score
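
To make the emphasis on recall concrete, the four metrics can be computed directly from the confusion-matrix counts. The sketch below assumes churn is encoded as the positive class `1`; the helper name `classification_metrics` and the toy labels are hypothetical. scikit-learn offers equivalent functions in `sklearn.metrics` (`accuracy_score`, `precision_score`, `recall_score`, `f1_score`).

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced example: 3 churners among 10 customers.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy case accuracy is 0.7 even though only one of the three churners is caught (recall of one third), which is why accuracy alone can be misleading on imbalanced churn data.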

You should follow the standard ML workflow process while building your models:

• Exploratory Data Analysis (already done!)
• Data Visualization (already done!)
• Data Preprocessing: Data Imputation, Feature Selection & Scaling, Categorical Feature Encoding (partially done)
• Address Data Imbalance (apply the SMOTE technique)
• Split the dataset into training and test sets in an 80/20 ratio
• Algorithm Selection
• Model Building
• Model Evaluation
• Model Tuning (Hyperparameter Tuning)
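
The 80/20 split in the workflow above can be sketched without extra dependencies. Because churners are the minority, this version splits each class separately (a stratified split) so both parts keep roughly the same churn rate. The function name `stratified_split` and the toy labels are hypothetical. A common precaution, assumed here rather than stated in the brief, is to split first and apply SMOTE to the training portion only, so the test set contains no synthetic rows.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_ratio=0.2, seed=42):
    """Return (train_idx, test_idx), splitting each class by test_ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_ratio)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical labels: 20 churners (1) and 80 non-churners (0).
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(labels)
```

With scikit-learn, the same effect is usually obtained with `train_test_split(X, y, test_size=0.2, stratify=y)`.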

