March 14, 2016
The Santander Group is the largest bank in the Eurozone with a market capitalization of €65,792M [4Q’15].
 Quarterly Shareholder Report October - December 2015
Based on historical data, we want to identify inactive customers (99K)
Performs at least 3 transactions with the account in the last 90 days
Has an average volume in the last 6 months >= a pre-determined amount
TP – Predicted as inactive and are truly inactive
TN – Predicted as active and are truly active
FP – Predicted as inactive and are truly active
FN – Predicted as active and are truly inactive
Precision rate = TP / (TP + FP)
Recall rate = TP / (TP + FN)
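The definitions above can be sketched in a few lines of Python, treating "inactive" as the positive class as in the TP/TN/FP/FN list. The function names and the label strings "inactive" and "active" are assumptions made for illustration; they do not come from the original analysis.

```python
def confusion_counts(actual, predicted, positive="inactive"):
    """Count TP, TN, FP, FN with `positive` as the class of interest."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted inactive, truly inactive
        elif p == positive:
            fp += 1  # predicted inactive, truly active
        elif a == positive:
            fn += 1  # predicted active, truly inactive
        else:
            tn += 1  # predicted active, truly active
    return tp, tn, fp, fn


def precision_recall(tp, fp, fn):
    """Return (precision, recall); 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

For a made-up set of five customers with two true positives, one false positive and one false negative, both rates come out to 2/3.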
Two models, Decision Tree and Naïve Bayes, are selected based on class recall and class precision.
Decision tree is used for scoring Dec 2014 active customers (N0)
Cost (acquiring a new customer) > Cost (retaining a customer)
Can we use transactional data to predict the level of satisfaction of a customer?
1. Nivel Satisfaccion ~ nominal variable with values 0,1,2.
2. Predict_binary ~ binary variable with values 0 (for 0) and 1 (for 1 and 2).
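The mapping from the three-level variable to the binary target can be written as a small function. This is a minimal sketch: the function name is hypothetical, and the input check is an added assumption rather than something stated in the slides.

```python
def to_predict_binary(nivel_satisfaccion):
    """Collapse Nivel Satisfaccion (0, 1, 2) into Predict_binary (0, 1)."""
    if nivel_satisfaccion not in (0, 1, 2):
        raise ValueError(f"unexpected satisfaction level: {nivel_satisfaccion!r}")
    # Level 0 stays 0; levels 1 and 2 are merged into 1.
    return 0 if nivel_satisfaccion == 0 else 1
```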
Predict the customer satisfaction level
0's - 133 ~ 10.6%
1's - 419 ~ 33.4%
2's - 703 ~ 56%
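The percentages follow directly from the counts, which sum to 1,255 customers. The short check below recomputes them. It also shows the split after collapsing levels 1 and 2 into one class, as Predict_binary does. Variable names are illustrative.

```python
# Counts per satisfaction level, taken from the figures above.
counts = {0: 133, 1: 419, 2: 703}
total = sum(counts.values())

# Share of each level, in percent, rounded to one decimal place.
shares = {level: round(100 * n / total, 1) for level, n in counts.items()}

# Binary view: level 0 versus levels 1 and 2 combined.
binary_counts = {0: counts[0], 1: counts[1] + counts[2]}
binary_shares = {k: round(100 * n / total, 1) for k, n in binary_counts.items()}

print(total, shares, binary_shares)
```

In the binary view the 0 class makes up about 10.6% of customers against 89.4% for the 1 class, so the target is clearly imbalanced.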
A cost-sensitive random forest is selected based on class recall and class precision.
Cost-sensitive random forest
The cost-sensitive random forest is used for scoring satisfaction for 1Q15
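The slides do not say how the misclassification costs were set. One common starting point, shown here purely as an assumption, is to weight each class inversely to its frequency: weight = n_samples / (n_classes * class_count). This is the same heuristic scikit-learn applies when `class_weight="balanced"` is passed to `RandomForestClassifier`. The sketch below derives such weights from the satisfaction-level counts reported above.

```python
def balanced_class_weights(counts):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * n) for label, n in counts.items()}


# Counts per satisfaction level from the deck (0: 133, 1: 419, 2: 703).
level_counts = {0: 133, 1: 419, 2: 703}
weights = balanced_class_weights(level_counts)

for label, w in weights.items():
    print(label, round(w, 3))
```

Under this heuristic the rare level 0 receives a weight a little above 3, while the most frequent level 2 receives about 0.6, so errors on the rare class count for more during training. The costs actually used in the original model may differ.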