Internet Service Provider

Customer Churn Prediction

By:

Abhishek Sharma

Table of Contents

  1. Introduction
  2. Problem Statement

Introduction

Competition among Internet service providers is intense. An ISP that wants to increase its revenue needs more subscribers, but retaining existing customers is more important than acquiring new ones.

A provider therefore wants to know which customers are likely to cancel their service (churn). If it knows who is about to churn, it may be able to retain them with targeted promotions.

Problem Statement

 

To predict whether an existing customer will churn, based on their internet usage.

Data Description

Text

Methodology

I collected housing price data from makaan.com through web scraping.

The raw data contained many null values and irrelevant fields.

So, after cleaning the dataset, I converted it into this form.
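The source does not show the cleaning code or the column names. As a minimal sketch of this kind of cleaning, assuming each scraped listing is a dictionary, the following drops irrelevant fields and any record with a missing value. The field names `area`, `price`, and `ad_banner` are hypothetical.

```python
# Hypothetical scraped records; field names are assumptions, not from the source.
raw_listings = [
    {"area": "Area A", "price": 7500000, "ad_banner": "x"},
    {"area": "Area B", "price": None, "ad_banner": "y"},
    {"area": None, "price": 5200000, "ad_banner": "z"},
    {"area": "Area C", "price": 9100000, "ad_banner": None},
]

# Fields considered relevant for the analysis (an assumption).
KEEP = ("area", "price")


def clean(rows, keep=KEEP):
    """Keep only the relevant fields and drop rows where any of them is missing."""
    cleaned = []
    for row in rows:
        record = {key: row.get(key) for key in keep}
        if all(value is not None for value in record.values()):
            cleaned.append(record)
    return cleaned


listings = clean(raw_listings)
```

A null in an irrelevant field (such as `ad_banner` for Area C) does not cause the row to be dropped, because the irrelevant fields are discarded before the null check.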

Using the MapmyIndia Geocoding API, I fetched the latitude, longitude, PIN code, and district for each area in my dataset.

I plotted the fetched coordinates on a map of Delhi using folium, a geolocation and map visualization library.

Using the Foursquare API, I fetched a maximum of 100 venues (ATMs, restaurants, etc.) for every location, limited to venues that lie within a 1000 m radius of the latitude and longitude of the location's center.
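The radius and limit are presumably applied by the API itself, and the source does not show the request. Purely to illustrate what "within a 1000 m radius, at most 100 venues" means, here is a sketch that applies the same filter locally using the haversine great-circle distance. The coordinates and venue names are made up.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def nearby_venues(center, venues, radius_m=1000, limit=100):
    """Return at most `limit` venues whose distance from `center` is <= radius_m."""
    lat, lon = center
    inside = [
        v for v in venues
        if haversine_m(lat, lon, v["lat"], v["lon"]) <= radius_m
    ]
    return inside[:limit]


center = (28.6139, 77.2090)  # hypothetical area center
venues = [
    {"name": "ATM 1", "lat": 28.6184, "lon": 77.2090},   # roughly 500 m north
    {"name": "Cafe 2", "lat": 28.6339, "lon": 77.2090},  # roughly 2.2 km north
]
close_by = nearby_venues(center, venues)
```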

 

I used one-hot encoding to convert the categorical venue values into binary values, and then calculated their frequencies.
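The source does not include the encoding code. A minimal plain-Python sketch of the idea, assuming the input is a list of (area, venue category) pairs: each venue becomes a binary vector, and averaging those vectors per area gives the frequency of each category. The sample data is hypothetical.

```python
from collections import defaultdict

# Hypothetical sample: one (area, venue category) pair per fetched venue.
venues = [
    ("Area A", "ATM"),
    ("Area A", "Indian Restaurant"),
    ("Area A", "ATM"),
    ("Area B", "Park"),
    ("Area B", "ATM"),
]

# Fixed column order for the one-hot vectors.
categories = sorted({category for _, category in venues})


def one_hot(category):
    """Return a binary vector with a single 1 in the category's column."""
    return [1 if category == c else 0 for c in categories]


# Group the one-hot rows by area.
rows_by_area = defaultdict(list)
for area, category in venues:
    rows_by_area[area].append(one_hot(category))

# The mean of each column is the share of venues in that category.
frequencies = {
    area: {c: sum(col) / len(rows) for c, col in zip(categories, zip(*rows))}
    for area, rows in rows_by_area.items()
}
```

Because every venue contributes exactly one 1, the frequencies for an area sum to 1.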

 

Then I found the top 5 most common venues in each area.
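As a sketch of this ranking step, assuming each area has a mapping from venue category to frequency, the top 5 can be taken by sorting on frequency. The function name `top_venues` and the numbers are illustrative only.

```python
def top_venues(freqs, n=5):
    """Return the n categories with the highest frequency (ties broken by name)."""
    ranked = sorted(freqs.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:n]]


# Hypothetical venue frequencies for a single area.
area_freqs = {
    "ATM": 0.30,
    "Indian Restaurant": 0.25,
    "Market": 0.15,
    "Shop": 0.10,
    "Park": 0.08,
    "Gym": 0.07,
    "Hotel": 0.05,
}
top5 = top_venues(area_freqs)
```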

 

Next, I clustered this data using the K-Means clustering algorithm. The results obtained from the clustering are as follows:
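For reference, the K-Means step can be sketched in plain Python as Lloyd's algorithm: assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until the assignments stop changing. This is not the project's code, which most likely used a library. The toy data uses two clusters, whereas the results in this report use five.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm. Returns (centroids, labels) for equal-length vectors."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
    return centroids, labels


# Hypothetical venue-frequency vectors for six areas (two obvious groups).
areas = [
    [0.90, 0.10], [0.80, 0.20], [0.85, 0.15],
    [0.10, 0.90], [0.20, 0.80], [0.15, 0.85],
]
centroids, labels = kmeans(areas, k=2)
```

The cluster numbers themselves are arbitrary; only the grouping of areas is meaningful.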

 

Then I normalized the housing price data frame using Min-Max scaling to create a rating scale from 0 to 100 for housing prices.
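Min-Max scaling maps the lowest price to 0 and the highest to 100, with everything else placed proportionally in between: rating = (price - min) / (max - min) * 100. A minimal sketch, with hypothetical prices:

```python
def min_max_rating(prices, low=0.0, high=100.0):
    """Rescale prices linearly so that min(prices) -> low and max(prices) -> high."""
    p_min, p_max = min(prices), max(prices)
    if p_max == p_min:
        # All prices are equal; avoid dividing by zero.
        return [low for _ in prices]
    return [
        low + (p - p_min) / (p_max - p_min) * (high - low)
        for p in prices
    ]


# Hypothetical house prices.
prices = [2_500_000, 5_000_000, 7_500_000, 12_500_000]
ratings = min_max_rating(prices)
```

Note that Min-Max scaling is sensitive to outliers: a single very expensive listing compresses all other ratings toward 0.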

 

After this, I created bins and generated a histogram.

 

Results

 

The results of the clustering are as follows:

  • Cluster-0 (tomato) has a higher number of ATMs, Indian restaurants, markets, and shops. These are the locations where people go shopping or hang out.

  • Cluster-1 (mediumpurple) has a higher number of hotels, Asian restaurants, gyms, and neighborhoods, so these are residential areas.

  • Cluster-2 (mediumturquoise) has a higher number of gardens and parks. These areas are greener than the others.

  • Cluster-3 (aquamarine) has airports and some high-end facility providers. These are downtown and upscale areas.

  • Cluster-4 (burlywood) has a higher number of metro stations, pizza places, and other businesses. These areas are well connected by metro.

 

For economic rating, I assigned categories to the areas based on their price ratings.

Categories:

  • Rating (0-20) => Category I

  • Rating (21-40) => Category II

  • Rating (41-60) => Category III

  • Rating (61-80) => Category IV

  • Rating (81-100) => Category V
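The category table above can be expressed as a small lookup. This is a sketch, not the project's code. Because ratings are continuous, it assumes that a value falling between two listed ranges (for example 20.5) goes to the higher category.

```python
from bisect import bisect_left

# Inclusive upper edge of each category, taken from the table above.
UPPER_EDGES = [20, 40, 60, 80, 100]
LABELS = ["Category I", "Category II", "Category III", "Category IV", "Category V"]


def rating_to_category(rating):
    """Map a 0-100 price rating to its economic category."""
    if not 0 <= rating <= 100:
        raise ValueError("rating must be between 0 and 100")
    # bisect_left finds the first edge that is >= rating.
    return LABELS[bisect_left(UPPER_EDGES, rating)]


categories = [rating_to_category(r) for r in (0, 20, 21, 61, 100)]
```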

The higher the house prices in an area, the greater the chance that people with better economic conditions live there.

People living in Category V areas have the highest economic strength, and people living in Category I areas have the lowest.

 

Discussion

Every day, people move to big cities to start a business or find a job, and all of them need a place in Delhi. Access to platforms that provide this kind of information can help them achieve better outcomes.

Not only investors but also city managers can manage the city more systematically by using similar types of data analysis or platforms.

Conclusion

The maps generated through this project can be useful to a wide range of people. Many countries and companies generate maps of this kind to analyze the present conditions of the state in depth. Many companies are also democratizing data like this, which gives businesses useful insights. Geographic and economic insights are a boon for everyone, from large companies to individuals.

 
