Internet Service Provider
Customer Churn Prediction
By:
Abhishek Sharma
Table of Contents
- Introduction
- Problem Statement
Introduction
Competition among Internet service providers is intense. To increase revenue, an ISP needs more subscribers, but retaining existing customers matters more than acquiring new ones.
So a provider wants to know which customers are likely to cancel their service (churn). If it knows who is about to churn, it may be able to retain them with promotions.
Problem Statement
Predict whether an existing customer will churn, based on their internet usage.
Data Description
Methodology
Through web scraping, I collected housing price data from makaan.com.
The raw data contained many null values and irrelevant fields.
After cleaning the dataset, I converted it into this form.
Using the MapmyIndia Geocoding API, I fetched the latitude, longitude, PIN code and district for each area in my dataset.
I plotted the fetched coordinates on a map of Delhi using folium, a geolocation and map visualization library.
Using the Foursquare API, I fetched up to 100 venues (ATMs, restaurants, etc.) for every location, limited to venues within a 1000 m radius of the latitude and longitude of the location's center.
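To make the "within a 1000 m radius" condition concrete, here is a minimal sketch of a radius filter based on the haversine distance. It does not call the Foursquare API, which applies the radius on its own side; the function names, the coordinates and the sample venues are all hypothetical and only illustrate the geometry.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def venues_within(center, venues, radius_m=1000, limit=100):
    """Keep at most `limit` venues that lie within `radius_m` of `center`."""
    lat, lon = center
    nearby = [v for v in venues
              if haversine_m(lat, lon, v["lat"], v["lon"]) <= radius_m]
    return nearby[:limit]

# Hypothetical data: an approximate point in Delhi and two made-up venues.
center = (28.6139, 77.2090)
sample_venues = [
    {"name": "ATM", "lat": 28.6180, "lon": 77.2090},          # roughly 450 m north
    {"name": "Restaurant", "lat": 28.6300, "lon": 77.2090},   # roughly 1.8 km north
]
nearby = venues_within(center, sample_venues)
```

Only the first sample venue falls inside the 1000 m radius, so `nearby` contains a single entry.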
I used one-hot encoding to convert the categorical values into binary values, and then calculated their frequencies.
Then I found the top 5 most common venues in each area.
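The encoding and ranking steps can be sketched in plain Python as follows. This is an illustration under assumptions, not the project's actual code: the sample venues are made up, the helper names are hypothetical, and "frequency" is assumed to mean the share of each venue category among an area's venues. With pandas, `pandas.get_dummies` followed by a group-by mean would do the same job.

```python
from collections import defaultdict

# Hypothetical sample: one (area, venue_category) pair per fetched venue.
venues = [
    ("Area A", "ATM"), ("Area A", "Indian Restaurant"),
    ("Area A", "ATM"), ("Area A", "Market"),
    ("Area B", "Park"), ("Area B", "Garden"), ("Area B", "Park"),
    ("Area B", "Park"), ("Area B", "ATM"),
]

def one_hot(rows):
    """Return the category list and one binary vector per venue row."""
    categories = sorted({cat for _, cat in rows})
    index = {cat: i for i, cat in enumerate(categories)}
    encoded = []
    for area, cat in rows:
        vec = [0] * len(categories)
        vec[index[cat]] = 1
        encoded.append((area, vec))
    return categories, encoded

def area_frequencies(categories, encoded):
    """Average the one-hot vectors per area (assumed meaning of 'frequency')."""
    sums = defaultdict(lambda: [0] * len(categories))
    counts = defaultdict(int)
    for area, vec in encoded:
        counts[area] += 1
        sums[area] = [s + v for s, v in zip(sums[area], vec)]
    return {
        area: {cat: total / counts[area]
               for cat, total in zip(categories, sums[area])}
        for area in sums
    }

def top_venues(freqs, n=5):
    """Return up to n most frequent categories per area, skipping absent ones."""
    result = {}
    for area, by_cat in freqs.items():
        # Sort by descending frequency, then alphabetically to break ties.
        ranked = sorted(by_cat.items(), key=lambda kv: (-kv[1], kv[0]))
        result[area] = [cat for cat, f in ranked if f > 0][:n]
    return result

categories, encoded = one_hot(venues)
freqs = area_frequencies(categories, encoded)
top = top_venues(freqs, n=5)
```

An area with fewer than five distinct venue categories simply gets a shorter list.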
Next, I clustered this data using the K-Means clustering algorithm. The results obtained from clustering are as follows:
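For readers unfamiliar with K-Means, the sketch below shows the basic loop (Lloyd's algorithm) in plain Python. It is a teaching example under assumptions: the feature vectors are hypothetical, only two clusters are used to keep the data small, and a real project would more likely rely on a library implementation such as scikit-learn's `KMeans`.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Cluster points (lists of numbers) into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids

# Hypothetical features per area: [restaurant share %, park share %].
points = [
    [40, 5], [45, 5], [40, 10],   # restaurant-heavy areas
    [5, 40], [5, 45], [10, 40],   # park-heavy areas
]
labels, centroids = kmeans(points, k=2)
```

The first three areas end up in one cluster and the last three in the other; which cluster receives label 0 depends on the random initialization.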
Then I normalized the housing prices data frame using Min-Max scaling to create a rating scale from 0 to 100 for housing prices.
After this, I grouped the ratings into bins and generated a histogram.
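A minimal sketch of the Min-Max rating and the binning step, assuming five equal-width bins of 20 rating points each (the bin edges are not stated above, and the prices are made up). Min-Max scaling maps each price p to 100 * (p - min) / (max - min).

```python
def min_max_rating(prices, low=0.0, high=100.0):
    """Rescale prices linearly so the cheapest maps to low and the dearest to high."""
    lo, hi = min(prices), max(prices)
    if hi == lo:  # all prices equal: avoid division by zero
        return [low for _ in prices]
    return [low + (p - lo) * (high - low) / (hi - lo) for p in prices]

def histogram(ratings, bin_width=20, upper=100):
    """Count ratings per bin; the maximum rating goes into the last bin."""
    n_bins = upper // bin_width
    counts = [0] * n_bins
    for r in ratings:
        idx = min(int(r // bin_width), n_bins - 1)
        counts[idx] += 1
    return counts

# Hypothetical house prices, one per area.
prices = [2_500_000, 4_000_000, 7_500_000, 10_000_000, 12_500_000]
ratings = min_max_rating(prices)   # 0, 15, 50, 75, 100
counts = histogram(ratings)        # areas per 20-point bin
```

Because Min-Max scaling depends only on the minimum and maximum, a single extreme price compresses every other rating toward the low end of the scale.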
Results
The results of this clustering are as follows:
- Cluster-0 (tomato) has a higher number of ATMs, Indian restaurants, markets and shops. These are the locations where people go shopping or hang out.
- Cluster-1 (mediumpurple) has a higher number of hotels, Asian restaurants, gyms and neighborhoods. So these are residential areas.
- Cluster-2 (mediumturquoise) has a higher number of gardens and parks. These areas are greener than other areas.
- Cluster-3 (aquamarine) has airports and also some high-end facility providers. These are downtown and posh areas.
- Cluster-4 (burlywood) has a higher number of metro stations, pizza places and other businesses. These areas are well connected by metro.
For economic rating, I assigned categories to the areas based on their price ratings.
Categories:
- Rating (0-20) => Category I
- Rating (21-40) => Category II
- Rating (41-60) => Category III
- Rating (61-80) => Category IV
- Rating (81-100) => Category V
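The category table above can be expressed as a small lookup. This is a sketch under one assumption: ratings that fall between the listed integer ranges (for example 20.5) are assigned to the higher category, so each category covers ratings up to and including its upper bound. The function name is hypothetical.

```python
import bisect

# Upper bound of each rating range, in the same order as the categories.
UPPER_BOUNDS = [20, 40, 60, 80, 100]
CATEGORIES = ["Category I", "Category II", "Category III",
              "Category IV", "Category V"]

def price_category(rating):
    """Map a 0-100 price rating to its economic category."""
    if not 0 <= rating <= 100:
        raise ValueError("rating must be between 0 and 100")
    # bisect_left finds the first upper bound that is >= rating.
    return CATEGORIES[bisect.bisect_left(UPPER_BOUNDS, rating)]

examples = {r: price_category(r) for r in (0, 20, 21, 60, 75, 100)}
```

Here a rating of 20 still counts as Category I, while 21 is the first rating in Category II.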
The higher the house prices in an area, the higher the chance that people with better economic conditions live there.
People living in Category V areas have the highest economic strength, and people living in Category I areas have the lowest.
Discussion
Every day, people move to big cities to start a business or find a job, and all of them need a place to live in Delhi. Access to platforms that provide this kind of information can help them achieve better outcomes.
Not only investors but also city managers can benefit, since similar data analyses or platforms can help them manage the city more systematically.
Conclusion
The maps generated through this project can be useful to a wide range of people. Many countries and companies generate maps of this kind to analyze the present conditions of a region in depth. Many companies are also democratizing data like this, which gives businesses useful insights. Geographic and economic insights benefit everyone, from big companies to individuals.