GitHub Data Analysis and Recommender System
Project Mentor:
Asst. Prof. Anuj Mahajan
FCSE, SMVDU
Presented By:
Akshay Pratap (2011ECS01)
Rishabh Shukla (2011ECS13)
GitHub is a web-based version control and open-source collaboration platform where developers from around the world contribute to open-source projects (repositories).
This project analyzes a large volume of public GitHub data to uncover patterns in the popularity and spatial distribution of programming languages and users on GitHub.
It further employs a content-based filtering approach, combined with Apache Spark, to build a recommender system for GitHub users.
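The content-based idea can be illustrated with a minimal sketch. It runs in plain Python rather than on Apache Spark, and it assumes that each repository is described only by its primary language and that a user's profile is the count of languages across repositories they already interact with. The field names `name` and `language` and the helpers `language_profile`, `cosine`, and `recommend` are hypothetical and chosen for illustration; the slides do not specify which features or scoring function the project used.

```python
from collections import Counter
from math import sqrt


def language_profile(repos):
    """Count how often each language appears in a user's repositories."""
    profile = Counter()
    for repo in repos:
        profile[repo["language"]] += 1
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (dicts)."""
    dot = sum(a[k] * b.get(k, 0) for k in a)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user_repos, candidates, top_n=3):
    """Rank unseen candidate repositories by similarity to the user's profile."""
    profile = language_profile(user_repos)
    seen = {repo["name"] for repo in user_repos}
    scored = [
        (cosine(profile, Counter({c["language"]: 1})), c["name"])
        for c in candidates
        if c["name"] not in seen
    ]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top_n] if score > 0]
```

A user with two Ruby repositories and one Python repository would see unseen Ruby repositories ranked above Python ones, and repositories in languages they have never used would be left out.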
Proportion of Users from various Companies
What languages are being used in various companies?
Ruby Spatial Density - USA
User Demographics - Europe
The similarity measure returns a value bounded in [0, 1]: the more similar two users are, the higher the value, and the less similar they are, the lower the value.
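The slide does not name the similarity measure, so the following is only one example of a measure with the stated property. It assumes each user is represented as a set of items (for instance, the languages they use) and computes the Jaccard index, which is always in [0, 1]. The function name `jaccard_similarity` and the set representation are assumptions made for illustration.

```python
def jaccard_similarity(items_a, items_b):
    """Jaccard index of two item collections: |A ∩ B| / |A ∪ B|, in [0, 1]."""
    a, b = set(items_a), set(items_b)
    union = a | b
    if not union:
        # Two empty users share nothing measurable; treat as dissimilar.
        return 0.0
    return len(a & b) / len(union)
```

Identical sets score 1, disjoint sets score 0, and partial overlap falls in between, which matches the behaviour described above.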
It is imperative for programmers to keep up with the latest technologies in computer science. This analysis of GitHub data gives an overview of the technologies in use around the globe and of the spatial distribution of both these technologies and their users.
Furthermore, the GitHub recommendation system is an attempt to bring more contributions to the open-source world by providing personalized recommendations of GitHub repositories.
Thank you.