Knowledge Git

Know Group @

Yu-Chung Hsiao

GitHub's Explore

  1. Only 34 topics for 13 million repos
  2. Last updated a while ago

Problem of finding a repo

GitHub's Search

Currently in use

A lot of info not used yet

scipy / scipy

KnowGit

Comparison

NEW!


GitHub's Explore
recommended

Algorithm: data to clustering

Descriptions, Readme files

Tags as keywords 

Natural Language Processing

Sparse vectors
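
One common way to turn descriptions and README text into sparse vectors is TF-IDF weighting. The slides do not say which weighting was used, so TF-IDF via scikit-learn's `TfidfVectorizer` is an assumption here, and the repo names and texts below are made up for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical repos: name -> description plus README excerpt (made-up data).
repo_texts = {
    "alice/fastmath": "Fast numerical linear algebra routines for Python",
    "bob/webkit": "Lightweight web framework with routing and templates",
    "carol/linalg": "Linear algebra and numerical optimization in Python",
}
repo_names = list(repo_texts)

# Assumption: TF-IDF weighting with English stop words removed.
vectorizer = TfidfVectorizer(stop_words="english")

# X is a SciPy sparse matrix: one row per repo, one column per term.
X = vectorizer.fit_transform([repo_texts[name] for name in repo_names])

print(X.shape)
```

Most entries of `X` are zero because each repo uses only a small part of the overall vocabulary, which is why a sparse representation fits.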

Similarity matrix
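
A similarity matrix can be built from sparse vectors with cosine similarity. The slides do not name the similarity measure, so cosine similarity is an assumption, and the small term-count matrix below is made-up data.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical term-count vectors: three repos (rows) over four terms (columns).
X = csr_matrix(np.array([
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
]))

# S[i, j] is the cosine of the angle between repo i and repo j.
# The result is a dense, symmetric (3, 3) array with ones on the diagonal.
S = cosine_similarity(X)

print(np.round(S, 3))
```

Repos 0 and 1 share no terms, so their similarity is zero, while repos 0 and 2 share one term and land in between.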

Clustering: Affinity Propagation

scikit-learn
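
A minimal sketch of Affinity Propagation in scikit-learn, run directly on a similarity matrix. Using `affinity="precomputed"` and the default preference are assumptions about how the step might be configured, and the matrix is made-up data with two obvious groups.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

# Hypothetical symmetric similarity matrix for six repos:
# repos 0-2 are similar to each other, and so are repos 3-5.
S = np.array([
    [1.00, 0.90, 0.80, 0.10, 0.20, 0.10],
    [0.90, 1.00, 0.85, 0.15, 0.10, 0.20],
    [0.80, 0.85, 1.00, 0.10, 0.15, 0.10],
    [0.10, 0.15, 0.10, 1.00, 0.90, 0.80],
    [0.20, 0.10, 0.15, 0.90, 1.00, 0.85],
    [0.10, 0.20, 0.10, 0.80, 0.85, 1.00],
])

# affinity="precomputed" tells scikit-learn that S already holds similarities.
# The number of clusters is not given up front; the algorithm picks exemplars.
ap = AffinityPropagation(affinity="precomputed", random_state=0)
ap.fit(S)

labels = ap.labels_                       # cluster index for each repo
centers = ap.cluster_centers_indices_     # row indices of the exemplar repos

print(labels, centers)
```

Each cluster center is an actual repo (an exemplar), which is convenient when a cluster needs a representative name.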

Store in database

Clusters, centers, similarity matrices, repo names
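
The storage step could look roughly like the sketch below. The slides do not name a database or schema, so SQLite, the `clusters` table, its columns, and the `load_cluster` helper are all hypothetical, and the data is made up.

```python
import json
import sqlite3

# Hypothetical clustering output (made-up names and numbers).
clusters = {
    0: {"center": "alice/fastmath",
        "repos": ["alice/fastmath", "carol/linalg"],
        "similarity": [[1.0, 0.8], [0.8, 1.0]]},
    1: {"center": "bob/webkit",
        "repos": ["bob/webkit"],
        "similarity": [[1.0]]},
}

conn = sqlite3.connect(":memory:")  # in-memory database, for the sketch only
conn.execute(
    "CREATE TABLE clusters ("
    "cluster_id INTEGER PRIMARY KEY, center TEXT, "
    "repo_names TEXT, similarity TEXT)"
)

# Repo names and the per-cluster similarity matrix are stored as JSON text.
for cluster_id, info in clusters.items():
    conn.execute(
        "INSERT INTO clusters VALUES (?, ?, ?, ?)",
        (cluster_id, info["center"],
         json.dumps(info["repos"]), json.dumps(info["similarity"])),
    )
conn.commit()


def load_cluster(conn, cluster_id):
    """Return (center, repo_names, similarity_matrix) for one cluster."""
    row = conn.execute(
        "SELECT center, repo_names, similarity "
        "FROM clusters WHERE cluster_id = ?",
        (cluster_id,),
    ).fetchone()
    center, names, sim = row
    return center, json.loads(names), json.loads(sim)


print(load_cluster(conn, 0))
```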

Visualize

API

Readme & Description

Sparse vector

NLP

Pull out the closest cluster

Compute similarities to each center
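
The lookup step can be sketched as follows: compare the input repo's vector with each cluster center and keep the most similar one. Cosine similarity and the `closest_cluster` helper are assumptions for illustration, and the vectors are made up and assumed to be nonzero.

```python
import numpy as np


def closest_cluster(query_vec, center_vecs):
    """Return (index of the most similar center, similarities to all centers).

    Hypothetical helper: uses cosine similarity and assumes nonzero vectors.
    """
    q = query_vec / np.linalg.norm(query_vec)
    c = center_vecs / np.linalg.norm(center_vecs, axis=1, keepdims=True)
    sims = c @ q  # one cosine similarity per center
    return int(np.argmax(sims)), sims


# Two made-up cluster centers and one query vector in a three-term space.
centers = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
])
query = np.array([0.1, 0.9, 0.0])

best, sims = closest_cluster(query, centers)
print(best, sims)
```

Comparing against centers only keeps the lookup cheap: the cost grows with the number of clusters, not with the number of repos.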

Network graph

generate
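
A network graph for the front end could be generated from one cluster's similarity matrix along these lines. The node/link JSON layout, the `build_graph` helper, and the `threshold` cutoff are hypothetical choices, not taken from the slides, and the data is made up.

```python
import json


def build_graph(repo_names, similarity, threshold=0.5):
    """Build a node/link dictionary from one cluster's similarity matrix.

    threshold is a hypothetical cutoff: pairs below it get no edge.
    """
    nodes = [{"id": name} for name in repo_names]
    links = []
    n = len(repo_names)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle, so each pair appears once
            if similarity[i][j] >= threshold:
                links.append({
                    "source": repo_names[i],
                    "target": repo_names[j],
                    "weight": similarity[i][j],
                })
    return {"nodes": nodes, "links": links}


# Made-up cluster with three repos.
names = ["alice/fastmath", "carol/linalg", "dave/plots"]
sim = [
    [1.0, 0.8, 0.3],
    [0.8, 1.0, 0.6],
    [0.3, 0.6, 1.0],
]

graph = build_graph(names, sim)
payload = json.dumps(graph)  # JSON string the back end could return
print(payload)
```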

Input

Front-end

Back-end

Algorithm: clustering in action

Yu-Chung Hsiao

Fisherman's Wharf in SF

Predictive modeling for nonlinear time series
