UCI heart disease
Clustering is an unsupervised learning technique that groups similar data points together based on their characteristics.
Why we cluster data
1. Discover hidden patterns;
2. Simplify complex data;
3. Improve decision-making;
4. Preprocessing for other models.
Categorical
1. sex
2. cp
3. fbs
4. restecg
5. exang
6. slope
7. thal
Continuous
8. age
9. trestbps
10. chol
11. thalach
12. oldpeak
13. ca
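The categorical/continuous split matters because K-Means relies on distances between records. A minimal sketch of one common preparation step, assuming one-hot encoding for categorical columns and z-score scaling for continuous ones (the slides do not state which preprocessing was used). The records below are invented toy values and the helper names are hypothetical:

```python
from statistics import mean, pstdev

# Hypothetical toy records using a few of the column names listed above.
# The values are invented for illustration, not taken from the dataset.
records = [
    {"sex": 1, "cp": 3, "age": 63, "chol": 233},
    {"sex": 0, "cp": 2, "age": 41, "chol": 204},
    {"sex": 1, "cp": 0, "age": 57, "chol": 354},
]
categorical = ["sex", "cp"]
continuous = ["age", "chol"]

def one_hot(records, column):
    """Return (new column names, one 0/1 row per record) for a categorical column."""
    levels = sorted({r[column] for r in records})
    names = [f"{column}_{level}" for level in levels]
    rows = [[1.0 if r[column] == level else 0.0 for level in levels] for r in records]
    return names, rows

def standardize(records, column):
    """Return z-scores (mean 0, population standard deviation 1) for a continuous column."""
    values = [r[column] for r in records]
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]

def build_matrix(records, categorical, continuous):
    """Concatenate one-hot and standardized columns into one numeric matrix."""
    names, matrix = [], [[] for _ in records]
    for col in categorical:
        col_names, rows = one_hot(records, col)
        names += col_names
        for out, row in zip(matrix, rows):
            out.extend(row)
    for col in continuous:
        names.append(col)
        for out, z in zip(matrix, standardize(records, col)):
            out.append(z)
    return names, matrix

feature_names, X = build_matrix(records, categorical, continuous)
print(feature_names)
```

Scaling keeps a wide-range column such as chol from dominating the distance computation simply because of its units.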
Goals
1. To group samples into categories with correlated features.
2. To predict eventual heart disease.
K-Means
Elbow-method
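The elbow method runs K-Means for several values of k and looks for the point where the within-cluster sum of squares (inertia) stops dropping sharply. A minimal standard-library sketch of that idea follows. It uses synthetic 2-D points rather than the heart disease data, and the function names, seeds, and restart count are illustrative assumptions, not the authors' implementation:

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels, inertia)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia

# Synthetic data: three Gaussian blobs (an assumption made for the demo).
rng = random.Random(42)
blob_centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
points = [
    [rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)]
    for cx, cy in blob_centers
    for _ in range(30)
]

# Inertia for k = 1..6, keeping the best of a few random restarts per k.
inertias = {}
for k in range(1, 7):
    inertias[k] = min(kmeans(points, k, seed=s)[2] for s in range(5))
    print(k, round(inertias[k], 2))
```

Plotting the printed inertia against k gives the elbow curve; the bend suggests a reasonable number of clusters, although reading it is a judgment call.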
PCA plot
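A PCA plot projects each record onto the two directions of greatest variance so that clusters can be inspected in 2-D. The sketch below shows one way to compute those coordinates with the standard library only, using power iteration with deflation on the covariance matrix. The data is synthetic and every function name is hypothetical; in practice a library routine would normally be used instead:

```python
import random

def center(rows):
    """Subtract the column mean from every value."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_eigenvector(m, n_iter=500, seed=0):
    """Power iteration: returns (eigenvalue, unit eigenvector) of largest magnitude."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in m]
    norm = dot(v, v) ** 0.5
    v = [x / norm for x in v]
    for _ in range(n_iter):
        w = [dot(row, v) for row in m]
        norm = dot(w, w) ** 0.5
        if norm == 0:
            break
        v = [x / norm for x in w]
    eigenvalue = dot(v, [dot(row, v) for row in m])
    return eigenvalue, v

def pca_2d(rows):
    """Return (2-D coordinates, (v1, v2), (lam1, lam2))."""
    centered = center(rows)
    cov = covariance(centered)
    lam1, v1 = top_eigenvector(cov)
    # Deflation: remove the first component so the next run finds the second.
    d = len(cov)
    deflated = [[cov[i][j] - lam1 * v1[i] * v1[j] for j in range(d)] for i in range(d)]
    lam2, v2 = top_eigenvector(deflated, seed=1)
    projected = [[dot(r, v1), dot(r, v2)] for r in centered]
    return projected, (v1, v2), (lam1, lam2)

# Synthetic 3-D data whose variance lies mostly along one direction.
rng = random.Random(0)
data = []
for _ in range(100):
    t = rng.gauss(0, 3)
    data.append([t + rng.gauss(0, 0.1), 2 * t + rng.gauss(0, 0.1), rng.gauss(0, 0.5)])

projected, (v1, v2), (lam1, lam2) = pca_2d(data)
print("variance along PC1 and PC2:", round(lam1, 3), round(lam2, 3))
```

The two columns of `projected` are the x and y coordinates of a PCA scatter plot; coloring each point by its cluster label shows how well the clusters separate.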
t-SNE plot
Each feature has a different level of importance in determining the cluster assignment of a given record.
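The slides do not say how feature importance was measured. One simple option, shown here purely as an assumption, is the share of each feature's variance that is explained by the cluster labels (between-cluster sum of squares divided by total sum of squares). The toy rows, labels, and function name are hypothetical:

```python
def cluster_separation(rows, labels):
    """Per-feature ratio of between-cluster to total sum of squares (0 to 1)."""
    n = len(rows)
    scores = []
    for col in zip(*rows):
        overall = sum(col) / n
        total_ss = sum((x - overall) ** 2 for x in col)
        between_ss = 0.0
        for lab in set(labels):
            members = [x for x, l in zip(col, labels) if l == lab]
            cluster_mean = sum(members) / len(members)
            between_ss += len(members) * (cluster_mean - overall) ** 2
        scores.append(between_ss / total_ss if total_ss > 0 else 0.0)
    return scores

# Toy example: the first feature separates the two clusters, the second does not.
rows = [
    [1.0, 5.0], [1.2, 1.0], [0.8, 3.0],
    [5.0, 5.0], [5.2, 1.0], [4.8, 3.0],
]
labels = [0, 0, 0, 1, 1, 1]

scores = cluster_separation(rows, labels)
for name, score in zip(["feature_a", "feature_b"], scores):
    print(name, round(score, 3))
```

A score near 1 means the clusters differ strongly on that feature, while a score near 0 means the feature contributes little to telling them apart.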
Pietro Mondini
Nicolò Moroni
Alessandro Crippa