Single imputation: robust, simple to implement and effective when outliers are present
Multiple imputation: captures imputation variability, suitable for complex datasets
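The contrast between the two strategies can be sketched with scikit-learn. This is a minimal illustration, not the pipeline used in this project: the toy array is hypothetical, the median strategy is an assumption suggested by the "robust when outliers are present" remark, and `IterativeImputer` with `sample_posterior=True` is one possible way to draw several imputed datasets.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer, SimpleImputer

# Hypothetical toy data: the second column has one missing value and one outlier.
X = np.array([
    [1.0, 10.0],
    [2.0, np.nan],
    [3.0, 12.0],
    [4.0, 11.0],
    [5.0, 500.0],
])

# Single imputation: one filled-in dataset. The median is not pulled by the outlier.
single = SimpleImputer(strategy="median").fit_transform(X)

# Multiple imputation (assumed approach): draw several plausible datasets by
# sampling from the posterior of the imputation model with different seeds.
n_imputations = 5
multiple = [
    IterativeImputer(sample_posterior=True, random_state=seed).fit_transform(X)
    for seed in range(n_imputations)
]

# The spread of the imputed value across datasets reflects imputation variability.
imputed_values = np.array([Xi[1, 1] for Xi in multiple])
between_imputation_std = imputed_values.std()
```

Downstream analyses would then be run on each of the imputed datasets and their results pooled, which is what lets multiple imputation carry the uncertainty forward.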
Scaling: continuous variables
Encoding: categorical variables
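A minimal sketch of this split, assuming standardization for the continuous columns and one-hot encoding for the categorical ones (the slides do not name the exact scaler or encoder). The column names `age` and `stage` are hypothetical.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy frame with one continuous and one categorical variable.
df = pd.DataFrame({
    "age": [34, 51, 67, 45],
    "stage": ["I", "II", "II", "III"],
})

preprocess = ColumnTransformer(
    transformers=[
        ("scale", StandardScaler(), ["age"]),                            # continuous
        ("encode", OneHotEncoder(handle_unknown="ignore"), ["stage"]),   # categorical
    ],
    sparse_threshold=0.0,  # always return a dense array
)

# One scaled column followed by one indicator column per category.
X = preprocess.fit_transform(df)
```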
Our goals:
Silhouette Score
The Silhouette Score measures how similar a point is to its own cluster compared to other clusters.
It ranges from -1 to 1, and comparing it across candidate values helps to choose a suitable value of k.
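As a sketch of how the score can guide the choice of k, the loop below fits a clustering for several candidate values and keeps the one with the highest mean silhouette. The synthetic blobs and the use of KMeans as the clustering step are assumptions made for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Hypothetical data: three well-separated groups.
X, _ = make_blobs(
    n_samples=300,
    centers=[(-5, -5), (0, 5), (5, -5)],
    cluster_std=0.8,
    random_state=0,
)

# Mean silhouette score for each candidate number of clusters.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The candidate with the highest mean silhouette is the suggested k.
best_k = max(scores, key=scores.get)
```

On real data the maximum is often less sharp than on this toy example, so the score is best read alongside the visualizations rather than as a final verdict.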
Spectral Clustering
Spectral Clustering builds a similarity graph over the points and divides them into clusters using the eigenvectors of the graph Laplacian.
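A minimal sketch with scikit-learn's `SpectralClustering`, assuming a k-nearest-neighbors similarity graph; the slides do not state which affinity or parameters were used. The two-moons data is a hypothetical example of a shape that centroid-based methods handle poorly.

```python
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_moons

# Hypothetical data: two interleaved half-circles.
X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

model = SpectralClustering(
    n_clusters=2,
    affinity="nearest_neighbors",  # similarity graph built from k nearest neighbors
    n_neighbors=10,
    assign_labels="kmeans",        # cluster the points in the eigenvector embedding
    random_state=0,
)

# Internally: build the graph, embed the points using Laplacian eigenvectors,
# then group the embedded points into n_clusters clusters.
labels = model.fit_predict(X)
```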
Spectral Clustering
Pros
Cons
Spectral Clustering: PCA visualization
Spectral Clustering: t-SNE visualization
KMeans
KMeans partitions data into k groups by minimizing the sum of squared distances between points and their cluster's centroid.
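The objective can be made concrete with a short sketch: after fitting, the sum of squared distances from each point to its assigned centroid matches the `inertia_` value that scikit-learn reports. The synthetic data and k = 3 are assumptions for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Hypothetical data: three well-separated groups.
X, _ = make_blobs(
    n_samples=200,
    centers=[(-5, -5), (0, 5), (5, -5)],
    cluster_std=0.8,
    random_state=0,
)

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
labels = km.labels_
centroids = km.cluster_centers_

# KMeans objective: sum of squared distances from each point to its own centroid.
sq_dist = ((X - centroids[labels]) ** 2).sum()
```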
KMeans
Pros
Cons
KMeans: PCA visualization
KMeans: t-SNE visualization
Final thoughts
In summary, both methods identify very similar clusters, showing a consistent structure in the dataset.
However, Spectral Clustering provides a more accurate representation of the groupings in the data.
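One way to quantify how similar two clusterings are is the adjusted Rand index (ARI), which is close to 1 when the two label assignments agree and close to 0 when agreement is at chance level. The sketch below is an assumption about how such a comparison could be run, on hypothetical synthetic data; it is not the analysis behind the conclusion above.

```python
from sklearn.cluster import KMeans, SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

# Hypothetical data: three groups.
X, _ = make_blobs(
    n_samples=300,
    centers=[(-3, -3), (0, 3), (3, -3)],
    cluster_std=1.0,
    random_state=0,
)

km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
sc_labels = SpectralClustering(
    n_clusters=3,
    affinity="nearest_neighbors",
    n_neighbors=10,
    random_state=0,
).fit_predict(X)

# ARI ignores how the clusters are numbered and compares only the groupings.
ari = adjusted_rand_score(km_labels, sc_labels)
```

ARI measures agreement between the two methods only; judging which one is more accurate requires an external reference or an internal criterion such as the silhouette score.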
vital_status
4 classifiers:
Accuracy: percentage of correct predictions over all predictions
Precision: percentage of correct positive predictions over all positive predictions
Recall: percentage of correct positive predictions over all actual positives
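A small worked example of the three metrics, using hypothetical binary labels (1 = positive class). With 3 true positives, 2 false positives, 1 false negative and 2 true negatives, the three values differ, which makes the definitions easy to tell apart.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical ground truth and predictions.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 1]

accuracy = accuracy_score(y_true, y_pred)    # (TP + TN) / all = 5 / 8 = 0.625
precision = precision_score(y_true, y_pred)  # TP / (TP + FP)  = 3 / 5 = 0.6
recall = recall_score(y_true, y_pred)        # TP / (TP + FN)  = 3 / 4 = 0.75
```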
Pietro Mondini
Nicolò Moroni - 84576A
Alessandro Crippa - 84583A