Wine-Data-Clustering

The goal of this notebook is to introduce clustering algorithms and apply them to the white wine dataset. Clustering (or grouping) allows us to identify homogeneous groups and recognize patterns within the data without any ground-truth labels. We built the following clustering models for unsupervised learning:

k-means,
agglomerative,
spectral.
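
As a rough illustration of how these three models can be fitted, here is a minimal sketch using scikit-learn. It is not the notebook's actual code: the data is a synthetic stand-in generated with `make_blobs`, and the hyperparameters (`linkage="ward"`, `affinity="nearest_neighbors"`, the random seeds) are assumptions chosen for the example. Only the number of clusters (three) is taken from the results table.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans, SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the wine features (assumption: NOT the real dataset).
X, _ = make_blobs(n_samples=300, centers=3, n_features=11, random_state=0)

# Standardize so that no single feature dominates the distance computations.
X_scaled = StandardScaler().fit_transform(X)

# Hyperparameters below are illustrative assumptions, not the notebook's settings.
models = {
    "k-Means": KMeans(n_clusters=3, n_init=10, random_state=0),
    "Agglomerative": AgglomerativeClustering(n_clusters=3, linkage="ward"),
    "Spectral": SpectralClustering(
        n_clusters=3, affinity="nearest_neighbors", random_state=0
    ),
}

# Fit each model and collect the cluster label assigned to every sample.
labels = {name: model.fit_predict(X_scaled) for name, model in models.items()}

# Count how many samples fall into each cluster, per method.
cluster_sizes = {
    name: np.bincount(lab, minlength=3) for name, lab in labels.items()
}
for name, sizes in cluster_sizes.items():
    print(name, sizes.tolist())
```

The per-cluster counts printed at the end correspond to the kind of information shown in the Cluster 0, Cluster 1, and Cluster 2 columns of the results table.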

We also found that dimensionality reduction is a useful tool for making sense of the data in the absence of supervision information, and that applying PCA improved the clustering process. The basic scores achieved by each algorithm are listed below:

Method	Silhouette	Caliński-Harabasz	Davies-Bouldin	Cluster 0	Cluster 1	Cluster 2
k-Means	0.2116	1261.7120	1.6024	1075	1308	1578
Agglomerative	0.1812	1033.0347	1.6782	1886	1382	693
Spectral	0.2004	1204.9508	1.6378	1229	1403	1329
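
The three scores in the table are internal validation metrics, meaning they need only the features and the predicted labels. A minimal sketch of how such scores can be computed with scikit-learn follows; the data is again a synthetic stand-in (an assumption), so the numbers it prints will not match the table.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the wine features (assumption: NOT the real dataset).
X, _ = make_blobs(n_samples=300, centers=3, n_features=11, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# Any clustering model works here; k-means is used only as an example.
cluster_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)

# Silhouette: ranges from -1 to 1, higher is better.
silhouette = silhouette_score(X_scaled, cluster_labels)
# Calinski-Harabasz: ratio of between- to within-cluster dispersion, higher is better.
calinski_harabasz = calinski_harabasz_score(X_scaled, cluster_labels)
# Davies-Bouldin: average similarity of each cluster to its closest one, lower is better.
davies_bouldin = davies_bouldin_score(X_scaled, cluster_labels)

print(f"Silhouette:        {silhouette:.4f}")
print(f"Calinski-Harabasz: {calinski_harabasz:.4f}")
print(f"Davies-Bouldin:    {davies_bouldin:.4f}")
```

Note the different directions: a higher silhouette or Caliński-Harabasz score indicates better-separated clusters, while a lower Davies-Bouldin index does.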

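Based on the evaluation metrics in the table, the k-means algorithm performed best on this dataset: it achieved the highest silhouette score (0.2116), the highest Caliński-Harabasz score (1261.7120), and the lowest Davies-Bouldin index (1.6024).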
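
The PCA step mentioned earlier can be sketched in the same way. This is a hedged example rather than the notebook's implementation: the choice of two components, the standardization step, and the synthetic `make_blobs` data are all assumptions made for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the wine features (assumption: NOT the real dataset).
X, _ = make_blobs(n_samples=300, centers=3, n_features=11, random_state=0)

# Standardize first, then project onto the top principal components.
# n_components=2 is an illustrative assumption, not the notebook's setting.
reducer = make_pipeline(StandardScaler(), PCA(n_components=2, random_state=0))
X_reduced = reducer.fit_transform(X)

# Fraction of the total variance retained by the kept components.
pca = reducer.named_steps["pca"]
explained = pca.explained_variance_ratio_.sum()

# Cluster in the reduced space instead of the original feature space.
reduced_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_reduced)

print("Reduced shape:", X_reduced.shape)
print(f"Explained variance retained: {explained:.3f}")
```

Checking the retained variance is a simple way to judge how much information the projection discards before the clustering step runs.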

Reference: https://archive.ics.uci.edu/ml/datasets/wine+quality

msikorski93/Wine-Data-Clustering