Clustering
Unsupervised learning introduction
- Which of the following statements are true? Check all that apply.
- Clustering is an example of unsupervised learning.
- In unsupervised learning, you are given an unlabeled dataset and are asked to find “structure” in the data.
- In unsupervised learning, the training set is of the form {x^(1), x^(2), … , x^(m)}, without labels y^(i).
K-means algorithm
- Input:
  - K (number of clusters)
  - Training set {x^(1), x^(2), … , x^(m)}, with x^(i) ∊ R^n (drop the x0 = 1 convention)
- Randomly initialize K cluster centroids μ1, μ2, … , μK ∊ R^n
Repeat {
    for i = 1 to m
        c^(i) := index (from 1 to K) of cluster centroid closest to x^(i)
    for k = 1 to K
        μk := average (mean) of points assigned to cluster k
}
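The loop above can be sketched in NumPy roughly as follows. This is a minimal illustration, not a reference implementation. The function name `kmeans`, the fixed iteration count `n_iters`, and the `seed` parameter are assumptions made for the example. Initializing the centroids from K randomly chosen training examples and leaving an empty cluster's centroid unchanged are also assumptions, since the notes only say "randomly initialize". Cluster indices run from 0 to K-1 here rather than 1 to K.

```python
import numpy as np

def kmeans(X, K, n_iters=10, seed=0):
    """Minimal K-means sketch. X has shape (m, n). Returns (centroids, c)."""
    rng = np.random.default_rng(seed)
    m = X.shape[0]
    # Assumption: initialize the centroids as K distinct training examples.
    centroids = X[rng.choice(m, size=K, replace=False)].astype(float)
    c = np.zeros(m, dtype=int)
    for _ in range(n_iters):  # assumption: fixed number of passes instead of a convergence test
        # Cluster assignment step: c[i] = index of the centroid closest to x^(i).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)  # shape (m, K)
        c = dists.argmin(axis=1)
        # Move centroid step: mu_k = mean of the points assigned to cluster k.
        for k in range(K):
            members = X[c == k]
            if len(members) > 0:  # assumption: an empty cluster keeps its old centroid
                centroids[k] = members.mean(axis=0)
    return centroids, c

# Usage on a small made-up dataset (m = 6, n = 2).
X = np.array([[1.0, 2.0], [1.5, 1.8], [1.0, 0.6],
              [8.0, 8.0], [9.0, 9.5], [8.5, 9.0]])
centroids, c = kmeans(X, K=2)
print(centroids)
print(c)
```

Which clusters come out depends on the random initialization, so different seeds can give different assignments.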
K-means for non-separated clusters