1. K-Means Algorithm
- Randomly choose $k$ points as the initial centroids; the $i$-th centroid is $\mu_i$
- Divide all points into $k$ groups by assigning each point to the centroid nearest to it
- Move each centroid to the average of the points in its group
- Repeat the assignment and update steps until no centroid changes
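
The steps above can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the function names, the `max_iters` cap, the fixed `seed`, and the choice to leave an empty group's centroid where it was are all assumptions not stated in these notes.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(p, centroids[j]))

def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, k, max_iters=100, seed=0):
    rng = random.Random(seed)
    # Step 1: pick k distinct examples as the initial centroids.
    centroids = rng.sample(points, k)
    for _ in range(max_iters):
        # Step 2: assign every point to its nearest centroid.
        assignments = [nearest(p, centroids) for p in points]
        # Step 3: move each centroid to the mean of its group.
        new_centroids = []
        for j in range(k):
            group = [p for p, c in zip(points, assignments) if c == j]
            # Assumption: an empty group keeps its previous centroid.
            new_centroids.append(mean_point(group) if group else centroids[j])
        # Step 4: stop once no centroid has moved.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Recompute the groups so they match the centroids being returned.
    return centroids, [nearest(p, centroids) for p in points]
```

Calling `kmeans(points, 2)` on a list of coordinate tuples returns the final centroids and, for each point, the index of the group it ended up in.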
2. Optimization Objective of K-Means
Let $c_{(i)}$ denote the group that the $i$-th point $x^{(i)}$ belongs to. Our task is then
$$\min_{c,\mu} \quad J(c,\mu)=\sum_i \left\lVert x^{(i)}-\mu_{c_{(i)}} \right\rVert^2$$
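
As a small sketch of how $J$ could be evaluated for a given assignment, the function below sums the squared distances exactly as in the objective. The name `kmeans_cost` and the representation of points as tuples are assumptions made for illustration.

```python
def kmeans_cost(points, assignments, centroids):
    """J(c, mu): sum of squared distances from each point to its assigned centroid."""
    total = 0.0
    for point, c in zip(points, assignments):
        mu = centroids[c]  # centroid of the group this point belongs to
        total += sum((a - b) ** 2 for a, b in zip(point, mu))
    return total
```

For the points `(0, 0)`, `(2, 0)`, `(10, 0)` with the first two assigned to a centroid at `(1, 0)` and the third to a centroid at `(10, 0)`, the cost is $1 + 1 + 0 = 2$.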
3. Random Initialization
Randomly pick $k$ examples as the initial centroids, where $k$ is the number of centroids
K-Means may get stuck in a local optimum: initialize and run K-Means many times, then pick the solution with the lowest $J$
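
A rough sketch of the restart strategy follows. It runs K-Means from several random initializations and keeps the run with the lowest $J$. The helper names, the default of 20 restarts, the `max_iters` cap, and the handling of empty groups are assumptions, not something these notes prescribe.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def run_kmeans(points, k, rng, max_iters=100):
    """One K-Means run from a random initialization; returns (J, centroids, labels)."""
    centroids = rng.sample(points, k)  # k examples as initial centroids
    for _ in range(max_iters):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            group = [p for p, c in zip(points, labels) if c == j]
            if group:
                updated.append(tuple(sum(xs) / len(group) for xs in zip(*group)))
            else:
                updated.append(centroids[j])  # assumption: keep an empty group's centroid
        if updated == centroids:
            break
        centroids = updated
    labels = assign(points, centroids)
    cost = sum(sq_dist(p, centroids[c]) for p, c in zip(points, labels))
    return cost, centroids, labels

def best_of_restarts(points, k, n_restarts=20, seed=0):
    """Run K-Means n_restarts times and keep the run with the lowest J."""
    rng = random.Random(seed)
    runs = (run_kmeans(points, k, rng) for _ in range(n_restarts))
    return min(runs, key=lambda run: run[0])
```

More restarts make it more likely that at least one run avoids a poor local optimum, at the price of proportionally more computation.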
4. Choose the Number of Clusters
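Elbow method (plot $J$ against the number of clusters and pick the point where the decrease levels off) / Choose the number that best serves the later purpose of the clustering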