K-means Clustering
Let A1=(3,8), A2=(9,4), A3=(4,9), A4=(2,2), A5=(10,5), A6=(2,4), A7=(6,8), A8=(4,3), A9=(4,7)\text{A1=(3,8), A2=(9,4), A3=(4,9), A4=(2,2), A5=(10,5), A6=(2,4), A7=(6,8), A8=(4,3), A9=(4,7)}

In this problem we will use K=3K=3
Goal: Find member and centroid of each cluster
Step
- Choose K-centroid randomly (It can be any points)
- Find the nearest centroid cjc_{j} for each point and assign each point to those centroid
In this problem we use Euclidean Distance
Distance(A,B)=(AX−BX)2+(AY−BY)2\text{Distance}(A, B) = \sqrt{ (A_{X} - B_{X})^2 + (A_{Y} - B_{Y})^2 }Let assume that we have assigned each point to its nearest centroid using Euclidean distance as a matric distance and get this result.
- j1 has [A1, A3, A7, A9]
- j2 has [A4, A6, A8]
- j3 has [A2, A5]
- Find new centroid cjc_{j} and see if cluster member change or not
- If cluster member remain the same : Finish!
- If cluster member change : Find new centroid again!
from j = 1 to K
cj(X)=1n∑i=1nxicj(Y)=1n∑i=1nyicj=(cj(X), cj(Y))\begin{align} c_{j}(X) = \dfrac{1}{n}\sum_{i=1}^{n}x_{i} \\ c_{j}(Y) = \dfrac{1}{n}\sum_{i=1}^{n}y_{i} \\\\ c_{j} = (c_{j}(X),\ c_{j}(Y)) \end{align}where
n=Number of points in cluster with centroid cjxi=X coordinate of Point in cluster with centroid cjyi=Y coordinate of Point in cluster with centroid cj\begin{align} n = \text{Number of points in cluster with centroid } c_{j} \\ x_{i} = \text{X coordinate of Point in cluster with centroid } c_{j} \\ y_{i} = \text{Y coordinate of Point in cluster with centroid } c_{j} \end{align}Example
| j | cj(X)c_{j}(X) | cj(Y)c_{j}(Y) | centroid cjc_{j} |
|---|---|---|---|
| 1 | 3+4+5+64=4.25\dfrac{3+4+5+6}{4}=4.25 | 8+9+8+74=8\dfrac{8+9+8+7}{4}=8 | (4.25, 8)(4.25,\ 8) |
| 2 | 2+2+43=2.667\dfrac{2+2+4}{3}=2.667 | 4+3+23=3\dfrac{4+3+2}{3}=3 | (2.667, 3)(2.667,\ 3) |
| 3 | 10+92=9.5\dfrac{10+9}{2}=9.5 | 4+52=4.5\dfrac{4+5}{2}=4.5 | (9.5, 4.5)(9.5,\ 4.5) |
- j1 has [A1, A3, A7, A9]
- j2 has [A4, A6, A8]
- j3 has [A2, A5]
Member of cluster is not changed so FINISH!
Result
