This presentation will teach you the basics of clustering, including:
\[dist(\vec{x}, \vec{y}) = ||\vec{x} - \vec{y}||^2_2\]
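As a small illustration of the distance above, here is a minimal NumPy sketch. The helper name `squared_euclidean` and the sample vectors are assumptions made for this example, not part of the original slides.

```python
import numpy as np

def squared_euclidean(x, y):
    """Squared L2 distance between two vectors of equal length."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.dot(diff, diff))

x = np.array([1.0, 2.0])
y = np.array([4.0, 6.0])
print(squared_euclidean(x, y))  # (-3)^2 + (-4)^2 = 25.0
```

Because the square root is monotonic, ranking centers by squared distance or by plain Euclidean distance picks the same nearest center.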
Two main categories:
Goal: Break up the image into meaningful or perceptually similar regions
An iterative clustering algorithm:
Initial random centers (K=2)
Assign points to nearest center
Change the cluster center to the average of the assigned points
Repeat until convergence
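In symbols, the assignment and update steps can be sketched as follows. The notation is assumed here rather than taken from the slides: \(\vec{\mu}_j\) is the center of cluster \(j\), \(c_i\) is the cluster assigned to point \(\vec{x}_i\), and \(S_j\) is the set of points currently assigned to cluster \(j\).
\[c_i = \arg\min_{j} ||\vec{x}_i - \vec{\mu}_j||^2_2\]
\[\vec{\mu}_j = \frac{1}{|S_j|} \sum_{i \in S_j} \vec{x}_i, \quad S_j = \{ i : c_i = j \}\]
The algorithm has converged when an update step leaves every \(\vec{\mu}_j\) unchanged, which also means the assignments no longer change.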
Let’s implement K-means clustering step by step!
import numpy as np

# Generate 3 clusters
cluster1 = np.random.normal(loc=[2, 2], scale=0.5, size=(100, 2))
cluster2 = np.random.normal(loc=[8, 3], scale=0.5, size=(100, 2))
cluster3 = np.random.normal(loc=[5, 7], scale=0.5, size=(100, 2))
# Combine all data
data = np.vstack([cluster1, cluster2, cluster3])
# Number of clusters
k = 3
# Initialize centers randomly
centers = data[np.random.choice(len(data), k, replace=False)]
def assign_clusters(data, centers):
    """
    Assign each data point to nearest center
    Args:
        data: Array of data points (n_samples, n_features)
        centers: Array of cluster centers (k, n_features)
    Returns:
        Array of cluster assignments for each point
    """
    distances = np.sqrt(((data - centers[:, np.newaxis]) ** 2).sum(axis=2))
    return np.argmin(distances, axis=0)
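The remaining steps (moving each center to the mean of its points and repeating until nothing changes) could look roughly like the sketch below. The names `update_centers` and `kmeans`, the `max_iter` and `seed` parameters, and the choice to leave an empty cluster's center in place are assumptions made for this example. `assign_clusters` is repeated so the block runs on its own.

```python
import numpy as np

def assign_clusters(data, centers):
    """Index of the nearest center for each point (same logic as above)."""
    distances = np.sqrt(((data - centers[:, np.newaxis]) ** 2).sum(axis=2))
    return np.argmin(distances, axis=0)

def update_centers(data, labels, centers):
    """Move each center to the mean of its assigned points.

    A center with no assigned points is left where it is (an assumption;
    other strategies, such as re-seeding, are possible).
    """
    new_centers = centers.copy()
    for j in range(len(centers)):
        members = data[labels == j]
        if len(members) > 0:
            new_centers[j] = members.mean(axis=0)
    return new_centers

def kmeans(data, k, max_iter=100, seed=0):
    """Alternate assignment and update steps until the centers stop moving."""
    rng = np.random.default_rng(seed)
    # Initialize centers as k distinct, randomly chosen data points
    centers = data[rng.choice(len(data), k, replace=False)].astype(float)
    for _ in range(max_iter):
        labels = assign_clusters(data, centers)
        new_centers = update_centers(data, labels, centers)
        if np.allclose(new_centers, centers):
            break  # converged: the update step changed nothing
        centers = new_centers
    # Recompute labels so they match the returned centers
    return centers, assign_clusters(data, centers)

# Same three synthetic clusters as in the slides, with a fixed seed
rng = np.random.default_rng(42)
data = np.vstack([
    rng.normal(loc=[2, 2], scale=0.5, size=(100, 2)),
    rng.normal(loc=[8, 3], scale=0.5, size=(100, 2)),
    rng.normal(loc=[5, 7], scale=0.5, size=(100, 2)),
])
centers, labels = kmeans(data, k=3)
print(centers)
```

K-means only finds a local optimum, so a different `seed` can give a different final set of centers. Running it several times and keeping the best result is a common safeguard.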
Let’s segment an image using K-means
This line sets the stopping criteria: stop after 100 iterations, or earlier once the cluster centers move by less than the tolerance (epsilon = 0.85) between iterations.
Random centers are initially chosen.
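To show how these pieces fit together, here is a NumPy-only sketch of color-based segmentation. It is not the code from the slides: the function `segment_image`, its parameters, and the synthetic two-color test image are all assumptions made for this example. Each pixel is treated as a point in color space, the initial centers are random pixels, and the loop stops after `max_iter` iterations or once no center moves by more than `epsilon`.

```python
import numpy as np

def nearest_center(pixels, centers):
    """Label each pixel with the index of its closest center."""
    # Squared distances, shape (n_pixels, k)
    d2 = ((pixels[:, np.newaxis, :] - centers[np.newaxis, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def segment_image(image, k, max_iter=100, epsilon=0.85, seed=0):
    """Cluster pixel colors with K-means and repaint each pixel with its center."""
    rng = np.random.default_rng(seed)
    pixels = image.reshape(-1, image.shape[-1]).astype(np.float64)
    # Random centers are initially chosen from the pixels themselves
    centers = pixels[rng.choice(len(pixels), k, replace=False)]
    for _ in range(max_iter):  # criterion 1: at most max_iter iterations
        labels = nearest_center(pixels, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = pixels[labels == j]
            if len(members) > 0:  # keep an empty cluster's center unchanged
                new_centers[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centers - centers, axis=1).max()
        centers = new_centers
        if shift < epsilon:  # criterion 2: centers barely moved
            break
    labels = nearest_center(pixels, centers)
    segmented = np.rint(centers[labels]).astype(np.uint8).reshape(image.shape)
    return segmented, labels.reshape(image.shape[:2])

# Synthetic 20x30 RGB image: reddish left half, bluish right half, plus noise
rng = np.random.default_rng(1)
image = np.zeros((20, 30, 3))
image[:, :15] = [200, 30, 30]
image[:, 15:] = [30, 30, 200]
image = np.clip(image + rng.normal(0, 5, image.shape), 0, 255).astype(np.uint8)

segmented, label_map = segment_image(image, k=2)
print(segmented.shape, label_map.shape)
```

The output image contains at most `k` distinct colors, one per cluster center, which is what makes the regions visually uniform.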
Penn State ACM MLPSU