Data Clustering: K-means, MST based
Generate Data
Go to Control Panel → Data Generation.
- Adjust the "Spread" (variance) using the slider. This controls how scattered the points will be.
- Click "Generate 250 Random Points" to populate the canvas with points.
- Optional: Click "Clear" if you want to remove all points and start over.
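The guide does not say how the points are drawn, so the following is only a minimal sketch of one plausible generator. It assumes Gaussian blobs around a few random centers, with "Spread" treated as the variance. The name `generate_points` and the parameters `n_blobs`, `width`, and `height` are hypothetical and not taken from the simulation.

```python
import math
import random

def generate_points(n=250, spread=25.0, n_blobs=3, width=100.0, height=100.0, seed=None):
    """Return n (x, y) points scattered around n_blobs random centers.

    Assumption: `spread` is a variance, so the standard deviation of the
    Gaussian noise around each center is sqrt(spread).
    """
    rng = random.Random(seed)
    sigma = math.sqrt(spread)
    # Pick the blob centers uniformly inside the canvas.
    centers = [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n_blobs)]
    # Assign points to blobs round-robin and add Gaussian noise per axis.
    return [
        (rng.gauss(centers[i % n_blobs][0], sigma), rng.gauss(centers[i % n_blobs][1], sigma))
        for i in range(n)
    ]
```

A larger `spread` makes the blobs overlap more, which usually makes the clusters harder to separate.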
Set Algorithm Parameters
In Algorithm Parameters:
- K (Number of Clusters): Adjust using the slider (between 2 and 10).
- Distance Metric: Choose between Euclidean, Manhattan, or Chebyshev.
- Initialization Method: Currently only "Random" is available (K-means++ and other methods can be added later).
- Maximum Iterations: Set the upper limit on the number of iterations; the run stops at this limit even if the clustering has not converged.
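For reference, the three distance metrics and a random initialization can be sketched as below. The metric formulas are standard. The function names and the choice to pick K existing data points as the initial centroids are assumptions, since the guide does not say how "Random" initialization is implemented.

```python
import random

def euclidean(p, q):
    """Straight-line distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def chebyshev(p, q):
    """Largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(p, q))

# Lookup table keyed by the names used in the Distance Metric option.
METRICS = {"Euclidean": euclidean, "Manhattan": manhattan, "Chebyshev": chebyshev}

def random_init(points, k, seed=None):
    """Assumed 'Random' initialization: use k distinct data points as centroids."""
    return random.Random(seed).sample(points, k)
```

For the points (0, 0) and (3, 4), the Euclidean distance is 5, the Manhattan distance is 7, and the Chebyshev distance is 4.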
Adjust Visualization Options (Optional)
Toggle features to enhance clarity:
- Show Centroids: Displays the centroid of each cluster.
- Show Cluster Boundaries (Voronoi): Draws the boundaries between clusters.
- Show Centroid History: Traces the path each centroid followed from its initial position to its final position.
- Show Grid: Displays gridlines on the graph.
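One way to picture the Voronoi boundaries is to label every grid cell with its nearest centroid and mark the cells where the label changes. The sketch below does this with the Euclidean distance. It illustrates the idea only and makes no claim about how the simulation renders boundaries; `voronoi_labels`, `boundary_cells`, and the `cell` size are hypothetical.

```python
def voronoi_labels(centroids, width, height, cell=1.0):
    """Label each grid cell with the index of its nearest centroid (Euclidean)."""
    cols, rows = int(width / cell), int(height / cell)
    grid = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Use the cell centre as the sample position.
            x, y = (c + 0.5) * cell, (r + 0.5) * cell
            row.append(min(
                range(len(centroids)),
                key=lambda j: (x - centroids[j][0]) ** 2 + (y - centroids[j][1]) ** 2,
            ))
        grid.append(row)
    return grid

def boundary_cells(grid):
    """Return (row, col) cells whose right or lower neighbour has a different label."""
    cells = set()
    for r, row in enumerate(grid):
        for c, label in enumerate(row):
            right = row[c + 1] if c + 1 < len(row) else label
            below = grid[r + 1][c] if r + 1 < len(grid) else label
            if right != label or below != label:
                cells.add((r, c))
    return cells
```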
Controls
Button | Action |
---|---|
Initialize | Sets up initial centroids and allows simulation control. |
Step | Performs one K-means iteration. |
Run | Starts the simulation, iteratively updating clusters and centroids. |
Pause | Temporarily halts the ongoing simulation. |
Reset | Clears clustering progress but keeps data points intact. |
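The Step and Run actions can be sketched as follows: one iteration assigns each point to its nearest centroid and then moves each centroid to the mean of its points, and a run repeats this until the centroids stop moving or the maximum number of iterations is reached. The function names, the convergence tolerance `tol`, and the handling of empty clusters are assumptions. The mean update is kept regardless of the metric passed in, which may differ from what the simulation does.

```python
from math import dist as euclidean  # Euclidean distance, Python 3.8+

def kmeans_step(points, centroids, dist=euclidean):
    """One iteration: assign points to the nearest centroid, then recompute centroids."""
    k = len(centroids)
    labels = [min(range(k), key=lambda j: dist(p, centroids[j])) for p in points]
    new_centroids = []
    for j in range(k):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            # New centroid is the coordinate-wise mean of the cluster's points.
            new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(tuple(centroids[j]))
    return labels, new_centroids

def run(points, centroids, dist=euclidean, max_iter=100, tol=1e-9):
    """Repeat kmeans_step until the centroids move by at most tol or max_iter is hit."""
    labels = []
    history = [list(centroids)]  # centroid positions after each iteration
    for iteration in range(1, max_iter + 1):
        labels, new_centroids = kmeans_step(points, centroids, dist)
        moved = max(dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        history.append(list(centroids))
        if moved <= tol:
            return labels, centroids, iteration, True, history
    return labels, centroids, max_iter, False, history
```

The `history` list holds the kind of data a centroid-history display needs: one list of centroid positions per iteration.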
In the Info Panel, observe:
- Iteration Count and Convergence Status
- Point Count and Cluster Count
- Metrics:
  - Intra-cluster Variance
  - Silhouette Score
  - Davies-Bouldin Index
- Cluster Statistics: Size, centroid coordinates, variance per cluster.
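The metrics listed above can be computed roughly as in the sketch below, which uses the textbook definitions with the Euclidean distance. The exact formulas used by the Info Panel are not documented here, so treat these as assumptions, in particular the per-cluster variance as the mean squared distance to the centroid. The code assumes at least two non-empty clusters with distinct centroids, and all function names are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+

def group_clusters(points, labels):
    """Return (points per label, centroid per label)."""
    groups = {}
    for p, label in zip(points, labels):
        groups.setdefault(label, []).append(p)
    cents = {l: tuple(sum(c) / len(m) for c in zip(*m)) for l, m in groups.items()}
    return groups, cents

def cluster_variance(points, labels):
    """Mean squared distance of each cluster's points to its centroid."""
    groups, cents = group_clusters(points, labels)
    return {l: sum(dist(p, cents[l]) ** 2 for p in m) / len(m) for l, m in groups.items()}

def silhouette_score(points, labels):
    """Mean of (b - a) / max(a, b); higher means better-separated clusters."""
    groups, _ = group_clusters(points, labels)
    total = 0.0
    for p, label in zip(points, labels):
        own = groups[label]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other points of the same cluster.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the points of any other cluster.
        b = min(sum(dist(p, q) for q in m) / len(m)
                for other, m in groups.items() if other != label)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def davies_bouldin(points, labels):
    """Mean over clusters of the worst (S_i + S_j) / d(c_i, c_j); lower is better."""
    groups, cents = group_clusters(points, labels)
    scatter = {l: sum(dist(p, cents[l]) for p in m) / len(m) for l, m in groups.items()}
    worst = [max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                 for j in groups if j != i)
             for i in groups]
    return sum(worst) / len(worst)
```

As a rule of thumb, a Silhouette Score close to 1 and a Davies-Bouldin Index close to 0 both indicate compact, well-separated clusters.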
Export the Data
- Export Image: Saves an image of the current simulation state.
- Export Data: Saves the current simulation state as a CSV file.
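The layout of the exported CSV is not specified in this guide. As a hedged illustration, the sketch below writes one row per point with assumed columns `x`, `y`, and `cluster`; the function name `export_csv` and the column names are hypothetical.

```python
import csv
import io

def export_csv(points, labels, out):
    """Write points and their cluster labels to a file-like object as CSV."""
    writer = csv.writer(out)
    writer.writerow(["x", "y", "cluster"])  # assumed header, not the simulation's
    for (x, y), label in zip(points, labels):
        writer.writerow([x, y, label])

# Usage: write to an in-memory buffer; a real file opened with newline="" works too.
buffer = io.StringIO()
export_csv([(1.5, 2.0), (3.0, 4.5)], [0, 1], buffer)
```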