K-means without iteration
Hello,
My problem is that it is difficult to determine the optimal number of clusters with k-means, so I thought of using a hierarchical algorithm to find it. Once I have settled on that classification, I want to use it to find the centroids with k-means, without iteration.
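data = rand(300,5);           % 300 observations, 5 variables
D = pdist(data);              % pairwise Euclidean distances
Z = linkage(D,'ward');        % Ward hierarchical tree
T = cluster(Z,'maxclust',6);  % cut the tree into at most 6 clusters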
Now I want to pass the clusters defined in vector T, together with the point positions, to the k-means algorithm without iterating. Can anyone give me a tip on how to do this?
Thank you.
Accepted Answer
More Answers (1)
Manuel
on 1 Mar 2013
1 Comment
Tom Lane
on 4 Mar 2013
I would not expect the hierarchical and k-means results to match. Even though you're using Ward's linkage, which is based on distances to centroids, the centroids shift around as the clustering progresses. The I value you computed is the result intended to simulate the "k-means without iteration" process you requested.
Here's an attempt to show what is going on. Each point is clustered using both hierarchical and k-means clustering, with a Voronoi diagram superimposed. The k-means values match the Voronoi regions, but the hierarchical values sometimes do not.
data = rand(200,2);
Y = pdist(data);
Z = linkage(Y,'ward');
T = cluster(Z,'maxclust',6);
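means = grpstats(data, T); % centroid (mean) of each hierarchical cluster
D = pdist2(data, means); % distance from every point to every centroid
[C,I] = min(D,[],2); % I = index of the nearest centroid for each point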
gscatter(data(:,1),data(:,2),T) % hierarchical clusters
hold on
gscatter(data(:,1),data(:,2),I,[],'o',10) % k-means assignments
gscatter(means(:,1),means(:,2),(1:6)',[],'+',20) % centroids
voronoi(means(:,1),means(:,2)) % voronoi regions
hold off
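To put a number on the mismatch that the plot shows, one could count how many points change cluster when they are reassigned to the nearest hierarchical centroid. The following is only a sketch under a few assumptions of mine: the fixed seed, the variable names nMoved and idx, and the final kmeans call seeded through its 'Start' option are additions for illustration, not part of the code above.

```matlab
rng(0);                                   % fixed seed, only for repeatability (assumption)
data = rand(200,2);
k = 6;
T = cluster(linkage(pdist(data),'ward'),'maxclust',k);   % hierarchical labels

means = grpstats(data,T);                 % centroid of each hierarchical cluster
[~,I] = min(pdist2(data,means),[],2);     % nearest-centroid label for each point

nMoved = nnz(I ~= T);                     % points whose label changes
fprintf('%d of %d points change cluster\n', nMoved, size(data,1));

% Optional: let k-means iterate to convergence from the same centroids.
% Row j of the 'Start' matrix seeds cluster j, so idx is comparable to T.
idx = kmeans(data,k,'Start',means);
fprintf('%d of %d points differ after full k-means\n', nnz(idx ~= T), size(data,1));
```

If nMoved is zero, the hierarchical partition already coincides with the Voronoi regions of its own centroids. Otherwise the second count gives a rough idea of how far a full k-means run drifts from the hierarchical result.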