The "cluster" command with big data: it used to run fast but now runs slowly

Hello!
I have a matrix in the variable "dat".
Number of rows = 564372
Number of columns = 11
Each row represents an observation, and I need to cluster this data. The "kmeans" command works fast, and now I'm trying agglomerative clustering. I computed the linkage (it took about 8 hours) with the command:
Z=linkage(dat,'centroid','euclidean','savememory','on');
Then I came home and computed a few clusterings with different thresholds:
T=cluster(Z,'cutoff',1.4);
I was extremely surprised when I saw that the cluster computation took only 10-15 seconds and the result was fine. Then I saved my linkage data:
dlmwrite('Z-linkage.txt',Z);
The next day I launched Matlab, imported Z-linkage.txt and tried to compute the clusters again. But this time it runs very slowly. It may take hours, and I have no idea what the problem is.
Please help!
Thank you for any suggestions.
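
A minimal sketch of one thing that might be worth checking in the workflow above: the text-file round trip. It assumes (this is not confirmed anywhere in the thread) that dlmwrite with default settings writes only about five significant digits, which would alter the large cluster indices in the first two columns of Z. The matrix below is a made-up two-row stand-in, not the actual linkage output, and the temporary file name is hypothetical.

```matlab
% Made-up stand-in for two rows of a linkage matrix:
% columns 1-2 are cluster indices, column 3 is the merge distance.
Z = [564371  564372 0.123456789;
     564373 1128743 1.412345678];

fname = [tempname '.txt'];   % hypothetical temporary file

% Round trip with the default dlmwrite precision
% (assumed to be about 5 significant digits).
dlmwrite(fname, Z);
Zdefault = dlmread(fname);

% Round trip with an explicit high-precision format.
dlmwrite(fname, Z, 'precision', '%.17g');
Zprecise = dlmread(fname);
delete(fname);

% Compare the index columns, which must be exact integers.
idxDefaultOk = isequal(Z(:,1:2), Zdefault(:,1:2));
idxPreciseOk = isequal(Z(:,1:2), Zprecise(:,1:2));
fprintf('indices preserved, default precision: %d\n', idxDefaultOk);
fprintf('indices preserved, %%.17g precision:   %d\n', idxPreciseOk);
```

If the first check reports that the indices were not preserved, storing Z in a MAT-file with save and reading it back with load would avoid the text conversion altogether.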

Answers (1)

John D'Errico on 30 Nov 2016
Since we have absolutely nothing to go on about the actual data, I can only guess.
Clustering tools usually use random starts. That means you may get lucky sometimes and see rapid convergence.
  1 Comment
Kerim Khemraev on 1 Dec 2016
Edited: Kerim Khemraev on 1 Dec 2016
Thank you for the reply.
Maybe, but I computed clusters many times (about 50 times) during one Matlab session, and it was fast every time. I don't think I got a lucky random start that many times.
I can share my data. Here it is:
