Command "cluster" with big data: it used to run fast but now it runs slowly
Hello!
I have a matrix in the variable "dat".
Number of rows = 564372
Number of columns = 11
Each row represents an observation, and I need to cluster this data. The command "kmeans" works fast, and now I'm trying agglomerative clustering. I computed the linkage (it took about 8 hours) with the command:
Z=linkage(dat,'centroid','euclidean','savememory','on');
Then I came home and computed a few clusterings with different thresholds:
T=cluster(Z,'cutoff',1.4);
I was extremely surprised when I saw that the cluster computation took only 10-15 seconds and the result was fine. Then I saved my linkage data:
dlmwrite('Z-linkage.txt',Z);
The next day I launched MATLAB, imported Z-linkage.txt and tried to compute the clusters again. But this time it runs very slowly: it may take hours, and I have no idea what the problem is.
Please help!
Thank you for any suggestion
Answers (1)
John D'Errico
on 30 Nov 2016
Since we have absolutely nothing to go on about the actual data, I can only guess.
Clustering tools usually use random starts. That means you may get lucky sometimes and see rapid convergence.
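One thing that can be checked from the posted commands alone is whether the saved linkage survived the trip through a text file. Random starts apply to kmeans; cutting a precomputed linkage tree with cluster is deterministic, so a changed Z is a plausible suspect. As far as I know, dlmwrite writes only about five significant digits by default, which would round cluster indices above 99999 and leave a tree that no longer matches the one that was computed. The sketch below is only a guess at the cause, not a confirmed diagnosis: Zdemo is a made-up stand-in for a few rows of a linkage matrix, and the file names are temporary.

```matlab
% Hypothetical stand-in for a few rows of a linkage matrix:
% columns 1-2 hold cluster indices, column 3 holds the merge distance.
Zdemo = [ 564373  564380  0.123456789;
         1128001      17  1.4        ];

% Round trip through a text file using dlmwrite's default precision.
txtFile = [tempname '.txt'];
dlmwrite(txtFile, Zdemo);
Ztxt = dlmread(txtFile);

% Round trip through a MAT-file, which stores doubles in binary form.
matFile = [tempname '.mat'];
save(matFile, 'Zdemo');
S = load(matFile);

% Compare each reloaded matrix with the original.
textRoundTripExact = isequal(Ztxt, Zdemo);
matRoundTripExact  = isequal(S.Zdemo, Zdemo);
fprintf('text round trip exact: %d\n', textRoundTripExact);
fprintf('MAT round trip exact:  %d\n', matRoundTripExact);

% Clean up the temporary files.
delete(txtFile);
delete(matFile);
```

If the text round trip turns out not to be exact, saving Z with save and reloading it with load, or passing something like 'precision','%.17g' to dlmwrite, should keep the indices intact. Unfortunately a Z that was already written with the default precision cannot be repaired and would need to be recomputed.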