Cluster within percentage of data
Hi, I intend to form clusters from a data set with 4 variables. These 4 variables are design parameters, and each row of the data (a 200,000 x 4 matrix) corresponds to a particular design. I want the clusters to be formed so that similar designs are grouped together and we can work with the cluster centroids instead of with every row. However, kmeans clusters by Euclidean distance. This does not serve the purpose, as it would put (1,1000) and (1000,1) into the same cluster in the 2-variable case, even though the two designs are completely different.
What I want is for each cluster to contain only the rows whose variable values are within x% of the values at the cluster centroid. Say we have a cluster with centroid (20,10,100,50); then for x=10% every row in the cluster should lie within (20±2, 10±1, 100±10, 50±5). I couldn't find any method in cluster analysis that serves this purpose. Please let me know if the logic I am trying to follow is flawed.
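The ±x% membership rule described above can be written down as a small test. This is only a sketch in plain Python (not MATLAB), and the name `within_tolerance` is made up for illustration; it checks one row against one centroid and does not form clusters.

```python
def within_tolerance(point, centroid, pct):
    """Return True if every coordinate of `point` lies within pct% of the
    matching coordinate of `centroid` (the +-x% box from the question)."""
    return all(abs(p - c) <= abs(c) * pct / 100.0
               for p, c in zip(point, centroid))


centroid = (20, 10, 100, 50)
# Every coordinate is inside the +-10% box around the centroid.
print(within_tolerance((21.5, 9.2, 108, 47), centroid, 10))
# The first coordinate is 15% away from 20, so the row is rejected.
print(within_tolerance((23, 10, 100, 50), centroid, 10))
```

Because the allowed width grows with the centroid value, the box is wider for large-valued variables and narrower for small-valued ones, which is what distinguishes this rule from a fixed Euclidean radius.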
5 Comments
Matt Kindig
on 31 Jul 2013
I'm not entirely sure I understand what you are doing, but would scaling each column (perhaps by the maximum value of the column) help? You might also want to recenter each column (perhaps by the median of the dataset). That way, each column would be weighted equally when the clustering algorithm calculates the Euclidean distance.
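One possible reading of this scaling suggestion, sketched in plain Python rather than MATLAB: each column is recentered by its own median and then divided by its largest absolute value. The function name `scale_columns` and the choice to center before scaling are assumptions made for the example, not something stated in the comment.

```python
from statistics import median


def scale_columns(rows):
    """Recenter each column by its median, then divide by the column's
    largest absolute centered value so all columns span a similar range."""
    scaled_cols = []
    for col in zip(*rows):  # iterate over columns
        med = median(col)
        centered = [v - med for v in col]
        # Fall back to 1.0 so a constant column does not divide by zero.
        span = max(abs(v) for v in centered) or 1.0
        scaled_cols.append([v / span for v in centered])
    return [list(r) for r in zip(*scaled_cols)]  # back to row layout


designs = [(20, 10, 100, 50), (22, 11, 110, 55), (18, 9, 90, 45)]
for row in scale_columns(designs):
    print(row)
```

After this step every value lies in [-1, 1], so no single variable dominates the distance just because it is measured on a larger scale. The scaled rows could then be handed to a clustering routine such as kmeans.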
Image Analyst
on 31 Jul 2013
To use kmeans you specify k (the number of clusters). How many clusters are you expecting to see in this 4D data set?
Saurav Agarwal
on 2 Aug 2013
Jing
on 28 Aug 2013
Do you know the centroid of each cluster or not? If you do, I'd say KNNSEARCH could be a better fit. Anyway, your idea looks like a new algorithm to me...
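If the centroids are already known, the nearest-neighbour idea can be sketched as follows. This is a plain-Python stand-in and does not call MATLAB's knnsearch; the names `nearest_centroid` and `assign` are hypothetical. It looks up the closest centroid for each row and then applies the ±x% box from the question.

```python
import math


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(point, centroids[i]))


def assign(points, centroids, pct):
    """Label each point with its nearest centroid index, or None when the
    point falls outside the +-pct% box around that centroid."""
    labels = []
    for p in points:
        i = nearest_centroid(p, centroids)
        inside = all(abs(a - b) <= abs(b) * pct / 100.0
                     for a, b in zip(p, centroids[i]))
        labels.append(i if inside else None)
    return labels


centroids = [(20, 10, 100, 50), (40, 20, 200, 100)]
points = [(21, 10.5, 95, 52), (39, 21, 210, 98), (30, 15, 140, 75)]
print(assign(points, centroids, 10))
```

Rows labelled None match no known centroid within the tolerance, so they would need a new centroid or a looser x. Note that the nearest centroid in Euclidean terms is not guaranteed to be the one whose box contains the row, so a stricter version might test every centroid's box instead.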
Answers (0)