Optimize clustering with specific number of elements per cluster

2 views (last 30 days)
David Coventry
David Coventry on 7 Jun 2016
Edited: Image Analyst on 7 Jun 2016
Okay, so I've been at this problem for a couple weeks now and I seem to have hit a snag.
The idea is this to take a group of 250 x and y coordinates and cluster them into different groups, but each group can only have a maximum of 15 points with a tolerance of +- 2.
I've tried using a few variations of Kmeans, but unfortunately it doesn't quite do the trick since kmeans can't take into account a limit to points in a group.
here are the initial points (axes removes for security of information):
and here is an attempt where I used multiple iterations of kmeans to divide the data:
You can see that it is pretty close, but there are some groups that have too few points.
Attached you'll find 2 files. The Optimization Test file is the one where i generated those figures using multiple iterations of kmeans, and the kmeansModified is a piece of code I wrote (sorry for all of the comments, its a work in progress) which tries to solve the number of elements per group problem, but I don't think I did a good job at it:
Anyone have any ideas on this? I understand that this is a pretty complex problem, and from what I have researched, there isn't a good way of trying to solve this problem, but I'd still like to see what your opinions are!
thanks,
David
  1 Comment
Image Analyst
Image Analyst on 7 Jun 2016
No files are attached. Can you supply more background (the "use case") on why this is needed. It's an interesting problem and is very tough. I'm no expert in optimization so the only thing I could think of was to iteratively use pdist2() to find the group of 15 points that had the lowest average distance between then and classify them as one group. Then remove those points and repeat with the remaining points. I think it might work good at the start but when you start to get to where there are few remaining points, those points might be very widely separated. So the points in the early clusters would be close together but in later clusters, they'd be far apart.

Sign in to comment.

Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!