How can I cluster data points according to the local minima they belong to?

Question

Sepp on 14 Jun 2016

https://uk.mathworks.com/matlabcentral/answers/289793-how-can-i-cluster-data-points-according-to-the-local-minima-they-belong-to

Edited: Matt J on 15 Jun 2016

Hello

I'm using the genetic algorithm for hyperparameter optimisation. My loss function is the cross-validated loss, which means I can evaluate it but I don't know what it looks like (its shape). Of course, my loss function has several local minima.

Let's say I'm using a population size of 20, i.e. I have 20 data points in each population. Let's further assume that I have run the genetic algorithm for a certain amount of generations, so that I have my final 20 data points.

Of course, the 20 data points may lie around different local minima.

Now, I would like to apply some sort of clustering to group the data points according to the local minimum they seem to belong to, and I would like to identify the most promising cluster (the one that seems to contain the global minimum). I'm assuming that a data point belongs to a local minimum if it is in its neighbourhood.

Does somebody have an idea how this could be done?

Answer 1

Matt J on 14 Jun 2016

https://uk.mathworks.com/matlabcentral/answers/289793-how-can-i-cluster-data-points-according-to-the-local-minima-they-belong-to#answer_225457

Edited: Matt J on 14 Jun 2016

I'm convinced there is no foolproof way to do it, but the following heuristic approach might work okay. The genetic algorithm is supposed to return the globally optimal solution x. For any other member y of the final population, finely sample the objective function along the line segment [x,y] and check whether it is convex, or maybe just pseudo-convex, along that segment. A numeric test for convexity is to check that the discrete derivatives given by diff() are monotonically non-decreasing. If the profile is convex, you assume (without certainty) that x and y are part of the same capture basin and cluster them together.

If the dimension of the problem is low enough, another approach would be to sample the objective function values on an N-dimensional grid containing the final population so that you have an N-dimensional image of these values. You could then use watershed() to find all the basins using image processing methods.
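A minimal sketch of the grid idea in 2-D, assuming the Image Processing Toolbox is available. The loss f, the grid limits and the population P are made up for illustration; in practice every grid node costs one evaluation of the real loss.

```matlab
% Hypothetical 2-D loss with two basins, one near x1 = -1 and one near x1 = 1.
f = @(x1, x2) (x1.^2 - 1).^2 + x2.^2 + 0.3*x1;

% Sample the loss on a grid that covers the final population.
g1 = linspace(-2, 2, 201);          % grid along x1 (columns)
g2 = linspace(-2, 2, 201);          % grid along x2 (rows)
[X1, X2] = meshgrid(g1, g2);
F = f(X1, X2);

% watershed() labels each catchment basin with a positive integer;
% pixels on the ridge lines between basins get the label 0.
L = watershed(F);

% Hypothetical final population, one point per row.
P = [-1.1  0.1;
     -0.8 -0.2;
      0.9  0.3;
      1.2 -0.1];

% Map each point to its nearest grid node and read off the basin label.
col = interp1(g1, 1:numel(g1), P(:,1), 'nearest');
row = interp1(g2, 1:numel(g2), P(:,2), 'nearest');
labels = double(L(sub2ind(size(L), row, col)));
```

Points that share a value in labels fall in the same basin of the sampled image. A point that lands on a ridge pixel gets label 0 and would have to be assigned by hand, for example to the basin of a neighbouring pixel.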

4 Comments

Sepp on 14 Jun 2016

Thank you so much for your answer, Matt. This seems to be a very interesting approach.

I'm running the genetic algorithm for only 5 generations with a population size of 20 (so that I only spend 100 function evaluations) to get an initial estimate. Afterwards, I want to run Nelder-Mead on the most promising cluster. So I think the genetic algorithm will not provide the globally optimal solution in my case.

Let's assume I have the following five points in my population [x,y,z,a,b]. Let x be the point with the smallest loss function value.

With your test I'm only getting two clusters, i.e. a cluster of points which seem to belong to the same local minimum as x and a cluster of points that do not. But how can I create more clusters independent of x? For example, one cluster could contain x,y, another z, and a third a,b.

Second, I have read the documentation of diff() but I'm a bit unsure about how to use it. Could you give a small example for two data points x and y if both have dimension n?

Last but not least, let's say I find 3 clusters. I don't know which cluster contains the global minimum. Is there a method to infer which cluster tends to contain the global minimum without a lot of function evaluations?

Matt J on 14 Jun 2016

With your test I'm only getting two clusters, i.e. a cluster of points which seem to belong to the same local minimum as x and a cluster of points that do not. But how can I create more clusters independent of x?

Let A and B be the 2 clusters and let's say A contains x,y and B contains (z,a,b). Further, suppose that z is the "best" member of set B, meaning it has the best loss function value. You can now sub-cluster B into 2 clusters in a similar way by looking at N-dimensional line segments [z,a] and [z,b] and seeing if they are connected by convex profiles.
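The repeated splitting can be written as a loop: take the best unlabelled point as the seed of a new cluster, attach every unlabelled point whose profile to the seed looks convex, then repeat on what is left. The sketch below is only an illustration; the loss f, the population P, the number of samples nT and the tolerance tol are all made-up assumptions.

```matlab
% Hypothetical loss on points stored one per row; minima at (+-1, +-1).
f = @(P) sum((P.^2 - 1).^2, 2);

% Hypothetical final population.
P = [ 1.0  1.1;
      0.9  0.8;
     -1.0  1.0;
     -0.9  1.2;
      1.2 -1.0];

nT  = 51;                         % samples per line segment (assumed)
tol = 1e-12;                      % slack for rounding error (assumed)
t   = linspace(0, 1, nT)';

% Convexity test along the segment [a,b]: second differences of the
% sampled profile must be non-negative.
sameBasin = @(a, b) all(diff(f(repmat(a, nT, 1) + t*(b - a)), 2) >= -tol);

[~, order] = sort(f(P));          % best (lowest loss) first
labels = zeros(size(P, 1), 1);    % 0 means "not yet assigned"
k = 0;
for i = order'
    if labels(i) ~= 0
        continue                  % already attached to an earlier seed
    end
    k = k + 1;
    labels(i) = k;                % best remaining point seeds cluster k
    for j = order'
        if labels(j) == 0 && sameBasin(P(i,:), P(j,:))
            labels(j) = k;
        end
    end
end
```

By construction, cluster 1 contains the best point in the population. Each test costs nT loss evaluations, so the total grows quickly with the population size.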

Second, I have read the documentation of diff() but I'm a bit unsure about how to use it.

It just takes the differences between adjacent samples of a vector, e.g.,

    >> diff([1,4,5,7])
    ans =
         3     1     2

You can use it to calculate finite difference derivatives of the loss function along your line segments.

Last but not least, let's say I find 3 clusters. I don't know which cluster contains the global minimum. Is there a method to infer which cluster tends to contain the global minimum

You know the loss function values of all points in the clusters, presumably. It seems reasonable to suppose that the cluster containing the best point is the basin of the global minimum.

Sepp on 14 Jun 2016

Edited: Sepp on 14 Jun 2016

With your test I'm only getting two clusters, i.e. a cluster of points which seem to belong to the same local minimum as x and a cluster of points that do not. But how can I create more clusters independent of x?

So you are proposing some sort of hierarchical clustering? That means first creating two clusters A and B, then dividing B into B and C, then C into C and D etc.

You can use it to calculate finite difference derivatives of the loss function along your line segments.

How can I do that? Let's say I have the two data points x = (x1,x2,x3) and y = (y1,y2,y3), i.e. each is 3D.

I think I have misunderstood your approach. I need to sample points between x and y, right? I thought that I could just use the points x and y without sampling new points. The problem with sampling new points is that it greatly increases the number of function evaluations, which I don't want. x and y are both 3D, so how can I sample points in between? It is not just a line (1D).

You know the loss function values of all points in the clusters, presumably. It seems reasonable to suppose that the cluster containing the best point is the basin of the global minimum.

Why? It could be that another cluster contains the global minimum but was just badly sampled. I only run a few generations of the genetic algorithm, so it is possible that the global minimum is in a completely different place than the current best point.

Matt J on 15 Jun 2016

Edited: Matt J on 15 Jun 2016

So you are proposing some sort of hierarchical clustering? That means first creating two clusters A and B, then dividing B into B and C, then C into C and D etc.

Yes.

I think I have misunderstood your approach. I need to sample points between x and y, right? I thought that I could just use the points x and y without sampling new points.

The approach I was proposing was, for x and y and your loss function f(), construct the 1D profile function

g(t) = f(x+t*(y-x))

and sample g(t) at equi-spaced 0<=t<=1. Then look to see if g(t) is convex. If g(t) is convex and sampled into vector

G = [g(0), g(t1), ..., g(tN), g(1)]

then diff(G) should be monotonically non-decreasing.
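For two 3-D points this might look like the sketch below. The loss f, the points x and y, the number of samples nT and the tolerance tol are placeholders chosen for illustration, not values from this thread.

```matlab
% Hypothetical loss on 3-D points stored one per row; it stands in for
% the expensive cross-validated loss.
f = @(P) sum((P.^2 - 1).^2, 2);

x = [1.0 1.1 0.9];
y = [0.8 0.9 1.2];

nT  = 21;                               % number of samples, end points included
t   = linspace(0, 1, nT)';              % equi-spaced 0 <= t <= 1
pts = repmat(x, nT, 1) + t*(y - x);     % row k is x + t(k)*(y - x)
G   = f(pts);                           % sampled profile g(t)

dG  = diff(G);                          % finite-difference slopes of g
tol = 1e-12;                            % slack for rounding error (assumed)
isConvex = all(diff(dG) >= -tol);       % slopes never decrease => convex
```

The points in between are still 3-D: only the parameter t is one-dimensional. The end points G(1) and G(end) are the loss values at x and y, which the population already provides, so the test costs nT-2 new evaluations per segment.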

This approach was never going to work well if you're averse to sampling your loss function, and I don't think you have much hope of delineating the capture basins unless you're prepared to do a fair amount of exploratory sampling of the graph of f().

Why? It could be possible that another cluster contains the global minimum but was just badly sampled.

But it's the best estimate of the global min. based on the population that you have. Obviously, the success probability of the test increases with the number of ga() iterations that you do. Any approach you consider will decline in success probability the more thrifty you are with iterations. Suppose in the extreme that you do no iterations at all. Then ga() will have given you no information.

One way that you can refine your estimate of the global min., however, is to look at the global min. of the g(t) profiles while you're calculating them. If you hit a g(t) lower than any point you've seen, you can dynamically revise your estimate of the global min. and its capture basin. For that matter, you can revise your population...
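As a rough illustration of that refinement, the sampled profile can be checked for a value below the current best while it is being computed. The loss f and the points below are made up; x plays the role of the current best population member.

```matlab
% Hypothetical loss on points stored one per row.
f = @(P) sum((P.^2 - 1).^2, 2);

x  = [0.7 0.7];                         % current best point (assumed)
fx = f(x);                              % its loss value
y  = [1.4 1.2];                         % another population member (assumed)

t   = linspace(0, 1, 21)';
pts = repmat(x, numel(t), 1) + t*(y - x);
G   = f(pts);                           % profile g(t) along [x,y]

% If some sample on the segment beats the current best, adopt it as the
% new estimate of the global minimum.
[gMin, kMin] = min(G);
improved = gMin < fx;
if improved
    x  = pts(kMin, :);
    fx = gMin;
end
```

After an update, the segments that were tested against the old best point would need to be re-examined relative to the new one.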

Answer 2

Alan Weiss on 14 Jun 2016

https://uk.mathworks.com/matlabcentral/answers/289793-how-can-i-cluster-data-points-according-to-the-local-minima-they-belong-to#answer_225467

I wrote a guest post on Loren Shure's blog a while back on clumping solutions found by MultiStart. You could use the same idea here. Basically, if you have a list of N-dimensional points, ordered from best to worst according to their associated fitness function values, then use a variant of the clumpthem function to generate clumps according to the granularity that you want in function value and space.
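The sketch below is not the clumpthem function from the blog post. It is only a guess at the general idea under stated assumptions: points are visited from best to worst, and a point joins the first clump whose representative is within tolX in space and tolF in function value, otherwise it starts a new clump. The loss f, the population P and both tolerances are made up.

```matlab
% Hypothetical loss and population (one point per row).
f = @(P) sum((P.^2 - 1).^2, 2);
P = [ 1.0  1.1;
      0.9  0.8;
     -1.0  1.0;
     -0.9  1.2;
      1.2 -1.0];

tolX = 0.5;                        % spatial granularity (assumed)
tolF = 0.3;                        % function-value granularity (assumed)

[vals, order] = sort(f(P));        % order the points from best to worst
P = P(order, :);

clump = zeros(size(P, 1), 1);      % clump index of each (sorted) point
reps  = [];                        % row indices of clump representatives
for i = 1:size(P, 1)
    for r = reps
        closeInX = norm(P(i,:) - P(r,:)) <= tolX;
        closeInF = abs(vals(i) - vals(r)) <= tolF;
        if closeInX && closeInF
            clump(i) = clump(r);   % join the first matching clump
            break
        end
    end
    if clump(i) == 0               % no match: start a new clump
        reps(end+1) = i;
        clump(i) = numel(reps);
    end
end
```

Unlike the line-segment test, this uses only the loss values already computed for the population, so it needs no extra function evaluations. The price is that the result depends entirely on the chosen tolerances.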
