Converging on local minima using ga(). Ways to increase odds of finding global minima.

I am trying to determine 25 variables for a process, using MSE as the objective function. My variables are limited by bounds, so ga() defaults to 'mutationadaptfeasible' as the mutation function. With simpler mutation functions, such as uniform mutation, one can vary the degree of mutation to avoid converging on local minima. From my research, this does not seem possible with 'mutationadaptfeasible'; please correct me if that is not the case.
The other way to be less prone to finding local minima is to increase the population size. I am currently using between 100 and 200, but that does not seem to be enough, as the algorithm converges to wildly different solutions with large differences in MSE. The objective function is computationally intensive, as it involves simulations rather than a simple equation. Currently I can generate one solution per CPU core per day using a population size of 100-200 and 50-100 generations. Simply increasing the population size to 1000 or more would not be ideal, as generating one set of process variables would then take a week or more.
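For reference, this is roughly the kind of option setup I mean; the numbers here are placeholders, and as far as I can tell mutationuniform (unlike the default mutationadaptfeasible) does not enforce the bounds on the mutated children:
% Sketch of ga() options with a tunable mutation rate (placeholder values).
% Note: mutationuniform ignores bounds and linear constraints, which is why
% ga switches to mutationadaptfeasible when bounds are present.
opts = optimoptions('ga', ...
    'MutationFcn',    {@mutationuniform, 0.05}, ... % second cell entry is the mutation rate
    'PopulationSize', 200, ...
    'MaxGenerations', 50, ...
    'UseParallel',    true);                        % one objective evaluation per worker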
I should also note that the chosen bounds do not have to be followed strictly. The simulations tend to become unstable and crash or error out if certain variables are 10x larger or smaller than the bounds suggest.
Any ideas on how I should proceed? I am currently looking into the other algorithms offered in the Global Optimization and Optimization Toolboxes. Another possibility is coding a custom mutation function that has a tunable chance of randomly resetting each variable within its bounds, along the lines of the sketch below.
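To make that last idea concrete, here is a rough sketch of the custom mutation function I have in mind. The anonymous function follows the documented custom MutationFcn calling syntax; lb, ub, the 0.05 rate, and the function name are placeholders I am introducing for illustration:
% Hook the custom mutation into ga() by capturing lb, ub and a rate in an
% anonymous function (lb and ub are assumed to be 1-by-25 bound vectors):
opts = optimoptions('ga', 'MutationFcn', ...
    @(parents, options, nvars, FitnessFcn, state, thisScore, thisPopulation) ...
    boundedResetMutation(parents, nvars, thisPopulation, lb, ub, 0.05));

function mutationChildren = boundedResetMutation(parents, nvars, thisPopulation, lb, ub, mutRate)
% "Random reset" mutation that always respects the bounds: each gene of each
% selected parent is replaced, with probability mutRate, by a fresh uniform
% draw from [lb, ub].
mutationChildren = thisPopulation(parents, :);
for k = 1:numel(parents)
    mask  = rand(1, nvars) < mutRate;            % genes to mutate in this child
    fresh = lb + rand(1, nvars) .* (ub - lb);    % uniform values inside the bounds
    mutationChildren(k, mask) = fresh(mask);
end
end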
Lastly, I am interested in finding the global minimum, but a local minimum that is very close to it is more than enough. My problem currently is that the local minima I find are sometimes good and other times horrible.

Answers (1)

John D'Errico on 8 Sep 2023
So big problems take time. Should this be a surprise? More time invested can be a gain, since here it will improve the ability of GA to sample the search space sufficiently and thus to converge. Again, no surprise. And the same applies to the other optimizers in the Global Optimization Toolbox. You might try them out; some may work better on your problem. With no specifics offered by you, there is no way to know.
That GA sometimes yields a good result and sometimes a crappy one is just an indicator that you need to give any variant of global optimizer a larger population and more time to find a solution.
Unfortunately, your constraints seem to be fuzzy things. You might see some serious gains by formulating better sets of constraints, to avoid evaluating the objective at points that would clearly cause failure. I say this because you talk about fuzzy bounds that vary by a factor of 10.
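One cheap pattern along those lines (the helper function names here are hypothetical stand-ins) is to guard the objective so that a clearly doomed point is rejected, and a crashed simulation penalized, without paying the full simulation cost:
function mse = guardedObjective(x)
% Hypothetical sketch. cheapStabilityCheck(x) stands in for any inexpensive
% test that predicts the simulation will blow up at x; runProcessSimulation(x)
% stands in for the expensive simulation-based MSE. 1e12 is an arbitrary
% large penalty so ga discards such points.
if cheapStabilityCheck(x) > 0
    mse = 1e12;                      % skip the expensive simulation entirely
    return
end
try
    mse = runProcessSimulation(x);   % expensive ODE-based simulation
catch
    mse = 1e12;                      % simulation became unstable and errored
end
end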
You might also see serious gains in how you perform the function evaluation in the first place. Again, your question is far too vague to know. But whenever I see someone say they have code that takes a day to run for one function evaluation (on one core), the biggest question in my mind is how well written that code is. My point is, I can show codes I have written over the years to do one computation. As I thought of ways to improve what I had written, I wrote a new version, and each time these versions were often an order of magnitude faster. So you may think your code is written as well as it can be, but I'd bet it is not.
In the end, your question is just too vague. You are already using parallel tools to make the best use of your CPU. Again, my suggestion would be to focus on improving the speed of the function evaluation. If this is a simulation, as you say, then find better ways to do that. Next, I would look at finding ways to improve your constraints, rather than relying on highly fuzzy bounds. And the very last thing I would worry about is the specific optimizer. If you can reduce the search space greatly, then GA, or any optimizer, will work hugely better. If you can reduce the time required for a function evaluation, then you can use larger population sizes, etc. In either case, you will improve your ability to get a good result.
  2 Comments
RaFa on 8 Sep 2023
Edited: RaFa on 11 Sep 2023
Some more info and explanations in response to the points you made (the order is a bit of a mess, sorry!).
Indeed, big problems take time. I have access to a data cluster which I will utilize further down the road in my work. Currently, my main goal is to create a script that can more reliably find good solutions. I will later run jobs in parallel on the data cluster, given that I can "trust" the result. Rerunning the script multiple times and getting a set of candidate solutions to choose from is one approach I have considered.
My problem statement is intentionally vague; I err on the side of caution with specifics, as this is a novel problem and approach. I am mainly looking to expand my knowledge of what MATLAB has to offer, specifically regarding optimization algorithms.
The function evaluation is something I constantly iterate on to improve the algorithm's ability to find relevant candidate solutions. I can say this much: it involves solving multiple stiff ODEs per function evaluation, and I am not sure I can optimize it much further, at least not by several factors. The ODE solver and variable bounds have been chosen so that it doesn't get stuck, and one "solve" takes about 0.5 s on average. An example to confirm what I stated earlier about the time to find a solution:
0.5 s per solve × 15 solves per function evaluation × 200 individuals in the population × 50 generations = 75,000 s ≈ 20.8 hours
The constraints are indeed fuzzy, and for this problem the larger the search space (for certain variables) the better, as the bounds only exist to keep the ODE solver from getting stuck.
Exactly as you mention at the end, the last thing I want to worry about is the optimization algorithm (I otherwise have great experience using ga() and gamultiobj() for smaller problems, which is why I turned to them first). Given that I am not an expert in the field of optimization, I thought I might have missed something obvious, but it looks like I have not, at least regarding GAs.
Are there perhaps more sophisticated optimization algorithms that require fewer function evaluations? GAs might not be suitable for this problem, as each function evaluation is very expensive and GAs tend to require many evaluations to converge. I am currently in the process of testing GlobalSearch. Are patternsearch or particleswarm worth testing for this kind of problem, in your experience (given my limited problem information)?
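For reference, the call patterns I would be testing look roughly like this, assuming fun is the MSE wrapper, x0 is some starting point, and lb/ub are the 1-by-25 bound vectors; the option values are placeholders:
% patternsearch: a direct-search local solver that can poll in parallel.
opts_ps = optimoptions('patternsearch', 'UseParallel', true, 'UseCompletePoll', true);
x_ps    = patternsearch(fun, x0, [], [], [], [], lb, ub, [], opts_ps);

% particleswarm: population-based like ga, also with parallel evaluation.
opts_sw = optimoptions('particleswarm', 'UseParallel', true, 'SwarmSize', 100);
x_sw    = particleswarm(fun, 25, lb, ub, opts_sw);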
Thanks for the comments.
RaFa on 8 Sep 2023
Edited: RaFa on 8 Sep 2023
Oh, and another thing relating to certain variables. This cannot currently be confirmed, but it looks like their behavior is roughly logarithmic, and I am just as interested in having candidates between, let's say, 10^6 and 10^7 as between 10^10 and 10^11.
Therefore, I have adapted the initialization so that the initial population for these specific variables is generated log-uniformly, roughly as in the sketch below. (I also attached an example figure of how the distribution looks for these variables; imagine the x-axis going from about 10^5 to 10^20.)
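Concretely, the initialization I am using looks roughly like this (logIdx, popSize, and the variable count are placeholders):
% Build an initial population that is uniform for ordinary variables and
% log-uniform for the roughly-logarithmic ones, then hand it to ga().
popSize = 200;  nvars = 25;
logIdx  = [3 7 12];                                          % placeholder indices
initPop = lb + rand(popSize, nvars) .* (ub - lb);            % uniform inside the bounds
logLo   = log10(lb(logIdx));  logHi = log10(ub(logIdx));
initPop(:, logIdx) = 10.^(logLo + rand(popSize, numel(logIdx)) .* (logHi - logLo));
opts = optimoptions('ga', 'InitialPopulationMatrix', initPop);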
The question now is: is this approach sound? Or is it better to convert these variables to their log counterparts before optimization, so that a change from, let's say, 10^5 to 10^6 in a specific variable would instead be a change from 5 to 6?
I am mainly interested in whether my current approach is interfering with any part of the GA, such as crossover or mutation, and hindering the algorithm's ability to find good solutions.
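To make the alternative concrete, the log-space version I have in mind would look roughly like this (again, logIdx, objFcn, and the other names are placeholders):
% Optimize the flagged variables in log10 space: uniform initialization,
% crossover and mutation in log space then correspond to log-uniform
% behavior in the original units.
lbT = lb;  ubT = ub;
lbT(logIdx) = log10(lb(logIdx));                 % e.g. 1e5  -> 5
ubT(logIdx) = log10(ub(logIdx));                 % e.g. 1e20 -> 20

objT = @(z) objFcn(backTransform(z, logIdx));    % objective still sees original units
xT   = ga(objT, 25, [], [], [], [], lbT, ubT, [], opts);
xOpt = backTransform(xT, logIdx);                % best point in original units

function x = backTransform(z, logIdx)
% Undo the log10 transform for the flagged variables only.
x = z;
x(logIdx) = 10.^z(logIdx);
end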
