Is 'bayesopt' such a poor optimizer or am I doing something wrong?

I am trying to fit Gaussian data points with exactly the same model that produced them. The model has just 4 integer parameters, yet 'bayesopt' is not able to find them even when I choose InitialX very close to the parameters that generated the data.
I am lost. Thank you. :-)
clc
N=300;
% - let's create 300 points with gaussian shape
x = linspace(0,10,N)';
b = [1;3;2;2]; % model parameters
mf = @modelfun; % handle to the model function
y = mf(b,x);
% - let's create search parameters
num(1) = optimizableVariable('base', [0,9],'Type','integer');
num(2) = optimizableVariable('height', [0,9],'Type','integer');
num(3) = optimizableVariable('location',[0,9],'Type','integer');
num(4) = optimizableVariable('width', [0,9],'Type','integer');
% - define loss function
bf=@(num)bayesfun(num,x,y);
% - call bayesopt with initial parameters close to model values
results = bayesopt(bf,num,...
'InitialX',table(2,2,1,3,'VariableNames',{'base','height','location','width'}));
b0=table2array(results.bestPoint);
results.bestPoint
% let's see plot of model data and fit data
%------------------------
plot(x,y,'.b')
hold on
plot(x,mf(b0,x),'.r')
hold off
legend('y','fit')
function out=bayesfun(t,x,y)
y1=modelfun(table2array(t),x);
out=sum((y1-y).^2);
end
function out=modelfun(b,x)
out = b(1);
out = out + b(2)*exp(-((b(3)-x)/b(4)).^2);
end

 Accepted Answer

I think it's actually a hard problem to find the global minimum of this function. I wonder if you have tried any other optimizers to see how they compare. I tried random search and grid search (sampling without replacement on a grid) for 100 iterations, and bayesopt did a lot better than either.
There are a few things you can do to improve bayesopt in this case.
(1) Tell bayesopt that the function is deterministic by passing 'IsObjectiveDeterministic',true.
(2) Run it for more evaluations, e.g., 'MaxObjectiveEvaluations',100.
(3) Compress your objective function so it doesn't span many orders of magnitude by passing it through log1p. See modified code below.
When I did these 3 things, it did somewhat better and occasionally found the 0 point.
I suspect that the 0 point is in a hole without enough of a clue around it for the GP model to latch on to. I'm curious to hear how other optimizers perform on this problem.
clc
N=300;
% - let's create 300 points with gaussian shape
x = linspace(0,10,N)';
b = [1;3;2;2]; % model parameters
mf = @modelfun; % handle to the model function
y = mf(b,x);
% - let's create search parameters
num(1) = optimizableVariable('base', [0,9],'Type','integer');
num(2) = optimizableVariable('height', [0,9],'Type','integer');
num(3) = optimizableVariable('location',[0,9],'Type','integer');
num(4) = optimizableVariable('width', [0,9],'Type','integer');
% - define loss function
bf=@(num)bayesfun(num,x,y);
% - call bayesopt with initial parameters close to model values
results = bayesopt(bf,num,'MaxObjectiveEvaluations',100,'IsObjectiveDeterministic',true);
b0=table2array(results.bestPoint);
results.bestPoint
original_objective = expm1(bayesfun(results.bestPoint,x,y)) % undo the log1p compression
% let's see plot of model data and fit data
%------------------------
plot(x,y,'.b')
hold on
plot(x,mf(b0,x),'.r')
hold off
legend('y','fit')
function out=bayesfun(t,x,y)
y1=modelfun(table2array(t),x);
out=sum((y1-y).^2);
out = log1p(out);
end
function out=modelfun(b,x)
out = b(1);
out = out + b(2)*exp(-((b(3)-x)/b(4)).^2);
end
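As a quick, hypothetical illustration of why the log1p compression in step (3) can matter for the GP surrogate (the random-sampling scheme below is my own, not from the thread): the raw SSE over random integer parameter settings spans several orders of magnitude, while log1p squeezes it into a narrow range that a GP kernel can model more easily.

```matlab
% Illustrative sketch: compare the dynamic range of the raw SSE objective
% with its log1p-compressed version over random integer parameter settings.
x = linspace(0,10,300)';
y = 1 + 3*exp(-((2 - x)/2).^2);              % data generated by b = [1;3;2;2]
sse = @(b) sum((b(1) + b(2)*exp(-((b(3)-x)/b(4)).^2) - y).^2);

rng(0);                                      % reproducible sample
B = randi([0 9], 1000, 4);
B(:,4) = max(B(:,4), 1);                     % avoid width 0 (division by zero)
vals = arrayfun(@(i) sse(B(i,:)), (1:1000)');

fprintf('raw SSE range:    [%g, %g]\n', min(vals), max(vals));
fprintf('log1p(SSE) range: [%g, %g]\n', min(log1p(vals)), max(log1p(vals)));
```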

3 Comments

Thank you for your prompt response.
I was testing your modification in the meantime. It is true that increasing the number of iterations helps find a better solution, but bayesopt is so slow that I prefer to keep the default. My intention was to test the performance of the function anyway.
The IsObjectiveDeterministic option and the log-scaled loss function did not produce a noticeable change, at least not for me. I would expect some effect from IsObjectiveDeterministic, but I am not sure why log-scaling would help. Actually, if the area around the 0 point is flat, then log-scaling will flatten it even further.
I have not tried other methods yet because I was sure I was doing something fundamentally wrong with bayesopt. :-) But if the problem is not in how I use the function, then I will definitely look around for alternatives.
Once again, thank you very much.
On the topic of bayesopt being slow and inaccurate, there is a nice table here describing the characteristics of problems that bayesopt is best suited for.
It's best for expensive objective functions, and when you don't need to find the exact best solution. It's slow (high overhead per iteration) because it fits a GP model on every iteration, so it's only going to be worth it if the objective function is even slower. And because it's trying to find the global minimum, it doesn't spend too much time homing in on particular points to get high accuracy. It has a tendency to explore more than local optimizers do.
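As a baseline for comparing optimizers on this particular problem (a sketch of my own, not from the thread): since all four parameters are integers in [0,9], the whole search space has only 10^4 points, so when the objective is this cheap an exhaustive sweep finds the exact global minimum almost instantly.

```matlab
% Brute-force baseline: enumerate every integer parameter combination.
x = linspace(0,10,300)';
y = 1 + 3*exp(-((2 - x)/2).^2);              % data generated by b = [1;3;2;2]
sse = @(b) sum((b(1) + b(2)*exp(-((b(3)-x)/b(4)).^2) - y).^2);

best = inf; bbest = [];
for base = 0:9
  for height = 0:9
    for location = 0:9
      for width = 1:9                        % skip width 0 (division by zero)
        v = sse([base height location width]);
        if v < best, best = v; bbest = [base height location width]; end
      end
    end
  end
end
bbest   % recovers [1 3 2 2], the generating parameters, with best == 0
```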
Perfect comments!
They also explain why I was quite happy with bayesopt in the past, when I was optimizing the hyperparameters of a larger stacked autoencoder. :-)
Thank you. You really helped to fill my knowledge gap about the bayesopt usage.
