I met this problem when I was using fmincon(). I try to optimize a function handle at starting point 8, but my function handle has structure "log(exp(-100*x))", this leads to -inf rather than -100*x. How can I overcome this problem?

In my case, I got "Error using barrier (line 22) Objective function is undefined at initial point. Fmincon cannot continue." It seems this problem can simply happen when, say, you do some operations and combinations and get a structure like log(exp(-100*x)). It seems how can I deal with extreme large number?

 Accepted Answer

Face it, your users might be... mistaken... about the limits of computer mathematics, and might choose to include things like power(x^1000,1/1000), which algebraically is just abs(x) for real numbers but in finite precision computing is going to overflow.
Therefore for any user provided function which you have not stripped apart bit by bit and proven the validity of, you should turn on FunValCheck to catch that the user specified something that was unworkable in practice. And you try/catch and you make the user submit something valid instead. Which might require that they think and clean up log(exp(x)) themselves.
If you have the symbolic toolbox, you could also use simplify() on the given expression before you use it, which will catch at least some of the inefficiencies.

3 Comments

I guess I probably don't need to "face it". Never thought that symbolic math TB can also be used in this way. Your suggestion of simplify() works well. Just one more trivial question now I wonder if you can help,
Now I have a function handle something like this, f=@(p1,p2,p3,p4)p1+p2-p3+p4 (this is just an artificial example). Do you have any simple way to translate it to f=@(p)p(1)+p(2)-p(3)+p(4) except using string connection technique along with eval()? Please note that I can't hard code in this way f=f(p)f(p(1),p(2),p(3),p(4)). Although this works, in my case the dimension of p is variant and I can only use the number of dimensions of p, say a double variable n.
If you are using matlabFunction from the Symbolic Toolbox, then look at the 'vars' option, as it allows you to package multiple named variable into a vector argument.

Sign in to comment.

More Answers (2)

Well, since log(exp(-100*x)) is (-100*x), why not just go with that?

4 Comments

I met this problem when I was writing a maximum likelihood estimation program. One step is to add up all the log terms. Also, this is a parametric function with x and parameter to be estimated in fmincon. So I need to use a for loop to add them up. Depending on the complexity of distribution function, sometimes one just gets structure like log(exp()), for instance, if gamma distribution is the one you try to estimate. In short, I guess this can happen when you do some calculations and input some values for multivariate function. In my case, I use, say
f='(x(1)^(p(1)-1)*exp(-x(1)/p(2)))/(p(2)^p(1)*gamma(p(1)))';
for i=1:n
eval(['L=@(p)L(p)+feval(@(x)log(',f,'),sample(i,:));']);
end
where sample(i,:) is just a scaler
If ‘x’ is positive (and large enough), exp(-100*x) will go to zero, and the log of zero is -Inf. That will give the optimisation routines numerical indigestion.
Yes, I know this. So any ideas to overcome this problem?
This code:
f='(x(1)^(p(1)-1)*exp(-x(1)/p(2)))/(p(2)^p(1)*gamma(p(1)))';
for i=1:n
eval(['L=@(p)L(p)+feval(@(x)log(',f,'),sample(i,:));']);
end
is really painful to look at! I can’t figure out what you’re doing in it (other than recursively calling your ‘L’ function, that is never a good idea), so I can’t suggest a better way to code it, other than to begin by creating an anonymous function for ‘f’:
f = @(p,x) (x(1).^(p(1)-1).*exp(-x(1)/p(2)))./(p(2)^p(1)*gamma(p(1)));
and then just calling it rather than using eval and feval.
Start by describing what you’re doing there. Then I might be able to suggest an alternative.

Sign in to comment.

First of all, you should almost NEVER optimize a likelihood function.
Always optimize the log of the likelihood. It will be better behaved. If you find an optimum of one, then you find the optimum of the other.
Your problem now goes away. So what is the problem?
Oh, gosh, it looks like you are taking the log of individual terms. However, I am fairly confident that the log of this expression can be written in simple form:
(x(1)^(p(1)-1)*exp(-x(1)/p(2)))/(p(2)^p(1)*gamma(p(1)))
Start here:
help gammaln
So as it turns out, you are taking the log likelihood, but then not making use of high school algebra. For god's sake, that is why you take the log!
Hint:
log(x(1))*(p(1)-1) - x(1)/p(2) - log(p(2))*p(1) - gammaln(p(1))
So really, where is the problem? You "overcome the problem" by not creating the problem in the first place! This is why you work with the log-likelihood function instead!
I won't even start on the unreadable code fragment that you do show in a comment. (I'm sorry, but the code frag you show looks like MATLAB gone wild. An obscene way to write what should be a simple expression. Whoever gave you that code fragment should be shot, and no reason to wait until dawn.)
You use eval and feval for some completely irrational and obscure reason. This will be incredible inefficient, impossible to read, buggy, & impossible to debug. Also it appears as if you are passing in an entire column (sample(i,:)) into the function, but then only using sample(i,1). Perhaps sample is only a vector, and you don't understand how indexing works. (From reading one of your comments, it appears that sample is indeed a vector.)
Learn to use function handles (properly). The one function handle that you create is inside an eval statement, where you use L twice in a seemingly recursive fashion. Shiver. I tried to read what you had done in that code fragment, but it makes my head hurt.
As a wild guess, this MIGHT do what that code fragment does, as a function handle with p as an argument, using the workspace variable sample indirectly.
fun = @(p ) sum(log(sample)*(p(1)-1) - sample/p(2) - log(p(2))*p(1) - gammaln(p(1)));
I would check it however, if I were you, since I had a terribly hard time reading your fragment.

7 Comments

Thanks John, just to be clear, I used log in my eval() line. In terms of gammaln(), I probably can't employ this function since my code is not hardly coded. I wrote a function that is able to identify variables x(1), x(2)... and parameters to be estimated p(1), p(2)... in a string function. That is, this function takes a string as input such as f='(x(1)^(p(1)-1)*exp(-x(1)/p(2)))/(p(2)^p(1)*gamma(p(1)))' or maybe a normal distribution like f='(1/(2*pi*sqrt(p(3)*p(4))))*exp(-(((x(1)-p(1))^2/p(3)+(x(2)-p(2))^2/p(4)))/2)'. Because I found very difficult to plug in x into f before using fmincon() to find minimum p(1), p(2), etc., I used both eval() and feval(). Note that this difficulty comes when dealing with multidimensional x and p (x is variable/sample, p is parameter vector). Please let me know if you have other ways to do this.
Let me explain further, I wrote a function called mymle, one input of mymle is a string such as f='(x(1)^(p(1)-1)*exp(-x(1)/p(2)))/(p(2)^p(1)*gamma(p(1)))'. Then, mymle is able to translate f from string to a function handle, and hence able to plug in x(1), x(2) from a sample set. Now, the likelihood function would only have p as unknown variables. In other words, x is plugged in and now we have something like @(p)(function expression). We then put this function handle into fmincon() to find maximum (p(1), p(2)). Depends on one's distribution function in string variable f, one can get whatever stuff after plug in x.
So, essentially, you have a problem because of the way you get your function. In that case, you will never be able to solve it.
You cannot handle the underflows (and potential overflows) since you cannot control your function. Live with it, since as long as you are forced to use this interface, it will always be a problem.
Sorry, but this is a problem imposed by the way you are getting the function. There is no magical solution if you cannot resolve that.
The solution that I do see is to require the user to supply the log of the expression, NOT the likelihood itself. I.e., if they can type in this:
'(x(1)^(p(1)-1)*exp(-x(1)/p(2)))/(p(2)^p(1)*gamma(p(1)))'
then surely they could have typed in the log of that expression instead. Yes, it may require education. Your users can learn to type gammaln instead of gamma.
If you do not do it that way, then you are trapped. Expect uncontrolled overflows and underflows.
Do you know how FunValCheck work? I checked in document that this option able to check for inf. Is it possible to be a solution?
I don't see that as a solution at all here.
Even if you identify that an underflow/overflow has just occurred, all you can then do is effectively toss out that point. Yet, that point will surely be of value in the MLE.
Maybe you're right. BTW, I just uploaded my code, hoping you understand well.
FunValCheck triggers an error(). It is basically a sanity check to allow you to exit early from a calculation that is already fatally tainted. Your linear and nonlinear constraints should already be blocking out all the places that your objective is invalid, and FunValCheck is to deal with the places you overlooked in your analysis. FunValCheck is not an additional constraint layer to allow you to avoid a particular point: it is a check to say "This calculation is FUBAR, just stop already".

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!