fmincon non-linear optimisation: issues with sqp and interior-point

I am running an optimisation problem using fmincon.
The problem is subject to very few bound constraints, linear inequality constraints and non-linear inequality constraints.
There are also a lot of non-linear equality constraints. These are the dynamics of the problem, imposed as defect constraints using the direct transcription formulation.
I am trying to find the optimal control using multiple shooting and normalised NLP variables.
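For context, a minimal sketch of how such a problem might be passed to fmincon (all names here, such as objective, dynamicsDefects and z0, are hypothetical and only illustrate the structure, not the actual problem):

```matlab
% Hypothetical setup sketch: z is the vector of normalised NLP variables
% (node states and controls), and the dynamics defects enter through the
% nonlinear constraint function dynamicsDefects.
opts = optimoptions('fmincon', ...
    'Algorithm', 'sqp', ...            % or 'interior-point'
    'Display', 'iter');

% A, b   : the few linear inequality constraints, A*z <= b
% lb, ub : simple bounds on the normalised variables
[zOpt, fOpt, exitflag] = fmincon(@objective, z0, A, b, [], [], ...
                                 lb, ub, @dynamicsDefects, opts);
```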
I was running some tests as a trade-off study between SQP and interior-point and I noticed some issues with both methods.
1) SQP:
As the documentation highlights, a change to the NLP vector is applied if the computed shift improves the objective function.
My issue with this is that SQP tends to improve the objective function "too much", hence converging towards an infeasible solution.
Basically, it lacks the tools to take a step back when it overshoots, and frequently gets stuck in some resonance region where the solution cannot improve in terms of feasibility.
This can usually be compensated by iteratively bounding the NLP variables towards the optimal solution, but this requires many runs.
The main advantage of this approach is that the convergence, when it works, is very fast.
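The iterative-bounding workaround could be sketched roughly as follows (all names are hypothetical; the radii and tolerances are only placeholders):

```matlab
% Each pass shrinks the box around the last iterate, keeping SQP from
% straying too far from a region that was already nearly feasible.
z = z0;
boxRadius = 0.5;                         % in normalised units
for pass = 1:10
    lbPass = max(lb, z - boxRadius);
    ubPass = min(ub, z + boxRadius);
    [z, ~, ~, output] = fmincon(@objective, z, A, b, [], [], ...
                                lbPass, ubPass, @dynamicsDefects, opts);
    if output.constrviolation < 1e-6     % feasible enough: stop
        break
    end
    boxRadius = boxRadius / 2;           % tighten the box each run
end
```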
2) Interior-point:
This case is much trickier as the interior-point appears to be a more complex algorithm.
From my tests it appears to have a larger convergence rate than SQP but requires x10 iterations.
The main issue I'm having with interior-point is that, if I start from a solution very close to the optimal one, it just ignores it and starts all over.
Note that in the following example, the objective function is bounded to a minimum of -1.
                                              First-order    Norm of
 Iter  F-count       f(x)      Feasibility    optimality      step
    0      443  -9.920000e-01    1.890e-06     7.998e-03                 <-- near-optimal initial guess
    1      890  -9.816996e-01    1.360e+00     1.830e-02    1.999e+00
    2     1334  -9.816996e-01    1.360e+00     1.830e-02    2.146e-12
    3     1781  -9.214377e-01    1.097e+00     2.277e-01    3.600e+00
    4     2225  -9.214377e-01    1.097e+00     1.780e-01    1.720e-10
    5     2670  -9.205512e-01    1.251e+00     4.427e-01    2.729e-01
    6     3113  -9.236640e-01    1.029e+00     2.175e-01    1.249e+00
  ...
I understand it is a bit silly to ask for help while sharing so little information, but maybe someone is already familiar with these issues and knows some workaround.

2 Comments

From my tests it appears to have a larger convergence rate than SQP but requires x10 iterations.
Was this a typo? If it takes 10x the iterations, that makes it sound slower than sqp. In what way then does it have a "larger convergence rate"?
Interior-point is slower than SQP but, out of 100 trials, it would converge more often than SQP.


 Accepted Answer

Matt J on 5 Jul 2023
Edited: Matt J on 5 Jul 2023
SQP...Basically, it lacks the tools to make a step back when overshoots
On the contrary, it is one of the few algorithms that can backtrack, but you have to set the objective and/or constraints to NaN or Inf in those regions, see Robustness to Non-Double Results.
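A minimal sketch of that mechanism, assuming a hypothetical predicate inBadRegion that flags the regions the iterations should back away from (both function names are illustrative):

```matlab
% When the objective evaluates to Inf, SQP treats the step as failed,
% backtracks, and retries with a shorter step length.
function f = objectiveGuarded(z)
    if inBadRegion(z)
        f = Inf;          % force SQP to backtrack out of this region
    else
        f = objective(z); % normal evaluation in the safe region
    end
end
```

The same guard can be applied inside the nonlinear constraint function.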
Interior-Point...From my tests it appears to have a larger convergence rate than SQP but requires x10 iterations.
Are you supplying an analytic Hessian computation? If not, that is probably why it is slow.

9 Comments

What you mention about SQP is not much related to my problem.
The convergence works fine but the objective function is prioritised over the constraints, thus converging to an infeasible point.
For interior-point I am fine with the computational time. Analytic Hessians are basically impossible to provide with direct transcription.
My issue is only about the immediate divergence from the initial guess.
What you mention about SQP is not much related to my problem.
It is related. If you set the Objective/Constraints to Inf in the regions where you don't want it to go, i.e., where the iterations are prone to get stuck, then the back-tracking mechanism of SQP will try to avoid them.
My issue is only about the immediate divergence from the initial guess.
That happens because your initial guess is not an interior point. Therefore, the interior-point method cannot use it. If you have a way of choosing an initial guess that is an interior point, that should resolve that aspect of the problem. But, if you cannot compute the Hessian analytically, it may not be wise to pursue interior-point. Inaccurate finite difference approximations of the Hessian could be one reason why you find that it takes many more iterations to converge than SQP.
Your second point is very interesting. I guess there is no way to verify a priori whether the initial guess is an interior point.
For SQP, on the other hand, it sounds easy on paper but is impossible in practice.
The unfeasible region is dictated by highly nonlinear dynamics constraints that are impossible to bound.
Consider for example a simple single shooting: I impose the dynamics as end-propagation constraints and I want to maximise the final outcome.
SQP will overshoot it to maximise the objective function (minimise the negative), and won't be able to converge because it cannot reduce it any more. Although your solution would be suitable, it is impossible to bound some non-linear constraints a priori.
The best approach I have found is to limit iteratively the objective function to avoid SQP overshooting.
It's a trial and error approach where I converge to the optimal solution with the algorithm.
Your second point is very interesting. I guess there is no way to verify a priori whether the initial guess is an interior point.
Certainly there is. Evaluate your inequality constraints at x0, and see if they are satisfied strictly.
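For instance, assuming nonlcon, A, b, lb and ub are the problem's constraint data (names illustrative), a strict-interior check could look like:

```matlab
% Evaluate all inequality constraints at x0 and require strict
% satisfaction, with a small margin to rule out boundary points.
margin = 1e-8;
[c0, ~] = nonlcon(x0);                          % nonlinear inequalities c(x) <= 0
strictlyInterior = all(c0 < -margin) && ...
                   all(A*x0 < b - margin) && ... % linear inequalities
                   all(x0 > lb + margin) && ...
                   all(x0 < ub - margin);        % strict bounds
```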
The unfeasible region is dictated by highly nonlinear dynamics constraints that are impossible to bound.
I'm not sure what that means, or why it is relevant. Why would you need to "bound" the region, in order to set the objective or nonlinear constraint functions to infinity there? Given a point x, surely you can test whether it lies in the unfeasible region or not. That's what your nonlinear constraint functions already serve to do.
Interior-point is clear now, thank you.
For SQP I'm sure I didn't understand correctly.
I'll try to explain what I am getting from your comment so it can be easier for you to articulate.
I have a multiple shooting where the state variables are discretised on a grid.
The dynamics f(x) are imposed as defect constraints such that if I have a state x(i) at grid point i, the feasibility is given by:
defect = f(x(i-1),dt) - x(i)
Where f(x(i-1),dt) is the value of the state variable when propagated from the initial condition x(i-1) over a time interval dt, where dt is the time between consecutive grid points.
To enforce feasibility, I would define the constraint as:
defect = f(x(i-1), dt) - x(i);
if norm(defect) > optimalityTolerance
    defect = Inf;
end
This way SQP is forced to find a feasible region.
The reason why I am sure this is wrong is that SQP has no real way of computing the derivatives numerically if I impose them to be always Inf.
What am I missing here?
The reason why I am sure this is wrong is that SQP has no real way of computing the derivatives numerically if I impose them to be always Inf.
Why would it always be inf? You made it sound like you had a way of starting the SQP iterations in a safe region, but the problem happens when the iterations overshoot and stray outside that region. Within the safe region, everything should be non-inf. The only regions that should be set to inf are ones where you know SQP won't be able to recover from if it were to iterate there normally.
Okay but this was my point in here:
The best approach I have found is to limit iteratively the objective function to avoid SQP overshooting.
It's a trial and error approach where I converge to the optimal solution with the algorithm.
I think the method I used, in certain ways, is based on the same principle you are highlighting.
The main problem is that I don't know a priori the unsafe region, but I can recover it through some tuning.
What I don't like is that this usually requires many iterations and I would like some way to do this automatically.
Consider the simplified optimisation problem where I model all the muscles in my arm as a dynamic system, with some constraints on applied forces, momenta, joints angles, etc.
I want to calculate the maximum height I can throw a ball of a certain mass.
What SQP does is to overshoot this height, falling into an unfeasible region and, unlike IP, it is not able to take a step back, as per documentation.
My approach of bounding the maximum height of the ball is very similar to using Inf to exclude the unfeasible region. The problem is that I don't know it a priori:
1. I do an iteration bounding up to an educated guess.
2. I see what height is reached.
3. I re-adapt the constraint.
4. Repeat until convergence.
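The loop I use could be sketched roughly as follows (hGuess, heightIdx, margin and the other names are hypothetical placeholders):

```matlab
% Rough sketch of the manual bounding loop: the bound is re-adapted
% towards the height that the previous pass actually reached.
hBound = hGuess;                        % 1. educated guess for the bound
for pass = 1:maxPasses
    ubPass = ub;
    ubPass(heightIdx) = hBound;         %    bound the height variable
    [z, ~] = fmincon(@objective, z, A, b, [], [], ...
                     lb, ubPass, @dynamicsDefects, opts);
    hReached = z(heightIdx);            % 2. height reached this pass
    if hBound - hReached < tol
        break                            % 4. bound matches outcome: stop
    end
    hBound = hReached + margin;          % 3. re-adapt the constraint
end
```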
The general outcome is that fmincon gets stuck at some values of objective function and feasibility, clearly from overshooting, and keeps going with pointless iterations.
I was also thinking about adding a penalty function depending on the feasibility to the objective function, but it seems artificial.
OK, I think I understand. However, it doesn't appear to me that your originally posted question has any remaining unanswered components. It would appear that for your particular problem, you have difficult local solutions outside the feasible set that the optimization cannot come back from, and therefore it is critical that you choose an algorithm that remains within the feasible set. Assuming all that is true, IP seems like a pretty good choice.
You have speculated that SQP would serve you better in terms of speed if it could be prevented from straying outside the feasible set, but I am skeptical of that. The whole reason that IP is slow seems to be that it is confined to move through the feasible region, and so must traverse a less direct path to the solution. If you confined SQP in the same way, I imagine it would also be slow, and for the same reasons.
You have speculated that SQP would serve you better in terms of speed if it could be prevented from straying outside the feasible set, but I am skeptical of that. The whole reason that IP is slow seems to be that it is confined to move through the feasible region, and so must traverse a less direct path to the solution. If you confined SQP in the same way, I imagine it would also be slow, and for the same reasons.
I agree on everything.
The computational difference is probably due to the difficulty of selecting an interior point as initial guess, which SQP does not care about.
Thank you for your time, I had really helpful clarifications.


More Answers (1)

It sounds like you have done a good job in analyzing the solver behavior. There is one more thing that I have found that sometimes helps the sqp algorithm converge better: multiply the nonlinear equality constraint functions by appropriate constants.
The idea is this. fmincon is trying to satisfy ceq(x) = 0. However, at the same time it is trying to minimize the objective function. If, as you say, it is minimizing the objective function at the expense of feasibility, then try multiplying the various components of ceq(x) by 10 or 100 or even 1000. That will encourage fmincon to pay more attention to feasibility.
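As a sketch, the weighting could be wrapped around the existing constraint function (nonlcon is a placeholder name, and the factor of 100 is only illustrative):

```matlab
% Wrap the original constraint function and scale the equalities.
% The zero set is unchanged, but violations are penalised more heavily.
function [c, ceq] = weightedNonlcon(x)
    [c, ceq] = nonlcon(x);   % original nonlinear constraints
    ceq = 100 * ceq;         % push fmincon to prioritise feasibility
end
```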
This procedure is, of course, quite manual, and can be tricky to balance, meaning it can lead to undesirable behavior such as slow convergence or even divergence. Good luck,
Alan Weiss
MATLAB mathematical toolbox documentation

1 Comment

Thank you Alan for the insight, but I have to admit I have already tried this one and it didn't work for this particular problem.
From my experience, it is helpful when some constraints are a bit harder to satisfy than others, so you can put some weight with a coefficient, e.g.:
ceq(10:20) = 1e5*ceq(10:20);
But for SQP the algorithm didn't care at all about the order of magnitude of the feasibility, exhibiting the same behavior described above.

