# Solve Nonlinear Problem with Many Variables

This example shows how to handle a large number of variables in a nonlinear problem. In general, for this type of problem:

• Use the low-memory BFGS (LBFGS) Hessian approximation. This option is available in the default fminunc and fmincon algorithms.

• If you have an explicit gradient, you can also use a finite-difference Hessian and the 'cg' subproblem algorithm.

• If you have an explicit Hessian, formulate the Hessian as sparse.

• Although not part of this example, if you have a structured problem and can evaluate the product of the Hessian with an arbitrary vector, try using a Hessian multiply function. See Minimization with Dense Structured Hessian, Linear Equalities.
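
For illustration only (this technique is not used in the rest of this example), a Hessian multiply function returns the product H*Y without ever forming H. The sketch below is a minimal example that assumes the objective returns the diagonal of the Hessian as its third output `Hinfo`; the `W = hmfun(Hinfo,Y)` signature is the one the fminunc trust-region algorithm expects for the HessianMultiplyFcn option.

```matlab
% Sketch of a Hessian multiply function (assumption: Hinfo holds the
% diagonal of H as a column vector, returned as the objective's third output).
function W = hmfun(Hinfo,Y)
W = Hinfo.*Y; % scale each row of Y by the corresponding diagonal entry of H
end
```

You would then pass the handle through options, for example `optimoptions("fminunc",Algorithm="trust-region",SpecifyObjectiveGradient=true,HessianMultiplyFcn=@hmfun)`.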

The example uses the hfminunc0obj helper function shown at the end of this example for the general nonlinear solvers fminunc and fmincon. This function is an N-dimensional generalization of Rosenbrock's function, a difficult function to minimize numerically. The minimum value of 0 is attained at the unique point x = ones(N,1).
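
In symbols, the hfminunc0obj helper function implements

```latex
f(x) = \sum_{k=1}^{N-1}\Bigl[\,100\bigl(x_k^2 - x_{k+1}\bigr)^2 + (x_k - 1)^2\Bigr],
```

which attains its minimum value of 0 at x = ones(N,1).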

The function is an explicit sum of squares. Therefore, the example also shows the efficiency of using a least-squares solver. For the least-squares solver lsqnonlin, the example uses the hlsqnonlin0obj helper function shown at the end of this example as a vector objective function that is equivalent to the hfminunc0obj function.
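
The equivalence is term-by-term: the hlsqnonlin0obj helper returns the 2(N-1) residuals

```latex
r_{2k-1}(x) = 10\bigl(x_k^2 - x_{k+1}\bigr), \qquad r_{2k}(x) = x_k - 1, \qquad k = 1,\dots,N-1,
```

and the sum of squares of these residuals reproduces each 100(x_k^2 - x_{k+1})^2 + (x_k - 1)^2 term of the scalar objective, because 10^2 = 100.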

### Problem Setup

Set the problem to use the hfminunc0obj objective function for 1000 variables. Set the initial point x0 to –2 for each variable.

fun = @hfminunc0obj;
N = 1e3;
x0 = -2*ones(N,1);

For the initial options, specify no display and no limit on the number of function evaluations or iterations.

options = optimoptions("fminunc",Display="none",...
MaxFunctionEvaluations=Inf,MaxIterations=Inf);

Set up a table to hold the data. Specify labels for the eight solver runs, and define places to collect the run time, returned function value, exit flag, number of iterations, and time per iteration.

ExperimentLabels = ["BFGS_NoGrad", "LBFGS_NoGrad", "BFGS_Grad", "LBFGS_Grad",...
"Analytic", "fin-diff-grads", "LSQ_NoJacob", "LSQ_Jacob"];
timetable = table('Size',[8 5],'VariableTypes',["double" "double" "double" "double" "double"],...
'VariableNames',["Time" "Fval" "Eflag" "Iters" "TimePerIter"],...
'RowNames',ExperimentLabels);

The following code sections create the appropriate options for each solver run, and collect the output directly into the table whenever possible.

### BFGS Hessian Approximation, No Gradient

opts = options;
opts.HessianApproximation = 'bfgs';
overallTime = tic;
[~,timetable.Fval("BFGS_NoGrad"),timetable.Eflag("BFGS_NoGrad"),output] =...
fminunc(fun, x0, opts);
timetable.Time("BFGS_NoGrad") = toc(overallTime);
timetable.Iters("BFGS_NoGrad") = output.iterations;
timetable.TimePerIter("BFGS_NoGrad") =...
timetable.Time("BFGS_NoGrad")/timetable.Iters("BFGS_NoGrad");

### LBFGS Hessian Approximation, No Gradient

opts = options;
opts.HessianApproximation = 'lbfgs';
overallTime = tic;
[~,timetable.Fval("LBFGS_NoGrad"),timetable.Eflag("LBFGS_NoGrad"),output] =...
fminunc(fun, x0, opts);
timetable.Time("LBFGS_NoGrad") = toc(overallTime);
timetable.Iters("LBFGS_NoGrad") = output.iterations;
timetable.TimePerIter("LBFGS_NoGrad") =...
timetable.Time("LBFGS_NoGrad")/timetable.Iters("LBFGS_NoGrad");

### BFGS Hessian Approximation, With Gradient

opts = options;
opts.SpecifyObjectiveGradient = true;
opts.HessianApproximation = 'bfgs';
overallTime = tic;
[~,timetable.Fval("BFGS_Grad"),timetable.Eflag("BFGS_Grad"),output] =...
fminunc(fun, x0, opts);
timetable.Time("BFGS_Grad") = toc(overallTime);
timetable.Iters("BFGS_Grad") = output.iterations;
timetable.TimePerIter("BFGS_Grad") =...
timetable.Time("BFGS_Grad")/timetable.Iters("BFGS_Grad");

### LBFGS Hessian Approximation, With Gradient

opts = options;
opts.SpecifyObjectiveGradient = true;
opts.HessianApproximation = 'lbfgs';
overallTime = tic;
[~,timetable.Fval("LBFGS_Grad"),timetable.Eflag("LBFGS_Grad"),output] =...
fminunc(fun, x0, opts);
timetable.Time("LBFGS_Grad") = toc(overallTime);
timetable.Iters("LBFGS_Grad") = output.iterations;
timetable.TimePerIter("LBFGS_Grad") =...
timetable.Time("LBFGS_Grad")/timetable.Iters("LBFGS_Grad");

### Analytic Hessian, 'trust-region' Algorithm

opts = options;
opts.SpecifyObjectiveGradient = true;
opts.Algorithm = 'trust-region';
opts.HessianFcn = "objective";
overallTime = tic;
[~,timetable.Fval("Analytic"),timetable.Eflag("Analytic"),output] =...
fminunc(fun, x0, opts);
timetable.Time("Analytic") = toc(overallTime);
timetable.Iters("Analytic") = output.iterations;
timetable.TimePerIter("Analytic") =...
timetable.Time("Analytic")/timetable.Iters("Analytic");

### Finite-Difference Hessian with Gradient, fmincon Solver

opts = optimoptions("fmincon",SpecifyObjectiveGradient=true,...
Display="none",HessianApproximation="finite-difference",...
SubproblemAlgorithm="cg",MaxFunctionEvaluations=Inf,MaxIterations=Inf);
overallTime = tic;
[~,timetable.Fval("fin-diff-grads"),timetable.Eflag("fin-diff-grads"),output] =...
fmincon(fun, x0, [],[],[],[],[],[],[],opts);
timetable.Time("fin-diff-grads") = toc(overallTime);
timetable.Iters("fin-diff-grads") = output.iterations;
timetable.TimePerIter("fin-diff-grads") =...
timetable.Time("fin-diff-grads")/timetable.Iters("fin-diff-grads");

### Least Squares, No Jacobian

lsqopts = optimoptions("lsqnonlin","Display","none",...
"MaxFunctionEvaluations",Inf,"MaxIterations",Inf);
fun = @hlsqnonlin0obj;
overallTime = tic;
[~,timetable.Fval("LSQ_NoJacob"),~,timetable.Eflag("LSQ_NoJacob"),output] =...
lsqnonlin(fun, x0, [],[],lsqopts);
timetable.Time("LSQ_NoJacob") = toc(overallTime);
timetable.Iters("LSQ_NoJacob") = output.iterations;
timetable.TimePerIter("LSQ_NoJacob") =...
timetable.Time("LSQ_NoJacob")/timetable.Iters("LSQ_NoJacob");

### Least Squares with Jacobian

lsqopts.SpecifyObjectiveGradient = true;
overallTime = tic;
[~,timetable.Fval("LSQ_Jacob"),~,timetable.Eflag("LSQ_Jacob"),output] =...
lsqnonlin(fun, x0, [],[],lsqopts);
timetable.Time("LSQ_Jacob") = toc(overallTime);
timetable.Iters("LSQ_Jacob") = output.iterations;
timetable.TimePerIter("LSQ_Jacob") =...
timetable.Time("LSQ_Jacob")/timetable.Iters("LSQ_Jacob");

### Examine Results

disp(timetable)
                       Time        Fval       Eflag    Iters    TimePerIter
                      ______    __________    _____    _____    ___________

    BFGS_NoGrad       110.44    5.0083e-08      1      7137       0.015475
    LBFGS_NoGrad      53.143     2.476e-07      1      4902       0.010841
    BFGS_Grad         35.491    2.9865e-08      1      7105      0.0049952
    LBFGS_Grad        1.2056    9.7505e-08      1      4907      0.0002457
    Analytic          7.0991     1.671e-10      3      2301      0.0030852
    fin-diff-grads     5.217    1.1422e-15      1      1382       0.003775
    LSQ_NoJacob       94.708    3.7969e-25      1      1664       0.056916
    LSQ_Jacob         6.5225    3.0056e-25      1      1664      0.0039197

The timing results show the following:

• For this problem, the LBFGS Hessian approximation with gradients is the fastest by far.

• The next fastest solver runs are fmincon with a Hessian computed by finite differences of gradients, fminunc with the 'trust-region' algorithm using an analytic gradient and Hessian, and lsqnonlin with an analytic Jacobian.

• The fminunc BFGS algorithm without gradient has similar speed to the lsqnonlin solver without Jacobian. Note that lsqnonlin requires many fewer iterations than fminunc for this problem, but each iteration takes much longer.

• Derivatives make a large difference in speed for all solvers.

### Helper Functions

The following code creates the hfminunc0obj helper function.

function [f,G,H] = hfminunc0obj(x)
% Rosenbrock function in N dimensions
N = numel(x);
xx = x(1:N-1);
xx_plus = x(2:N);
f_vec = 100*(xx.^2 - xx_plus).^2 + (xx - 1).^2;
f = sum(f_vec);
if (nargout >= 2) % Gradient
    G = zeros(N,1);
    for k = 1:N
        if (k == 1)
            G(k) = 2*(x(k)-1) + 400*x(k)*(x(k)^2-x(k+1));
        elseif (k == N)
            G(k) = 200*x(k) - 200*x(k-1)^2;
        else
            G(k) = 202*x(k) - 200*x(k-1)^2 - 400*x(k)*(x(k+1) - x(k)^2) - 2;
        end
    end
    if nargout >= 3 % Hessian
        H = spalloc(N,N,3*N);
        for i = 2:(N-1)
            H(i,i) = 202 + 1200*x(i)^2 - 400*x(i+1);
            H(i,i-1) = -400*x(i-1);
            H(i,i+1) = -400*x(i);
        end
        H(1,1) = 2 + 1200*x(1)^2 - 400*x(2);
        H(1,2) = -400*x(1);
        H(N,N) = 200;
        H(N,N-1) = -400*x(N-1);
    end
end
end

The following code creates the hlsqnonlin0obj helper function.

function [f,G] = hlsqnonlin0obj(x)
% Rosenbrock function in N dimensions
N = numel(x);
xx = x(1:N-1);
xx_plus = x(2:N);
f_vec = [10*(xx.^2 - xx_plus), (xx - 1)];
f = reshape(f_vec',[],1); % Vector of length 2*(N-1)
% Jacobian
if (nargout >= 2)
    G = spalloc(2*(N-1),N,3*N); % Jacobian size 2*(N-1)-by-N with 3*N nonzeros
    for k = 1:(N-1)
        G(2*k-1,k) = 10*2*x(k);
        G(2*k-1,k+1) = -10;
        G(2*k,k) = 1;
    end
end
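
As a quick sanity check (not part of the original example), you can verify that the two helper functions describe the same objective: the sum of squared residuals from hlsqnonlin0obj should match the scalar value from hfminunc0obj at any point.

```matlab
x = randn(10,1);          % arbitrary test point
f1 = hfminunc0obj(x);     % scalar objective value
r = hlsqnonlin0obj(x);    % residual vector of length 2*(N-1)
disp(abs(f1 - sum(r.^2))) % should be near 0, up to round-off
```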