f =
Ezyfit: Single vs. Two-Step Fitting Problem
Show older comments
Ezfit (actually I am using the updated version from Walter Roberson) fails to accurately determine all 4 parameters in a single curve fit (see Method A). A workaround, setting 1 parameter initially and then fitting the remaining 3 in a second step (Method B), looks instead successful.
Is there a method or setting within Ezfit to obtain accurate 4-parameter results in a single fitting run (Method A)?
% (1) Input Data
x = [6, 16, 26, 36, 46, 56, 66, 66, 76, 86, 96, 126, 176, 186, 246, 286, 306, 336, 366, 396, 426, 486, 496, 506, 546, 546, 616, 656, 726, 756, 806, 816, 836, 946, 966, 976, 1056, 1066, 1116, 1136, 1176, 1296, 1326, 1386, 1426, 1456, 1476, 1586, 1596, 1736, 1806, 1836, 1846, 1886, 2016, 2046, 2106, 2196, 2296, 2346, 2356, 2366, 2556, 2666, 2806, 2856, 2916, 2976, 3116, 3266, 3366, 3416, 3526, 3726, 3876, 3976, 4386, 4636, 4686, 5246, 5346, 6966];
y = [22.932, 13.063, 10.877, 8.7408, 7.4074, 5.8344, 12.693, 4.8185, 3.5444, 2.6637, 2.2515, 4.5284, 1.7597, 1.5207, 1.1754, 0.74499, 0.75637, 0.28157, 0.35862, 0.37542, 0.27759, 0.25008, 0.11389, 0.23569, 0.16777, 0.084717, 0.11865, 0.03168, 0.085227, 0.042803, 0.038632, 0.016028, 0.049869, 0.034689, 0.025157, 0.0064107, 0.017007, 0.0067062, 0.012242, 0.0065558, 0.014677, 0.0048147, 0.0048443, 0.0045972, 0.0059391, 0.0024968, 0.0050464, 0.0024622, 0.0053329, 0.0038574, 0.0035999, 0.0022632, 0.00147, 0.00098146, 0.0012814, 0.0024717, 0.00073639, 0.00073862, 0.0013448, 0.00088996, 0.00096546, 0.00084, 0.00062343, 0.0014789, 0.00050959, 0.00027922, 0.00055839, 0.0019602, 0.00098146, 0.00020941, 0.00028421, 0.00075468, 0.00031831, 0.00055524, 0.00055844, 0.00039789, 0.00031207, 0.00072048, 0.00021507, 0.0004718, 0.00018294, 0.00018086];
% (2) Method A: Perform a simple curve fitting with Ezfit
ft1 = ezfit(x,log(y),'log(a*x^(-b) + c*exp(-d*x))','MaxFunEvals', 1e6, 'MaxIter', 1e6); %, 'usega', usega);
array2table([ft1.m(1) ft1.m(2) ft1.m(3) ft1.m(4) (ft1.r)^2],'VariableNames',{'a','b','c','d','R2'})
% a b c d R2
% _____ ______ ______ _____ _______
%
% 56106 2.2236 -85026 49582 0.85991
% (3) Method B: Perform curve fitting by using Ezfit 2 times
% (3.1) With the first Ezfit, find the exponent "b" (among a range of values between 1.5 and 2.5)
% which maximise the coefficient of determination R^2 (I set it as R^2 = 0.99).
% The resulting exponent will be "b_range(idx_max_b) = 1.86".
i = 1;
b_range = 1.5 : 0.01 : 2.5;
for b = b_range
ft2 = ezfit(x,log(y),sprintf('log(a*x^(-%.2f) + c*exp(-d*x))',b),'options',optimset('MaxFunEvals', 1e6, 'MaxIter', 1e6));
ftr(i) = (ft2.r)^2;
i = i + 1;
end
idx_max_b = find(ftr == max(ftr(ftr < 0.99)));
% (3.2) Following the initial determination of the optimal exponent "b" (i.e. "b_range(idx_max_b)"),
% the second Ezfit step finds the remaining parameters
ft3 = ezfit(x,log(y),sprintf('log(a*x^(-%.2f) + c*exp(-d*x))',b_range(idx_max_b)),'options',optimset('MaxFunEvals', 1e6, 'MaxIter', 1e6));
array2table([ft3.m(1) b_range(idx_max_b) ft3.m(2) ft3.m(3) (ft3.r)^2],'VariableNames',{'a','b','c','d','R2'})
% a b c d R2
% ______ ____ _____ _________ _______
%
% 2084.4 1.86 4.806 0.0064196 0.98323
% (4) Plot
hold on
plot(x, y, 'o', 'MarkerFaceColor',[0.7 0.7 0.7],'DisplayName', 'Input Data');
xx = 10 : 10000;
plot(xx, ft1.m(1) .* xx.^(-ft1.m(2)) + ft1.m(3) .* exp(-ft1.m(4) .* xx), 'color','blue', 'Linewidth',2, 'DisplayName', 'Method A');
plot(xx, ft3.m(1) .* xx.^(-b_range(idx_max_b)) + ft3.m(2) .* exp(-ft3.m(3) .* xx), 'color', 'red', 'Linewidth',2, 'DisplayName', 'Method B')
xlabel('x');
ylabel('y');
legend('show');
set(gca, 'XScale', 'log', 'YScale', 'log')

7 Comments
The solution with "fminspeas"
from the File Exchange looks quite acceptable for the original data:
x = [6, 16, 26, 36, 46, 56, 66, 66, 76, 86, 96, 126, 176, 186, 246, 286, 306, 336, 366, 396, 426, 486, 496, 506, 546, 546, 616, 656, 726, 756, 806, 816, 836, 946, 966, 976, 1056, 1066, 1116, 1136, 1176, 1296, 1326, 1386, 1426, 1456, 1476, 1586, 1596, 1736, 1806, 1836, 1846, 1886, 2016, 2046, 2106, 2196, 2296, 2346, 2356, 2366, 2556, 2666, 2806, 2856, 2916, 2976, 3116, 3266, 3366, 3416, 3526, 3726, 3876, 3976, 4386, 4636, 4686, 5246, 5346, 6966];
y = [22.932, 13.063, 10.877, 8.7408, 7.4074, 5.8344, 12.693, 4.8185, 3.5444, 2.6637, 2.2515, 4.5284, 1.7597, 1.5207, 1.1754, 0.74499, 0.75637, 0.28157, 0.35862, 0.37542, 0.27759, 0.25008, 0.11389, 0.23569, 0.16777, 0.084717, 0.11865, 0.03168, 0.085227, 0.042803, 0.038632, 0.016028, 0.049869, 0.034689, 0.025157, 0.0064107, 0.017007, 0.0067062, 0.012242, 0.0065558, 0.014677, 0.0048147, 0.0048443, 0.0045972, 0.0059391, 0.0024968, 0.0050464, 0.0024622, 0.0053329, 0.0038574, 0.0035999, 0.0022632, 0.00147, 0.00098146, 0.0012814, 0.0024717, 0.00073639, 0.00073862, 0.0013448, 0.00088996, 0.00096546, 0.00084, 0.00062343, 0.0014789, 0.00050959, 0.00027922, 0.00055839, 0.0019602, 0.00098146, 0.00020941, 0.00028421, 0.00075468, 0.00031831, 0.00055524, 0.00055844, 0.00039789, 0.00031207, 0.00072048, 0.00021507, 0.0004718, 0.00018294, 0.00018086];
funlist = {@(coef,xdata) xdata.^(-coef(1)),@(coef,xdata) exp(-coef(2)*xdata )};
NLPstart = [1,1];
options = optimset('disp','iter');
[INLP,ILP] = fminspleas(funlist,NLPstart,x,y,[],[],[],options)
hold on
plot(x,ILP(1)*x.^(-INLP(1))+ILP(2)*exp(-INLP(2)*x))
plot(x,y,'*')
hold off
Sim
on 14 Mar 2025
As said: you must decide whether you want to fit log(y) against log(x) using the function or y against x. Both will give different parameters for the fitting process. Usually, fitting y against x is intended.
Say you have the function y = a*x^b as fitting function. Then taking the log on both sides gives log(y) = log(a) + b*log(x), thus a function where log(y) depends linearily on log(x). The latter is numerically simpler, but the results for a and b will differ from the nonlinear fit of y against x.
Sim
on 14 Mar 2025
Accepted Answer
More Answers (2)
Hi @Sim
You can verify this. The model proposed by @Torsten fits quite well within the first 10% of the data range. However, after that point, the model decays much more slowly than the data, although, in theory, it will eventually converge to zero. Inspired by this model, it suggests that you should consider implementing a nonlinear power rate, as illustrated below. You can also find a better a nonlinear function, preferably a bounded one.
format long g
x = [6, 16, 26, 36, 46, 56, 66, 66, 76, 86, 96, 126, 176, 186, 246, 286, 306, 336, 366, 396, 426, 486, 496, 506, 546, 546, 616, 656, 726, 756, 806, 816, 836, 946, 966, 976, 1056, 1066, 1116, 1136, 1176, 1296, 1326, 1386, 1426, 1456, 1476, 1586, 1596, 1736, 1806, 1836, 1846, 1886, 2016, 2046, 2106, 2196, 2296, 2346, 2356, 2366, 2556, 2666, 2806, 2856, 2916, 2976, 3116, 3266, 3366, 3416, 3526, 3726, 3876, 3976, 4386, 4636, 4686, 5246, 5346, 6966];
y = [22.932, 13.063, 10.877, 8.7408, 7.4074, 5.8344, 12.693, 4.8185, 3.5444, 2.6637, 2.2515, 4.5284, 1.7597, 1.5207, 1.1754, 0.74499, 0.75637, 0.28157, 0.35862, 0.37542, 0.27759, 0.25008, 0.11389, 0.23569, 0.16777, 0.084717, 0.11865, 0.03168, 0.085227, 0.042803, 0.038632, 0.016028, 0.049869, 0.034689, 0.025157, 0.0064107, 0.017007, 0.0067062, 0.012242, 0.0065558, 0.014677, 0.0048147, 0.0048443, 0.0045972, 0.0059391, 0.0024968, 0.0050464, 0.0024622, 0.0053329, 0.0038574, 0.0035999, 0.0022632, 0.00147, 0.00098146, 0.0012814, 0.0024717, 0.00073639, 0.00073862, 0.0013448, 0.00088996, 0.00096546, 0.00084, 0.00062343, 0.0014789, 0.00050959, 0.00027922, 0.00055839, 0.0019602, 0.00098146, 0.00020941, 0.00028421, 0.00075468, 0.00031831, 0.00055524, 0.00055844, 0.00039789, 0.00031207, 0.00072048, 0.00021507, 0.0004718, 0.00018294, 0.00018086];
funlist = {@(coef, xdata) xdata.^(-coef(1)*(xdata).^(0.4)), @(coef, xdata) exp(-coef(2)*xdata)};
NLPstart = [1, 1];
options = optimset('disp', 'none');
[INLP, ILP] = fminspleas(funlist, NLPstart, x, y, [], [], [], options)
xf = linspace(x(1), x(end), numel(x)*1e3+1);
yf = ILP(1)*xf.^(-INLP(1)*(xf).^(0.4)) + ILP(2)*exp(-INLP(2)*xf);
%% Plot results
figure
plot(x, y, 'o'), hold on
plot(xf, yf), grid on, hold off
legend('data', 'model')
xlabel('x'), ylabel('y')
set(gca, 'XScale', 'log','YScale', 'log')
figure
subplot(211)
plot(x, y, 'o'), hold on
plot(xf, yf), xlim([0 700]), grid on, hold off
legend('data', 'model')
xlabel('x'), ylabel('y')
title({'Displaying model fitness in interval $x \in [0, 700]$'}, 'interpreter', 'latex')
subplot(212)
plot(x, y, 'o'), hold on
plot(xf, yf), xlim([700 7000]), grid on, hold off
legend('data', 'model')
xlabel('x'), ylabel('y')
title({'Displaying model fitness in interval $x \in [700, 7000]$'}, 'interpreter', 'latex')
3 Comments
Sam Chak
on 14 Mar 2025
@Sim: The short answer is "I do not know". The long answer follows. Initially, I used (x,y) to (x,f(x,p)), but the resulting fit did not look good to me.
Hi Sim,
There is no simple model for that type of curve. However, since the curve exhibits a form of exponential decay that asymptotically converges to zero, I'd suggest an exponential function-based model:

where
is a nonlinear power rate function and
is a polynomial function. If unsatisfactory, I'd add a reciprocal function term as @Torsten did, in this form
to add some controllability to the rate of decay. For simplicity, you can try:
is a nonlinear power rate function and Nevertheless, you should consult @Alex Sha, as he is very knowledgeable in curve fitting and optimization and possesses an extensive library of nonlinear functions.
Hi, all, if add one more parameter to the original fitting function, i.e. y=ln(a*x^(-b)-c*exp(-d*x))*f, the result will be improved like:
Sum Squared Error (SSE): 62.0296923820521
Root of Mean Square Error (RMSE): 0.869746896054109
Correlation Coef. (R): 0.970745675455549
R-Square: 0.94234716641565
Parameter Best Estimate
--------- -------------
a 0.97506498591964
b -0.0034886080404169
c 1.00586174849318
d 0.0069594641193838
f -5.56698125787981

4 Comments
Hi @Alex Sha
I believe that your solution deserves recognition. It appears that @Sim wants the fitting model to follow the trend of the data distribution on a base-10 logarithmic scale for both the x-axis and the y-axis.
In fact, @Sim is open to suggestions for other more appropriate power law or exponential-based functions, where its derivative should eventually converge to zero, or at least to a very small negative value at x = 7000.
%% Data
x = [6, 16, 26, 36, 46, 56, 66, 66, 76, 86, 96, 126, 176, 186, 246, 286, 306, 336, 366, 396, 426, 486, 496, 506, 546, 546, 616, 656, 726, 756, 806, 816, 836, 946, 966, 976, 1056, 1066, 1116, 1136, 1176, 1296, 1326, 1386, 1426, 1456, 1476, 1586, 1596, 1736, 1806, 1836, 1846, 1886, 2016, 2046, 2106, 2196, 2296, 2346, 2356, 2366, 2556, 2666, 2806, 2856, 2916, 2976, 3116, 3266, 3366, 3416, 3526, 3726, 3876, 3976, 4386, 4636, 4686, 5246, 5346, 6966];
y = [22.932, 13.063, 10.877, 8.7408, 7.4074, 5.8344, 12.693, 4.8185, 3.5444, 2.6637, 2.2515, 4.5284, 1.7597, 1.5207, 1.1754, 0.74499, 0.75637, 0.28157, 0.35862, 0.37542, 0.27759, 0.25008, 0.11389, 0.23569, 0.16777, 0.084717, 0.11865, 0.03168, 0.085227, 0.042803, 0.038632, 0.016028, 0.049869, 0.034689, 0.025157, 0.0064107, 0.017007, 0.0067062, 0.012242, 0.0065558, 0.014677, 0.0048147, 0.0048443, 0.0045972, 0.0059391, 0.0024968, 0.0050464, 0.0024622, 0.0053329, 0.0038574, 0.0035999, 0.0022632, 0.00147, 0.00098146, 0.0012814, 0.0024717, 0.00073639, 0.00073862, 0.0013448, 0.00088996, 0.00096546, 0.00084, 0.00062343, 0.0014789, 0.00050959, 0.00027922, 0.00055839, 0.0019602, 0.00098146, 0.00020941, 0.00028421, 0.00075468, 0.00031831, 0.00055524, 0.00055844, 0.00039789, 0.00031207, 0.00072048, 0.00021507, 0.0004718, 0.00018294, 0.00018086];
%% Alex Sha's New Global Solution
a = 0.97506498591964;
b = -0.0034886080404169;
c = 1.00586174849318;
d = 0.0069594641193838;
f = -5.56698125787981;
%% Model
xf = linspace(x(1), x(end), numel(x)*1e3+1);
yf = log(a*xf.^(-b) - c*exp(-d*xf))*f;
%% Plot results
figure
subplot(211)
plot(x, y, 'o'), hold on
plot(xf, yf), xlim([0 700]), grid on
legend('data', 'model')
xlabel('x'), ylabel('y')
title({'Displaying model fitness in interval $x \in [0, 700]$'}, 'interpreter', 'latex')
subplot(212)
plot(x, y, 'o'), hold on
plot(xf, yf), xlim([700 7000]), grid on
legend('data', 'model')
xlabel('x'), ylabel('y')
title({'Displaying model fitness in interval $x \in [700, 7000]$'}, 'interpreter', 'latex')
figure
plot(x, y, 'o'), hold on
plot(xf, yf), grid on, hold off
legend('data', 'model')
xlabel('x'), ylabel('y')
set(gca, 'XScale', 'log','YScale', 'log')
Alex Sha
on 17 Mar 2025
Not sure whether or not my understanding is correct, if want fitting result good in log() scale, is it reasonab to change y data to y1=log(y) firstly, then the fitting function become "y1=log(a*x^(-b)-c*exp(-d*x))", the result will be:
Sum Squared Error (SSE): 3.16256795387294
Root of Mean Square Error (RMSE): 0.196387122481336
Correlation Coef. (R): 0.991266174106857
R-Square: 0.982608627928446
Parameter Best Estimate
--------- -------------
a 10.6590120083535
b 0.69245173591868
c -1.65676470503943
d 0.00313457062827038

if taking function as "y1=f*log(a*x^(-b)-c*exp(-d*x))"
Sum Squared Error (SSE): 2.95997975085014
Root of Mean Square Error (RMSE): 0.189992931538933
Correlation Coef. (R): 0.99182732299325
R-Square: 0.983721438635958
Parameter Best Estimate
--------- -------------
f 4.78460068319797
a 0.918658120426212
b 0.0761834706279075
c -0.540969703020373
d 0.00132426521534454

Categories
Find more on Linear Predictive Coding in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!








