nonlinear fit of experimental data
7 views (last 30 days)
Show older comments
Dear MatLab Experts,
I would like to generate a nonlinear regression model to fit my experimental data 'Mk_Superf_FSF' as function of the independent variables 'MaxFDiam' and 'MinFDiam' which are respectively the max and min diameter of an arbirarily shaped closed and connected 2D surface. I also added the variable 'Area' which is obviously correlated to max and min diameters so I think it is not wise to use that as well.
I was suggested a linear fit for the experimental data (see attached picture). The 7th order polynomial p(x) fits the data very well but the suggested formula is non physical. In fact, the variable used is a sum of quantities with different units:
x = MaxFDiam * MinFDiam + Area / MaxFDiam + Area / MinFDiam + MaxFDiam / MinFDiam
I cannot assign a units to ithe resulting sum because the product of the two diameters has units [mm^2.] whereas the Area/MaxDIameter has units [mm], the ratio of the two diamters is unitless.
I tried to fit a sum of two negative exponentials where in the exponent I have the Area ad the product of the diameters respectively. MatLab complained printing out that the Jacobian has a column of all zeros. I tried some other combinations of exponential functions. Again MatLab complained stating that the model returns "NaN" of "Infinity".
Some other times MatLab printed out that that maximum number of iterations had been exceeded.
I tried a power-law fit as follows:
coeffs0 = [0.8672 1 1]
opts = statset('fitnlm');
opts.RobustWgtFcn = 'bisquare';
X = [MaxFDiam' MinFDiam'];
mdlfun = @(coeff, X) coeff(1)* X(:,1).*X(:,2).^coeff(2) + coeff(3);
mdl = fitnlm(X,Mk_Superf_FSF',mdlfun, coeffs0,'Options', opts, 'CoefficientNames', {'a' , 'b', 'c'});
This time MatLab did not complain but the resulting model is anything but good. The R^2 value is awful. The P_values are very high except for one.
mdl =
Nonlinear regression model:
y ~ a*x1*x2^b + c
Estimated Coefficients:
Estimate SE tStat pValue
________ ________ ________ __________
a 0.007407 0.02121 0.34922 0.73352
b -0.26657 0.87603 -0.30429 0.76658
c 0.93603 0.048648 19.241 8.0919e-10
Number of observations: 14, Error degrees of freedom: 11
Root Mean Squared Error: 0.0363
R-Squared: 0.215, Adjusted R-Squared 0.0721
F-statistic vs. constant model: 1.51, p-value = 0.264
Maybe the model is not right. Maybe the initial parameter values are not good.....
I would greatly appreciate some help at getting a decent fit. Above all, I would like to learn techniques to:
(1) devise the model formula
(2) choose the initial parameter values
Thank you so much for any suggestion and help.
Best regards,
Maura E. M.
14 Comments
dpb
on 23 Aug 2019
Ah! OK...I only opened and looked at the first one and presumed all the rest were the same...if weren't same number of header lines then I guess will need to to it again. Mayhaps that's why some seemed to have gaps in the perimeters...
Answers (1)
Jon
on 21 Aug 2019
Edited: dpb
on 23 Aug 2019
In your example you find a fit to a function of one variable, and are somehow looking for a combination of terms to form that one variable. Do you need to get it into this form or is it ok to have the predicted value, y be a function of two variables?
Assuming the latter, in case it is helpful I just tried a somewhat simplistic approach of considering the response to be a quadratic function of the two inputs MinFDiam and MaxFDiam.
Regarding motivation for choosing this form, I guess you could consider this to be a low order taylor series representation. (one up from linear which I tried and didn't fit very well). I'm not sure of the precise mathematical statemement of this, but the general notion is that for small enough regions all continuous functions are well approximated by just the low order terms of the Taylor series, and in particular functions that show some curvature are well approximated by quadratics in a small enough region.
I am not familiar with using fitnlm so I just used fitlm as follows
x1 = MinFDiam(:)
x2 = MaxFDiam(:)
y = Mk_Superf_FSF(:)
mdl = fitlm([x1 x2 x1.*x2 x1.^2 +x2.^2],y)
This gave the following statistics
mdl =
Linear regression model:
y ~ 1 + x1 + x2 + x3 + x4 + x5
Estimated Coefficients:
Estimate SE tStat pValue
___________ __________ ________ __________
(Intercept) 0.60854 0.075803 8.028 4.2585e-05
x1 0.057919 0.025395 2.2808 0.052008
x2 0.0015331 0.014779 0.10374 0.91993
x3 0.00067369 0.0014413 0.46741 0.65268
x4 -0.0024835 0.0012742 -1.9491 0.087118
x5 -0.00031494 0.00071727 -0.43908 0.67222
Number of observations: 14, Error degrees of freedom: 8
Root Mean Squared Error: 0.0204
R-squared: 0.82, Adjusted R-Squared: 0.708
F-statistic vs. constant model: 7.3, p-value = 0.00746
Which does not seem too bad.
16 Comments
dpb
on 24 Aug 2019
Edited: dpb
on 25 Aug 2019
"I think the[r]e is a physics explanation for the higher measurements."
And well may be but I think it highly unlikely that explanation is in the variables controlled/measured here.(*)
The MC simulation misses those specific points by far more than the others in a consistent direction so whatever it is isn't included in that model, either.
(*) And note that even if you were successful at building a model by some magic transformation of variables or nonlinear curve-fitting strategem that did manage to fit the observations from these measurements that to infer that would be the physical reason behind the values would be a gross misrepresentation of such a fit even if you could make it happen with a set of coefficients with consistent units.
See Also
Categories
Find more on Linear and Nonlinear Regression in Help Center and File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!