How to add two different linear fits to the same graph

I have tried looking for built-in plotting options to create a bi-linear fit for data but it appears this is something I probably have to code myself. Below I attached an example of the data I am trying to fit:
I want to be able to fit a linear regression to the 'flat' and 'steep' regions of this graph without having to manually create two data sets, regressions, and a combined plot. The point of this is to find the intersection point of the two linear regressions.
Perhaps the best way of solving the problem is to find where concavity/double derivative is maximized, however I really do not know how to go about coding this. Any ideas or help would be greatly appreciated.

4 Comments

I've posted this a couple times in the past but I don't seem to have saved a link...let me search for it and see if can find it rather than rewriting from scratch. I thought I had a utility function but don't see it now, either... :(
Wonderful, I think the double derivative idea does work (within specified limits) and I am trying to write a script out now. Thanks for the input!
%%Import text file as matrix (called m1 here)
%%For these files we need to first perform 1/c^2 on the 'c' column, in this case column four)
squares = 1./(m(1:,4)).^2;
%%Find double derivative
slopes = gradient(m1(:,4),squares);
concavity = gradient(m1:,4),slopes);
%%Find max/min concavity (min in this case)
%%Give specified range otherwise will find another part of the graph as max
max(concavity(30:40);
This seemed to work for some graphs but didn't work for others. I am very new to MATLAB so everything is probably clunky and I couldn't figure out how to export the script properly so I put it here. Hopefully it either inspires a better idea or you might be able to give a suggestion.
I have to specify this range based on where I am expecting the value to be, which is far from ideal. Maybe its just a flawed idea. Thanks again.
How about attach a .mat file with a couple of your curves to illustrate with? I did find my piecewise m-file (it was in an old installation folder).
I don't know how well the linear section is going to work with the RH section of your data; it looks more like a quadratic might be in order.

Sign in to comment.

 Accepted Answer

dpb
dpb on 27 Aug 2018
Edited: dpb on 29 Aug 2018
function y=piecewise(coef,x)
% return piecewise linear fit given [b1,m1,c,m2] as coefficient array and vector x
% c is breakpoint, b1 is intercept of first section, m1,m2 are two segment slopes
b1=coef(1); m1=coef(2); % reduce parentheses clutter....
c=coef(3); m2=coef(4);
ix=x>=c; % location > breakpoint
y(ix)=[b1+c*(m1-m2)+m2*x(ix)];
y(~ix)=[b1+m1*x(~ix)];
y=reshape(y,size(x));
end
Example use; contrived example but works well in general with real data in my experience...
x1=1:5; x2=x1+5; x=[x1 x2];
y=[polyval([1 1],x1) polyval([3 -10],x2)]+rand(size(x));
plot(x,y,'*')
hold on
b=nlinfit(x,y,@piecewise,[1 1 4 3])
b =
1.1746 1.0519 5.5482 3.1284
xh=linspace(x(1),x(end));
yh=piecewise(b,xh);
plot(xh,yh)
ADDENDUM
[dpb wrote earlier comment -- "I've never sat down and done the algebra to write as anything but two linear segments but it would be possible to do the same thing but join a straight line and a quadratic..."]
Well, it dawned on me it isn't difficult to do if one limits the implementation to the one specific case of the quadratic portion being arbitrarily set to one section or the other. Since your data have the nonlinear portion to the right of the breakpoint, I chose to do that case--
function y=piecelinquad(coef,x)
% return piecewise mixed linear/quadratic fit given [b1,m1,c,m2,m3] as
% coefficient array and vector x over which to evaluate function
% c is breakpoint, m1, b1 are slope, intercept of first section, and
% m1,m2 are second segment quadratic, linear coefficients
% The quadratic section as coded is always for x>c, linear for x<=c
%
% dp bozarth 26Aug2018
b1=coef(1); m1=coef(2); % reduce parentheses clutter....
c=coef(3); p=coef(4:5);
ix=x>c; % location > breakpoint
y(ix)=polyval([p b1+c*m1],x(ix)-c);
y(~ix)=b1+m1*x(~ix);
y=reshape(y,size(x));
end
Testing on a couple of artificial datasets such as the one used above, it seems to work well in locating the breakpoint and minimizing overall SSE; I noticed there may be a tendency to place the breakpoint somewhat farther inside the linear section as the overall error can be less with a little overshoot at the breakpoint by the quadratic. All these tests were with a small number of points, however, the more extensive datasets you have could be better behaved.
It raises the question, however, of whether, depending on the actual problem constraints of just using smoothing spline or similar?

4 Comments

Thank you so much for your time (I want to accept your answer right now before I drag you down a rabbit hole of my incessant questions). This certainly looks like a good result for your test data. I have a few questions though
  1. Where do I put the 'piecewise' function so I can call it like you did in the second part a.k.a @piecewise? (Right now it just tells me 'error using nlinfit model function piecewise not found'
  2. Why does the function have to take in both an array and a vector?
  3. If you use a rand function in the y data why does my output look the same as yours (other than the fit)?
dpb
dpb on 28 Aug 2018
Edited: dpb on 28 Aug 2018
The function has to be on MATLABPATH or in current working directory like any m-file.
The inputs to piecewise are the coefficients array and the independent X vector from which to compute the function for the calculation of the residual each iteration. nlinfit handles all those calculations internally; all it needs is the returned function values for each x at each iteration.
The rand function, x,y in the above are purely to build a sample dataset that's two linear lines with some noise thrown in so it isn't a perfect fit. Since rand -->[0, 1], the noise is small in comparison to the deterministic portion so the curves will look quite similar.
To use for your case, you call nlinfit passing in your actual data for the X vector for each curve and an initial estimate for the coefficients.
I got everything to work out. I will test it out on a few other data sets and see what happens. In addition, I'll see if the 'concavity idea' might work out as well. Again, thanks so much for your help.
I've never sat down and done the algebra to write as anything but two linear segments but it would be possible to do the same thing but join a straight line and a quadratic; what changes is then the second-segment intercept is dependent on that functional as well as function of the intersection.
Roughly locating the breakpoint would probably be very useful in determining an estimate for the breakpoint location starting value; typically ime it is pretty robust although if very far off one can get some warnings about ill-conditioned or rank deficiencies; generally these can be ignored as the routine ends up finding a valid solution and can be checked as above by plotting results and by re-running with the estimates as input guesses almost always the warnings will go away. Only if the data are not at all suited to the model and/or the initial guess is just totally out in left field will it fail with any datasets I've ever tried.

Sign in to comment.

More Answers (0)

Products

Release

R2018a

Asked:

on 27 Aug 2018

Edited:

dpb
on 29 Aug 2018

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!